Collaborative Filtering Recommendation Algorithm
Sonja Kanga
Abstract: Collaborative filtering is one of the most successful techniques in personalized recommendation systems. However, the traditional algorithm considers only user ratings and ignores changes in user interest, which seriously degrades recommendation quality. Based on experiments, we first discuss the rule governing changes in user interest, the interest forgetting curve. We then use recently rated items to represent the user's current interest: for each historically visited item, a composite data weight is calculated from the interest forgetting curve and the rating matrix; for each item the user has not rated, a prediction is calculated from item similarity and the composite data weights of the visited items. Item similarity combines item attribute similarity and item rating similarity, which makes it more comprehensive and accurate. Experimental results show that the proposed algorithm provides better recommendation precision and recall.
Keywords: personalized recommendation, collaborative filtering
With the development of information technology, e-commerce has become an important way of doing business. Recommendation systems were created to help users find valuable information among the large amount available. On an e-commerce platform, a recommendation system plays the role of a salesperson: it recommends products to users and helps them find what they want. Collaborative filtering is one of the earliest and most successful techniques applied in recommendation systems. However, as site structures grow more complex and the numbers of products and users increase, collaborative filtering recommendation systems face two major challenges: improving the scalability of the collaborative filtering algorithm and reducing the sparsity of the recommendation system's data set. To address these problems, an improved method, a clustering-based collaborative filtering recommendation algorithm, has been proposed.
Thanks to the convenience of the Internet, a large amount of product-related information is available to customers at very low cost, so users must choose products from an overwhelming volume of reference material. This is called information overload. To solve this problem, recommendation systems have been widely deployed on many large e-commerce sites to present products and services to customers. For example, companies such as Amazon.com, Netflix.com, Half.com, and CDNOW have successfully implemented commercial recommendation systems. Personalized service systems typically use two main techniques: content-based filtering and collaborative filtering (CF). Content-based filtering makes recommendations by comparing items with the user's interests in the same feature space. In contrast, collaborative filtering bases its recommendations on the user's previously expressed preferences and on those of other, similar users. Collaborative filtering is one of the most promising recommendation techniques.
The traditional collaborative filtering system is user-based. First, the similarity between users is calculated from the historical user-item rating matrix. Next, the similarities are sorted and the M most similar users are selected as the nearest-neighbor set. Then, with similarity as the weight, the ratings of the M nearest neighbors on the target item are combined to obtain the user's predicted rating, and finally items are recommended according to the predicted ratings. However, the traditional CF algorithm treats user interest as static: ratings generated at different times carry equal weight, and changes in the user's purchasing interest are not taken into account. As a result, when the user's interest changes, the system may recommend items that no longer match it. To address this, some researchers have introduced a time factor into improved CF algorithms.
An e-commerce system carries many categories of goods. Users normally browse or buy only specific categories and rate the goods in the categories they care about, so a user's ratings may represent exactly their preference for a certain category of items. Users who follow the same categories can therefore be considered to share the same interests. Traditional similarity measures do not consider which categories a user follows, so user similarity is computed from a single data source and recommendation accuracy suffers. Traditional measures also lose information because they ignore items with no common ratings or with only a few. Using the category information that users have in common, however, can uncover associations between users and adds another route to measuring user similarity. Collaborative filtering has been widely used to solve many practical problems.
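The user-based procedure described above (compute user similarity, keep the M nearest neighbors, take a similarity-weighted average of their ratings) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the text does not name a similarity measure, so cosine similarity over co-rated items is an assumption, as are the function names, the convention that 0 means "not rated", and the toy matrix.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity over items both users rated (assumption: 0 = unrated)."""
    mask = (a > 0) & (b > 0)
    if not mask.any():
        return 0.0
    num = float(a[mask] @ b[mask])
    den = np.linalg.norm(a[mask]) * np.linalg.norm(b[mask])
    return num / den if den else 0.0

def predict_user_based(ratings, user, item, m=2):
    """Predict ratings[user, item] from the m most similar users who rated the item."""
    sims = []
    for other in range(ratings.shape[0]):
        if other != user and ratings[other, item] > 0:
            sims.append((cosine_similarity(ratings[user], ratings[other]), other))
    # Sort by similarity, highest first, and keep the m nearest neighbors.
    neighbours = sorted(sims, reverse=True)[:m]
    weight_sum = sum(s for s, _ in neighbours)
    if weight_sum == 0:
        return 0.0  # no usable neighbor: no prediction
    # Similarity-weighted average of the neighbors' ratings on the target item.
    return sum(s * ratings[o, item] for s, o in neighbours) / weight_sum

# Toy rating matrix (rows: users, columns: items), purely illustrative.
ratings = np.array([[5, 4, 0],
                    [5, 4, 2],
                    [1, 2, 5]], dtype=float)
print(predict_user_based(ratings, user=0, item=2, m=1))
```

Every rating contributes with the same weight regardless of when it was given, which is the static-interest limitation the paragraph above points out.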
Learning effective latent factors plays the most important role in collaborative filtering. Traditional CF methods based on matrix factorization learn the latent factors from user-item ratings and suffer from the cold-start problem and the sparsity problem. Some improved CF methods enrich the priors on the latent factors by using side information as regularization. However, because both the ratings and the side information are sparse, the learned latent factors may not be very effective. To address this problem, we learn effective latent representations through deep learning. Deep learning models have become a very attractive way to learn effective representations in many applications. In particular, by combining matrix factorization with deep feature learning, we propose a general deep architecture for CF. We combine probabilistic matrix factorization with marginalized denoising stacked autoencoders to provide a natural instantiation of this architecture.
The combined framework, which pairs probabilistic matrix factorization with marginalized denoising stacked autoencoders, yields a parsimonious fit over the latent features, as indicated by its improved performance over state-of-the-art models on four large datasets covering movie/book recommendation and response prediction tasks.
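To make the idea of learning latent factors from user-item ratings concrete, here is a minimal sketch of plain regularized matrix factorization trained by stochastic gradient descent. It is deliberately not the deep architecture described above: there is no autoencoder and no side information, and the function name, hyperparameters, and toy ratings are all assumptions chosen for illustration.

```python
import numpy as np

def factorize(ratings, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user factors P and item factors Q so that P @ Q.T fits the observed ratings.

    Assumption: a rating of 0 means "not observed" and is skipped during training.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(n_items, k))
    observed = [(u, i) for u in range(n_users)
                for i in range(n_items) if ratings[u, i] > 0]
    for _ in range(steps):
        for u, i in observed:
            err = ratings[u, i] - P[u] @ Q[i]
            p_u = P[u].copy()  # keep the old user vector for the item update
            # Gradient step on squared error with L2 regularization.
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_u - reg * Q[i])
    return P, Q

# Toy matrix, purely illustrative; zeros are the entries to be predicted.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)
P, Q = factorize(R)
print(np.round(P @ Q.T, 2))  # dense matrix of predicted ratings
```

With so few observed ratings per user, the factors are fitted from very little evidence, which is the sparsity and cold-start weakness the text attributes to ratings-only factorization.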
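Personalized recommendation techniques track user behavior, directly or indirectly, to mine users' preferences online and predict their interests in order to make recommendations; this is a very effective way to address information overload. Collaborative filtering (CF) is one of the most successful techniques in recommendation systems. CF makes recommendations from the user-item rating matrix alone and can discover new preferences for a user without needing detailed item information. The main idea of CF is that the system finds users with the same interests as an active user and uses their opinions of information resources to generate a recommendation when the active user needs one. This approach has proven very successful in many domains, especially those with multi-valued rating data.
Research on collaborative filtering began with memory-based methods, which use the entire user-item database to generate predictions from user or item similarity. Two memory-based methods have been studied: user-based and item-based collaborative filtering. The user-based method first looks for users whose rating patterns resemble the active user's, then uses their ratings to predict the active user's ratings. The item-based method follows the same idea; the only difference is that the user-based method tries to find similar users for the active user, whereas the item-based method tries to find similar items for each item. The similarity calculation between users or items is therefore a critical step in collaborative filtering recommendation.
However, as systems keep growing, data sparsity becomes a serious problem. In general, the number of information resources is huge and users' opinions cover only a small proportion of all items, which results in a sparse user-item rating matrix. In this situation it is hard for traditional collaborative filtering to find similar users accurately, so very few recommendations can be made. To address this, the literature proposes using singular value decomposition (SVD) to reduce the dimensionality of the recommendation system's database, so that every user has a value along each of the reduced dimensions. Other studies, however, show that this approach can lose information and that its effect depends heavily on the data; in high-dimensional settings, dimensionality reduction does not perform well. Another work proposes a generative probabilistic framework that exploits more of the data available in the user-item matrix by blending all predicted ratings when making a recommendation. A further work proposes a collaborative filtering framework that combines the strengths of memory-based and model-based methods: it introduces a smoothing-based method and predicts all missing data in the user-item matrix to alleviate data sparsity. However, the cluster-based smoothing algorithm limits the diversity of users within each cluster, and predicting all missing data in the user-item matrix may have a negative effect on the recommendations for the active user. In general, the methods above cannot deliver high-quality recommendations.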
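The SVD-based dimensionality reduction mentioned above can be illustrated with a short sketch. The text does not say how missing ratings are handled before the decomposition, so filling them with each user's mean rating is an assumption here, as are the function name and the toy matrix.

```python
import numpy as np

def svd_predict(ratings, k):
    """Rank-k SVD approximation of a mean-filled, mean-centered rating matrix.

    Assumptions: 0 means "not rated"; missing entries are filled with the
    user's mean rating before the decomposition.
    """
    rated = ratings > 0
    user_mean = ratings.sum(axis=1) / np.maximum(rated.sum(axis=1), 1)
    filled = np.where(rated, ratings, user_mean[:, None])
    centered = filled - user_mean[:, None]
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    # Keep only the k largest singular values: each user is now described
    # by k numbers instead of one rating per item.
    approx = (U[:, :k] * s[:k]) @ Vt[:k]
    return approx + user_mean[:, None]

# Toy matrix, purely illustrative.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)
print(np.round(svd_predict(R, k=2), 2))
```

Discarding the smaller singular values is exactly where information can be lost, which matches the criticism reported above that the effect of the reduction depends heavily on the data.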
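Traditional collaborative filtering algorithms compute item similarity or user similarity from user ratings and make predictions from those similarities; they cannot take changes in user interest into account. In fact, recently rated items better represent the user's current interests and are strong predictors for recommendation, whereas items rated earlier are less similar to the user's current interests and are therefore weaker predictors. Collaborative filtering recommendation systems are widely used in e-commerce, and as the numbers of users and products keep growing, many shortcomings of the traditional algorithms have gradually been exposed. By improving the traditional similarity calculation, a collaborative filtering recommendation algorithm that incorporates item category information has been proposed; it improves the accuracy of the neighbor search and avoids the drawbacks of traditional user similarity measures. Experimental results show that the algorithm significantly improves recommendation quality when user rating data is extremely sparse. Future work is to improve the algorithm further and to consider computing user similarity from data such as user profiles in order to raise recommendation quality.
First, item similarity is computed from two aspects: item attribute similarity and item rating similarity. Next, the rule governing changes in user interest, the interest forgetting curve, is explored. Then, for each item the user has visited, a composite data weight is calculated from the interest forgetting curve and the rating matrix; for each item the user has not rated, a recommendation degree is calculated from item similarity and the composite data weights. Items are sorted by recommendation degree from high to low, and the top N are selected as the top-N recommendation set. Experimental comparison shows that the proposed algorithm provides higher-quality recommendations.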
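The forgetting-curve scoring steps can be sketched as below. The text gives no formulas, so every formula here is an assumption made for illustration: an exponential forgetting curve exp(-lam * age), a linear blend of attribute and rating similarity, a composite weight equal to rating times forgetting weight, and a recommendation degree equal to the similarity-weighted sum over the user's history. All names and parameters (`lam`, `alpha`, the toy items) are hypothetical.

```python
import math

def forgetting_weight(age_days, lam=0.05):
    """Assumed exponential forgetting curve: recent ratings weigh close to 1."""
    return math.exp(-lam * age_days)

def combined_similarity(attr_sim, rating_sim, alpha=0.5):
    """Assumed linear blend of item attribute similarity and rating similarity."""
    return alpha * attr_sim + (1 - alpha) * rating_sim

def recommend_top_n(history, candidates, sim, n=2, lam=0.05):
    """Rank unrated candidate items by recommendation degree.

    history: {item: (rating, age_days)} for items the user has rated.
    sim(candidate, item): item similarity, e.g. built with combined_similarity.
    """
    scores = {}
    for cand in candidates:
        # Composite weight of each history item = rating * forgetting weight.
        scores[cand] = sum(
            sim(cand, item) * rating * forgetting_weight(age, lam)
            for item, (rating, age) in history.items()
        )
    # Sort by recommendation degree, highest first, and keep the top n.
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Toy data: "a" was rated long ago, "b" yesterday.
history = {"a": (5, 300), "b": (4, 1)}
table = {("x", "a"): 0.9, ("x", "b"): 0.1, ("y", "a"): 0.1, ("y", "b"): 0.9}
sim = lambda cand, item: table[(cand, item)]
print(recommend_top_n(history, ["x", "y"], sim, n=1))           # time-aware ranking
print(recommend_top_n(history, ["x", "y"], sim, n=1, lam=0.0))  # no forgetting
```

Setting `lam` to 0 removes the time effect and reproduces the static weighting of traditional CF, which makes the influence of the forgetting curve easy to compare on the same data.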