Rating Movies and Rating the Raters Who Rate Them
The American Statistician (IF 1.8), Pub Date: 2009-11-01, DOI: 10.1198/tast.2009.08278
Hua Zhou, Kenneth Lange

The movie distribution company Netflix has generated considerable buzz in the statistics community by offering a million dollar prize for improvements to its movie rating system. Among the statisticians and computer scientists who have disclosed their techniques, the emphasis has been on machine learning approaches. This article has the modest goal of discussing a simple model for movie rating and other forms of democratic rating. Because the model involves a large number of parameters, it is nontrivial to carry out maximum likelihood estimation. Here we derive a straightforward EM algorithm from the perspective of the more general MM algorithm. The algorithm is capable of finding the global maximum on a likelihood landscape littered with inferior modes. We apply two variants of the model to a dataset from the MovieLens archive and compare their results. Our model identifies quirky raters, redefines the raw rankings, and permits imputation of missing ratings. The model is intended to stimulate discussion and development of better theory rather than to win the prize. It has the added benefit of introducing readers to some of the issues connected with analyzing high-dimensional data.

Updated: 2009-11-01