The Wisdom of the Few

Zheng Yun (郑昀) @ 玩聚SR, 2009-11-05

1. Cold Start

Greg Linden commented on a recent paper, "The Wisdom of the Few: A Collaborative Filtering Approach Based on Expert Opinions from the Web" (PDF), as follows:

“

What they do say is that using a very small pool of experts works surprisingly well.

In particular, I think it suggests a good alternative to content-based methods for bootstrapping a recommender system.

If you can create a high quality pool of experts, even a fairly small one, you may have good results starting with that while you work to gather ratings from the broader community.

”

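In other words, pick a high-quality pool of experts, whether a team you assemble yourself or a group of experts you select, and even if that group is quite small, your recommender system gets off to a very good start. At this stage, the wisdom of the few can solve a recommender system's cold-start problem. This is also why 玩聚SR chose an Experts Pool as its starting point and worked well as an information filter from day one.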

2. Summary of the Paper

To make it easier to follow, here is a loose paraphrase of the paper:

Nearest-neighbor collaborative filtering is a very effective recommendation method, but it consistently suffers from the following problems:

data sparsity and noise; the cold-start problem; and scalability.

The authors therefore propose a new method, a variation of traditional collaborative filtering:

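Instead of running the nearest-neighbor algorithm over user-rating data, it uses a set of expert neighbors as the comparison sample and computes the similarity between those experts and the target user.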

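This approach at least avoids any serious scalability problem, because it shrinks the reference set used for comparison. The original nearest-neighbor method can be roughly understood as pairwise comparison between users, which is expensive to compute. And when new users are everywhere (especially when a wave of drive-by visitors arrives and fills the data with noise), very few of them have enough ratings to support a similarity computation.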
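The idea of comparing a target user only against a small expert set can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: it assumes plain cosine similarity over co-rated items and a similarity-weighted, mean-centered average for the prediction, and the names `cosine_similarity`, `predict`, `min_sim`, and `min_experts` are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two {item: rating} dicts over their co-rated items."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean(ratings):
    """Average rating in a non-empty {item: rating} dict."""
    return sum(ratings.values()) / len(ratings)


def predict(user, experts, item, min_sim=0.0, min_experts=1):
    """Predict the user's rating for `item` from expert ratings only.

    Returns None when fewer than `min_experts` sufficiently similar
    experts have rated the item (assumed thresholds, for illustration).
    """
    neighbors = []
    for expert in experts.values():
        if item not in expert:
            continue
        sim = cosine_similarity(user, expert)
        if sim > min_sim:
            neighbors.append((sim, expert))
    if len(neighbors) < min_experts:
        return None
    total_sim = sum(sim for sim, _ in neighbors)
    # Each expert contributes its deviation from its own mean, weighted by similarity.
    offset = sum(sim * (e[item] - mean(e)) for sim, e in neighbors) / total_sim
    return mean(user) + offset


# Made-up toy data: one user with two ratings, two experts who also rated item "C".
user = {"A": 5, "B": 1}
experts = {
    "critic1": {"A": 5, "B": 1, "C": 4},
    "critic2": {"A": 1, "B": 5, "C": 2},
}
print(predict(user, experts, "C"))
```

The cost of one prediction grows with the number of experts rather than with the size of the whole user community, which is the scalability point made above. A new user needs only a few ratings that overlap with the experts' ratings, not with other sparse users.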

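The authors define an expert as an independent individual whom we can trust to produce thoughtful, consistent, and reliable evaluations (ratings) in a given domain.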

(Original text:

We define an expert as an individual that we can trust to have produced thoughtful, consistent and reliable evaluations (ratings) of items in a given domain.

)

We are particularly interested in the following two angles from which the authors approach the problem:

(a) study how preferences of a large population can be predicted by using a very small set of users;

That is, how much a small group of users can really tell us about the preferences of a very large user population.

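If these angles prove workable, then in practice there is no need to obtain all the data of a massive user community; locking onto an Experts Pool is enough to make recommendations for users.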

Appendix:

Greg Linden's original post, hosted on the blocked BlogSpot, reads as follows:

Wednesday, November 04, 2009

Using only experts for recommendations

A recent paper from SIGIR, "The Wisdom of the Few: A Collaborative Filtering Approach Based on Expert Opinions from the Web" (PDF), has a very useful exploration into the effectiveness of recommendations using only a small pool of trusted experts.
The results suggest that using a small pool of a couple hundred experts, possibly your own experts or experts selected and mined from the web, has quite a bit of value, especially in cases where big data from a large community is unavailable.
A brief excerpt from the paper:

Recommending items to users based on expert opinions .... addresses some of the shortcomings of traditional CF: data sparsity, scalability, noise in user feedback, privacy, and the cold-start problem .... [Our] method's performance is comparable to traditional CF algorithms, even when using an extremely small expert set .... [of] 169 experts.
Our approach requires obtaining a set of ... experts ... [We] crawled the Rotten Tomatoes web site -- which aggregates the opinions of movie critics from various media sources -- to obtain expert ratings of the movies in the Netflix data set.

The authors certainly do not claim that using a small pool of experts is better than traditional collaborative filtering.
What they do say is that using a very small pool of experts works surprisingly well. In particular, I think it suggests a good alternative to content-based methods for bootstrapping a recommender system. If you can create a high quality pool of experts, even a fairly small one, you may have good results starting with that while you work to gather ratings from the broader community.
