Campus Event Calendar

Event Entry

What and Who

On Leveraging the Wisdom of Crowdsourced Experts

Muhammad Bilal Zafar
MMCI
SWS Student Defense Talks - Qualifying Exam
SWS  
Expert Audience
English

Date, Time and Location

Wednesday, 24 June 2015
13:00
60 Minutes
E1 5
029
Saarbrücken

Abstract

Recently, online social networks (OSNs) such as Twitter and Facebook have emerged as popular platforms for exchanging information on the Web. With hundreds of millions of users posting content ranging from everyday conversations to real-time news, and from product reviews to information on topics as diverse as politics and diabetes, OSNs provide a way to tap into the wisdom of crowds. In fact, content streams generated by Twitter users are already used for applications such as content search and recommendation, breaking news detection, and business analytics. However, analyzing content streams generated on Twitter poses two important research challenges: (1) processing a stream of around 500 million tweets per day in real time often does not scale, and (2) since OSN users have pseudo-anonymous identities, reasoning about the quality and trustworthiness of the content they generate is difficult, as shown by prior studies.


In this work, we address these two challenges by proposing a novel data sampling methodology: relying on the wisdom of crowdsourced experts. That is, instead of processing all the tweets posted in the Twitter network, we rely only on tweets from a handful of expert users. Since tweets posted by these expert users constitute only a small fraction of all the tweets in the network, using this expert tweet stream (or expert sample) helps overcome the scalability issues of real-time data processing. Comparing the expert sample to another widely used sampling methodology, namely random sampling, reveals that expert sampling has numerous potential advantages for data mining and content retrieval tasks such as content search, real-time event detection, and product sentiment analysis.
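The contrast between expert sampling and random sampling can be illustrated with a minimal sketch. It is not the method presented in the talk: the `Tweet` type, the account names, and the hand-picked `experts` set are all hypothetical, and how experts are actually identified is not covered here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    user: str
    text: str


def expert_sample(stream, experts):
    """Keep only tweets whose author is in the (hypothetical) expert set."""
    return [t for t in stream if t.user in experts]


def random_sample(stream, k, seed=0):
    """Baseline: draw k tweets uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(stream, k)


# Toy stream; all users and texts are made up for illustration.
stream = [
    Tweet("alice_md", "New diabetes guideline published today"),
    Tweet("spam_bot", "Win a free phone, click here"),
    Tweet("bob", "lunch was great"),
    Tweet("carol_news", "Breaking: election results announced"),
]
experts = {"alice_md", "carol_news"}  # assumed to be known in advance

experts_only = expert_sample(stream, experts)
baseline = random_sample(stream, len(experts_only))
```

Both samples have the same size, but the expert sample is chosen by author while the random baseline can just as easily pick up the spam or small-talk tweets.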

To show the utility of the expert stream for content-centric applications, we compare Twitter search functionality implemented over the whole Twitter stream (or crowd stream) to one implemented over the expert stream only. Surprisingly, despite being two orders of magnitude smaller, the expert stream captures most of the relevant information posted by the whole Twitter crowd. Moreover, search results from the expert stream are of significantly better quality and contain far fewer spam posts than the crowd results. Our findings add another dimension to the longstanding crowds vs. experts debate by concluding that the wisdom of experts is better than the wisdom of crowds in the context of certain content-centric applications. These findings have serious implications for the design of future content retrieval systems.


Maria-Louise Albrecht, 03/10/2016 14:14 -- Created document.