Max-Planck-Institut für Informatik
 

MPI-INF or MPI-SWS or Local Campus Event Calendar

What and Who
Title:On Leveraging the Wisdom of Crowdsourced Experts
Speaker:Muhammad Bilal Zafar
coming from:Max Planck Institute for Software Systems
Speakers Bio:
Event Type:SWS Student Defense Talks - Qualifying Exam
Visibility:SWS
Level:Expert Audience
Language:English
Date, Time and Location
Date:Wednesday, 24 June 2015
Time:13:00
Duration:60 Minutes
Location:Saarbrücken
Building:E1 5
Room:029
Abstract
Recently, online social networks (OSNs) such as Twitter and Facebook have emerged as popular platforms for exchanging information on the Web. With hundreds of millions of users posting content that ranges from everyday conversations to real-time news, and from product reviews to information on topics as diverse as politics and diabetes, OSNs provide a way to tap into the wisdom of crowds. Content streams generated by Twitter users are already used for applications such as content search and recommendation, breaking news detection, and business analytics. However, analyzing content streams generated on Twitter poses two important research challenges: (1) processing a stream of around 500 million tweets per day in real time is often not scalable, and (2) since OSN users have pseudo-anonymous identities, reasoning about the quality and trustworthiness of the content they generate is difficult, as prior studies have shown.

In this work, we address these two challenges by proposing a novel data sampling methodology: relying on the wisdom of crowdsourced experts. That is, instead of processing all the tweets posted in the Twitter network, we rely only on tweets from a handful of expert users. Since tweets posted by these expert users constitute only a small fraction of all the tweets in the network, using this expert tweet stream (or expert sample) helps overcome the scalability issues of real-time data processing. Comparing the expert sample to another widely used sampling methodology, namely random sampling, reveals that expert sampling has numerous potential advantages for data mining and content retrieval tasks such as content search, real-time event detection, and product sentiment analysis.

To show the utility of the expert stream for content-centric applications, we compare Twitter search functionality implemented over the whole Twitter stream (or crowd stream) to one implemented over the expert stream only. Surprisingly, despite being two orders of magnitude smaller, the expert stream captures most of the relevant information posted by the whole Twitter crowd. Moreover, search results from the expert stream are of significantly better quality and contain far fewer spam posts than results from the crowd stream. Our findings add another dimension to the longstanding crowds vs. experts debate by concluding that the wisdom of experts is better than the wisdom of crowds in the context of certain content-centric applications. These findings have serious implications for the design of future content retrieval systems.

Contact
Name(s):
Video Broadcast
Video Broadcast:No
To Location:
Tags, Category, Keywords and additional notes
Note:
Attachments, File(s):

Created by:Maria-Louise Albrecht/MPI-KLSB, 03/10/2016 02:11 PM
Last modified by:Uwe Brahm/MPII/DE, 11/24/2016 04:13 PM
  • Maria-Louise Albrecht, 03/10/2016 02:14 PM -- Created document.