Campus Event Calendar

Event Entry

What and Who

Adapting Sentiment Analysis to the Challenges of Social Media

Subhabrata Mukherjee
Fachrichtung Informatik - Saarbrücken
PhD Application Talk
AG 1, AG 2, AG 3, AG 4, AG 5, SWS, RG1, MMCI  
Public Audience
English

Date, Time and Location

Monday, 25 February 2013
11:00
90 Minutes
E1 4
R024
Saarbrücken

Abstract

In this work, we investigate how sentiment analysis needs to be adapted to social media platforms such as review blogs and micro-blogs. Review classification differs from traditional text classification in four ways: it involves the author's perspective, requires extensive world knowledge, deals with informal language, and aggregates sentiment from multiple weighted facets.

Authors may have different perspectives when writing a review. Although two authors may give the same overall rating in their reviews, they may have different preferences for individual facets. To capture this, we develop a joint author-topic-sentiment model in which the review rating and the author each have a distribution over their individual topic preferences and the sentiment associated with them, and these distributions generate the review words. An author may hold a mixture of opinions about various facets, but it is the rating for the most important ones that shapes the overall opinion. To realize this, we model the sentiment rating as a function of a sentiment ontology tree, depicting the relations between various features and their weights, which is learnt automatically from ConceptNet.
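To make the idea of aggregating sentiment over weighted facets concrete, here is a minimal sketch in Python. It is not the model presented in the talk: the `FacetNode` class, the `aggregate` and `overall_label` functions, the hotel facets and the hand-set weights are all hypothetical, whereas the abstract states that the tree is learnt automatically from ConceptNet.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FacetNode:
    """A node in a hypothetical facet tree (illustrative only)."""
    name: str
    weight: float                 # importance relative to the parent; hand-set here (assumption)
    polarity: float = 0.0         # author's sentiment about this facet itself, in [-1, 1]
    children: List["FacetNode"] = field(default_factory=list)


def aggregate(node: FacetNode) -> float:
    """Own polarity plus the weighted, recursively aggregated polarity of the sub-facets."""
    return node.polarity + sum(c.weight * aggregate(c) for c in node.children)


def overall_label(root: FacetNode) -> str:
    """Map the aggregated score to a coarse label."""
    score = aggregate(root)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical hotel review: the author likes the room, dislikes the food and the staff.
hotel = FacetNode("hotel", 1.0, children=[
    FacetNode("room", 0.6, polarity=+1.0),
    FacetNode("food", 0.3, polarity=-1.0),
    FacetNode("service", 0.1, children=[FacetNode("staff", 0.5, polarity=-1.0)]),
])

print(overall_label(hotel))
```

In this toy tree the positive opinion about the heavily weighted facet outweighs the two negative opinions about minor facets, which mirrors the claim that the rating for the most important facets shapes the overall opinion.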
Feature-specific sentiment analysis is an important component of this module: using dependency parsing, we analyze the associations between the words that form an opinion expression about a specific feature. Review classification also requires extensive world knowledge to distinguish the objective facts about a product from the subjective opinions of the author. We present an approach to incorporating world knowledge into a sentiment analysis system through WikiSent, which harvests information from Wikipedia to create a topic-specific, extractive summary of a review.
Apart from structured reviews, we also study the characteristics of micro-blogs and how traditional NLP tools need to be adapted to them. We propose a lightweight method of incorporating discourse information into a bag-of-words model for noisy and unstructured text such as tweets. The traditional parsing-based approach to discourse processing does not work well on Twitter because of noise and syntactic discontinuity. We present a multi-stage system, TwiSent, for sentiment analysis on Twitter. It addresses the issues of spam, noisy text, pragmatics and entity specificity.
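As a rough illustration of how discourse cues can be folded into a bag-of-words scorer without a parser, consider the sketch below. It is an assumption-laden toy, not the method from the talk or TwiSent: the lexicon, the connective and negation lists, the weight factors and the `score` function are all made up for this example.

```python
import re

# Tiny illustrative resources (all assumptions, not taken from the talk).
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "boring": -1, "terrible": -1}
FAVOR_FOLLOWING = {"but", "however", "yet"}   # connectives that stress what follows them
NEGATIONS = {"not", "never", "no"}
NEGATION_WINDOW = 3                           # number of tokens a negation is assumed to affect


def score(text: str) -> float:
    """Lexicon-based polarity score with simple discourse weighting."""
    tokens = re.findall(r"[a-z']+", text.lower())
    weights = [1.0] * len(tokens)

    # Words before a contrastive connective count less, words after it count more.
    for i, tok in enumerate(tokens):
        if tok in FAVOR_FOLLOWING:
            for j in range(i):
                weights[j] *= 0.5
            for j in range(i + 1, len(tokens)):
                weights[j] *= 2.0

    total = 0.0
    negate_left = 0
    for tok, w in zip(tokens, weights):
        if tok in NEGATIONS:
            negate_left = NEGATION_WINDOW
            continue
        polarity = LEXICON.get(tok, 0)
        if negate_left > 0:
            polarity = -polarity          # flip polarity inside the negation window
            negate_left -= 1
        total += w * polarity
    return total


print(score("the plot was boring but the acting was great"))
```

A plain bag-of-words count would treat "boring" and "great" as cancelling each other out, while the connective weighting lets the clause after "but" dominate. Because the approach needs only token matching, it does not depend on well-formed syntax.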
We perform a number of experiments in the movie review domain, the travel review domain and on the micro-blog Twitter to validate our claims. We achieve better accuracies than the state-of-the-art systems in many of these experiments.

Contact

IMPRS Office Team
0681 93251800

Stephanie Jörg, 02/22/2013 12:17
Stephanie Jörg, 02/22/2013 12:16 -- Created document.