Max-Planck-Institut für Informatik

MPI-INF or MPI-SWS or Local Campus Event Calendar

What and Who
Title:Adapting sentiment analysis resources and methods to the realm of the Polish language
Speaker:Joanna Biega
coming from:University of Wrocław – Poland
Speakers Bio:Masters Student
Event Type:PhD Application Talk
Visibility:D1, D2, D3, D4, D5, SWS, RG1, MMCI
Level:Public Audience
Language:English
Date, Time and Location
Date:Monday, 10 February 2014
Time:08:50
Duration:90 Minutes
Location:Saarbrücken
Building:E1 4
Room:024
Abstract
With the advent of social media, sentiment analysis has become a dynamic and growing field of research, with applications in commerce, business and politics. A lot of effort has been devoted to developing frameworks for analyzing opinions; most of them, however, focus on texts written in English. Such frameworks are often not trivially adaptable to other languages. Since Polish is a resource-scarce language with respect to sentiment analysis, we craft the essential toolkit and adapt the methods to bridge the gap between sentiment analysis in Polish and English.

We begin by harvesting two corpora in Polish from the domains of mobile phone and book reviews, propose a simple automatic annotation method, and manually evaluate the annotation accuracy on a sample. Then, we compare methods of sentiment lexicon acquisition by automatic translation of similar English resources. We also propose a sentiment reversal model for opinion words and evaluate its usefulness using a basic rule-based classifier on a subset of the corpus. Finally, we evaluate the influence of different text pre-processing methods and feature sets on the accuracy of sentiment classification in Polish, and show how the baseline bag-of-words model can be improved by incorporating semantic features based on the harvested sentiment lexicon and the Polish WordNet.
Our endeavor delivers a reference and a new suite of tools that enable sentiment analysis for Polish as well.
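
As a rough illustration of the kind of basic rule-based classifier with sentiment reversal mentioned above, here is a minimal Python sketch. It is not the speaker's implementation: the toy lexicon, the negation list, the `classify` function and the fixed look-back `window` are all assumptions made for this example, and it ignores Polish inflection, which a real system would handle through lemmatization.

```python
# Hypothetical toy lexicon mapping Polish opinion words to polarity.
# The lexicon used in the talk is not given in the abstract.
LEXICON = {"dobry": 1, "świetny": 1, "zły": -1, "słaby": -1}

# Hypothetical set of negation words that reverse polarity.
NEGATIONS = {"nie"}


def classify(text, window=2):
    """Label text as positive, negative or neutral.

    Sums the lexicon polarity of each token and flips the polarity of an
    opinion word when a negation occurs within `window` tokens before it.
    """
    tokens = text.lower().split()
    score = 0
    for i, token in enumerate(tokens):
        polarity = LEXICON.get(token, 0)
        if polarity == 0:
            continue  # not an opinion word
        preceding = tokens[max(0, i - window):i]
        if any(p in NEGATIONS for p in preceding):
            polarity = -polarity  # sentiment reversal
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With this sketch, `classify("telefon jest dobry")` yields a positive label, while `classify("telefon nie jest dobry")` is reversed to negative because "nie" falls inside the look-back window.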

Contact
Name(s):
Video Broadcast
Video Broadcast:No
To Location:
Tags, Category, Keywords and additional notes
Note:
Attachments, File(s):
  • Aaron Alsancak, 02/06/2014 10:29 AM -- Created document.