Title: Adapting sentiment analysis resources and methods to the realm of the Polish language
Speaker: Joanna Biega
coming from: University of Wrocław – Poland
Speaker's Bio: Masters Student
Event Type: PhD Application Talk
Date, Time and Location
Date: Monday, 10 February 2014
Duration: 90 Minutes
Building: E1 4
With the advent of social media, sentiment analysis has become a dynamic and growing field of research, with applications in commerce, business and politics. A lot of effort has been devoted to developing frameworks for analyzing opinions, most of which, however, focus on texts written in English. Such frameworks are often not trivially adaptable to other languages. Since Polish is one of the resource-scarce languages with respect to sentiment analysis, we craft the essential toolkit and adapt the methods to bridge the gap between sentiment analysis in Polish and English.

We begin by harvesting two corpora in Polish from the domains of mobile phone and book reviews, propose a simple automatic annotation method, and manually evaluate the annotation accuracy on a sample. Then, we compare methods of sentiment lexicon acquisition by automatic translation of similar English resources. We also propose a sentiment reversal model for opinion words and evaluate its usefulness using a basic rule-based classifier on a subset of the corpus. Finally, we evaluate the influence of different text pre-processing methods and feature sets on the accuracy of sentiment classification in Polish, and show how the baseline bag-of-words model can be improved by incorporating semantic features based on the harvested sentiment lexicon and the Polish WordNet.
Our endeavor delivers a reference and a new suite of tools that enable sentiment analysis for Polish as well.
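
To make the idea of a rule-based classifier with sentiment reversal more concrete, here is a minimal sketch in Python. It is an illustration only, not the model presented in the talk: the tiny lexicon, the single negator "nie", the two-token reversal window, and the function name `classify` are all assumptions chosen for the example.

```python
# Hypothetical toy lexicon (word -> polarity); NOT the lexicon built in the talk.
LEXICON = {"dobry": 1, "świetny": 1, "polecam": 1, "zły": -1, "słaby": -1}

# Assumed negation cue; a real reversal model would likely cover more cases.
NEGATORS = {"nie"}


def classify(text, window=2):
    """Label text by summing lexicon polarities, flipping the polarity of
    opinion words that occur within `window` tokens after a negator."""
    score = 0
    negate_left = 0  # how many upcoming tokens are still in negation scope
    for raw in text.lower().split():
        token = raw.strip(".,!?")
        if token in NEGATORS:
            negate_left = window
            continue
        polarity = LEXICON.get(token, 0)
        if polarity and negate_left > 0:
            polarity = -polarity  # sentiment reversal
        score += polarity
        if negate_left > 0:
            negate_left -= 1
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in ["Telefon jest dobry.", "Telefon nie jest dobry.", "Nie polecam!"]:
        print(review, "->", classify(review))
```

The fixed window is a deliberately crude stand-in for negation scope. It also ignores Polish inflection (for example "dobra" or "dobrego" would not match "dobry"), which is one reason the pre-processing choices mentioned in the abstract matter.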

