Campus Event Calendar

Event Entry

What and Who

Adapting sentiment analysis resources and methods to the realm of the Polish language

Joanna Biega
University of Wrocław – Poland
PhD Application Talk

Masters Student
AG 1, AG 2, AG 3, AG 4, AG 5, SWS, RG1, MMCI  
Public Audience
English

Date, Time and Location

Monday, 10 February 2014
08:50
90 Minutes
E1 4
024
Saarbrücken

Abstract

With the advent of social media, sentiment analysis has become a dynamic and growing field of research with applications in commerce, business and politics. A lot of effort has been devoted to developing frameworks for analyzing opinions; most of them, however, focus on texts written in English. Such frameworks are often not trivially adaptable to other languages. Since Polish is a resource-scarce language with respect to sentiment analysis, we craft an essential toolkit and adapt existing methods to bridge the gap between sentiment analysis in Polish and in English.

We begin by harvesting two corpora in Polish from the domains of mobile phone and book reviews, propose a simple automatic annotation method, and manually evaluate the annotation accuracy on a sample. Then, we compare methods of sentiment lexicon acquisition by automatic translation of similar English resources. We also propose a sentiment reversal model for opinion words and evaluate its usefulness using a basic rule-based classifier on a subset of the corpus. Finally, we evaluate the influence of different text pre-processing methods and feature sets on the accuracy of sentiment classification in Polish, and show how the baseline bag-of-words model can be improved by incorporating semantic features based on the harvested sentiment lexicon and the Polish WordNet.
Our work delivers a reference and a new suite of tools that enable sentiment analysis for Polish as well.

Aaron Alsancak, 02/06/2014 10:29 -- Created document.