Max-Planck-Institut für Informatik

MPI-INF or MPI-SWS or Local Campus Event Calendar

What and Who
Title:Exploiting Weak Supervision in NLP tasks: Application to Sentiment Summarization
Speaker:Ivan Titov
coming from:University of Illinois
Speakers Bio:
Event Type:Talk
Visibility:D1, D3, D4, D5, SWS, RG1, MMCI
Level:Public Audience
Date, Time and Location
Date:Thursday, 12 March 2009
Duration:45 Minutes
Building:E1 4
In recent years, most research on structured prediction in NLP (e.g., parsing, segmentation, and extraction problems) has focused on supervised methods that require large amounts of labeled data.

Constructing such datasets is expensive and time-consuming. However, for many tasks it is possible to obtain large amounts of otherwise unlabeled content that comes with labels correlated with the required structured predictions. Examples of such correlated labels include document titles and topic tags for text segmentation, and sentiment scores and helpfulness ratings for summarization. The abundance of such weakly supervised data opens an interesting line of research: designing models that leverage these labelings to tackle a wide variety of NLP problems.

In this talk I will consider the sentiment summarization problem. I will present statistical models that exploit user-generated numerical aspect ratings to discover the corresponding topics and are therefore able to extract fragments of text discussing these aspects without the need for annotated data. I will also discuss implications for other NLP problems, the generalization performance of the proposed methods, and important open research questions.

Name(s):Conny Liegl
Video Broadcast
Video Broadcast:No
  • Conny Liegl, 03/09/2009 02:14 PM -- Created document.