has focused on problems like classification and regression, where the
prediction is a single univariate value. But what if we need to
predict complex objects like trees, orderings, or alignments? Such
predictions are crucial in a variety of information access and
retrieval problems: for example, when a natural language parser
needs to predict the correct parse tree for a given sentence, when one
needs to optimize a text classification rule for a multivariate
performance measure like the F1-score, or when one needs to predict the
alignment between two sentences in different languages.

This talk discusses a support vector approach and algorithm for
predicting such complex objects. It generalizes conventional
classification SVMs to a wide range of structured outputs and
multivariate loss functions. While the resulting training problems
have exponential size, there is a simple algorithm that allows
training in polynomial (or in some cases linear) time. The algorithm
is implemented in the SVM-Struct software, and empirical results will
be given for several examples.
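
The abstract does not spell out the algorithm, so the following is only
an illustrative sketch of one idea that commonly makes such training
tractable: although each training example induces one constraint per
possible output (exponentially many), the single most violated
constraint can sometimes be found without enumerating them. The toy
sequence-labeling setup, the label set, the per-position feature ids,
the Hamming loss, and all function names below are assumptions made for
illustration; none of this is the SVM-Struct API.

```python
import itertools
import random

LABELS = (0, 1, 2)  # hypothetical label set for a toy sequence-labeling task


def score(w, x, y):
    """Linear score w . Psi(x, y), where Psi sums per-position features."""
    return sum(w[label][feat] for feat, label in zip(x, y))


def hamming(y_true, y):
    """Loss Delta(y_true, y): number of positions labeled incorrectly."""
    return sum(a != b for a, b in zip(y_true, y))


def most_violated_brute_force(w, x, y_true):
    """Enumerate all len(LABELS)**len(x) outputs (exponential in len(x))."""
    candidates = itertools.product(LABELS, repeat=len(x))
    return max(candidates, key=lambda y: hamming(y_true, y) + score(w, x, y))


def most_violated_fast(w, x, y_true):
    """Same argmax in linear time: loss and score both decompose by position."""
    return tuple(
        max(LABELS, key=lambda l: (l != t) + w[l][feat])
        for feat, t in zip(x, y_true)
    )


# Toy data (assumed): random weights, one feature id per sequence position.
random.seed(0)
n_features = 4
w = [[random.uniform(-1, 1) for _ in range(n_features)] for _ in LABELS]
x = [random.randrange(n_features) for _ in range(6)]
y_true = tuple(random.choice(LABELS) for _ in x)

# Both searches should agree on the loss-augmented argmax.
assert most_violated_fast(w, x, y_true) == most_violated_brute_force(w, x, y_true)
```

The shortcut works here only because both the score and the loss split
into independent per-position terms. Outputs such as parse trees or
alignments, and losses such as the F1-score, need a problem-specific
search in place of most_violated_fast.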

Bio: Thorsten Joachims is an Assistant Professor in the Department of
Computer Science at Cornell University. In 2001, he finished his
dissertation with the title "The Maximum-Margin Approach to Learning
Text Classifiers: Methods, Theory, and Algorithms", advised by
Prof. Katharina Morik at the University of Dortmund. He also received
his Diplom in Computer Science there in 1997, with a thesis on
WebWatcher, a browsing assistant for the Web. From 1994 to 1996, he
was a visiting scientist at Carnegie Mellon University with Prof. Tom
Mitchell. His research interests center on a synthesis of theory and
system building in the field of machine learning, with a focus on
Support Vector Machines and machine learning with text. He authored
the SVM-Light algorithm and software for support vector learning.