Campus Event Calendar

Event Entry

New for: D1, D2, D3, D4, D5

What and Who

Learning n-ary Queries in Trees for Web Information Extraction

Joachim Niehren
INRIA Futurs, LIFL, Lille, France
Logik-Seminar
AG 1, AG 2, AG 3, AG 4, AG 5, SWS  
Expert Audience
English

Date, Time and Location

Wednesday, 26 July 2006
14:15
-- Not specified --
E1 4
024
Saarbrücken

Abstract

In the first part, we discuss whether the W3C standard query language
XPath 2.0 is complete with respect to first-order logic.
We continue with regular n-ary queries definable by monadic
second-order logic or tree automata.

In the second part, we present algorithms for learning n-ary node
selection queries in trees from completely annotated examples.
We assume that queries are represented by deterministic tree
automata. We show that polynomially bounded n-ary queries in trees
can be learned from completely annotated examples in Gold's model
of learning from polynomial time and data.

In the third part, we illustrate an application to Web information
extraction. We demonstrate the Squirrel system, which learns monadic
queries in HTML documents from partially annotated examples.

Tags, Category, Keywords and additional notes

The Logik-Seminar is a joint event of the DFKI, the MPI, and the
departments of Computer Science, Philosophy, and Law.
Please send talk proposals to Uwe Waldmann, MPI, Tel.: (0681) 9325-205,
university-internal: 92205

Uwe Brahm, 04/12/2007 12:48
Veronika Weinand, 07/19/2006 17:00 -- Created document.