subtrees that contain the keywords and show how they are connected.
Enumerating these subtrees should be done efficiently, correctly and
in ranked order. Efficiency means that the top-k answers are obtained
quickly. Correctness is crucial in order not to miss any of the top-k
answers. Ranked order (or at least an approximation thereof) is
important, because the number of results could be huge (i.e.,
exponential in the size of the data even if the number of keywords is
fixed). This talk demonstrates why achieving all of the above
properties is a subtle task and describes theoretical results showing
that it can be done.
The second part of the talk describes paradigms of flexible querying
that enable users to pose queries over heterogeneous types of data
(e.g., XML, relations, the semantic web, etc.). Flexible tree queries
allow users to describe the relationships among entities in loose
terms. Furthermore, they handle missing information gracefully and the
results can be obtained efficiently in ranked order.