Campus Event Calendar: Stefanie Scherzinger (08/23/2007 in E1 4/433)

Campus Event Calendar

Campus Event Calendar:
- All Upcoming:
  - only for D1
  - only for D2
  - only for INET
  - only for D4
  - only for D5
  - only for D6
  - only for RG1
  - Mailing Lists
  - by Speaker
  - by Type
  - by Category
  - by Title
  - Calendar
  - RSS Feed
- History of Events:

Event Entry

What and Who

Efficient Main Memory-based XML Stream Processing

Stefanie Scherzinger

Fachrichtung Informatik - Saarbrücken

Vortrag

AG 5

AG Audience

English

Note: We use this to send email in the morning.

Date, Time and Location

Thursday, 23 August 2007

14:00

45 Minutes

E1 4

433

Saarbrücken

Abstract

XML data management is by no means restricted to XML databases, as there are manifold applications that process XML documents as files or streams. These applications are naturally main-memory based, which makes main memory the bottleneck for scalability. The streaming evaluation of XPath queries has been worked on extensively in the past, and state-of-the-art techniques use very little memory. However, for data transformations with real-world XQueries, as opposed to just XPath filtering, the need for substantial main memory buffers cannot be avoided in general. An important task is thus to devise a well-principled machinery for XQuery processing that is parsimonious with resources and treats buffering as an optimization target. We present a toolkit for effective buffer management in XQuery processors. Our toolkit comprises three compatible buffer management techniques. First, in the spirit of XML document projection, we only load data that is relevant to query evaluation into main memory buffers. We present a novel implementation that takes string matching algorithms designed for efficient keyword search in flat strings into the second dimension, to navigate in tree-structured data. Different from existing prefiltering schemes, we usually process only fractions of the input and get by with very economical consumption of both main memory and processing time. Second, we introduce an extension of the XQuery language, called FluX, that supports event-based query processing and the conscious handling of main memory buffers. Purely event-based queries of this language can be executed on streaming XML data in a very direct way. We then develop an algorithm to efficiently rewrite XQueries into the event-based FluX language. This algorithm uses order constraints from a DTD to schedule event handlers and to thus minimize the amount of buffering required for evaluating a query. Third, we purge buffers preemptively from data that is no longer relevant to query evaluation. This is accomplished by extending garbage collection techniques originally developed for programming languages to operate on tree-structured data. By the combined effort of static and dynamic analysis, we timely purge buffers and effectively reduce their size. Our prototype implementations and extensive experiments demonstrate the efficiency of our techniques. To compare our contributions to related work in a systematic manner, we introduce an abstract framework for XML stream processing. Our model captures XML stream tokens as well as the current buffer contents, and lets us model different architectures for XQuery evaluation. Using our framework, we show the interoperability of our tools with existing XQuery optimization techniques.

Contact

Petra Schaaf

500

--email hidden

System used:

Meeting URL:

Meeting ID:

Passcode:

passcode not visible

Code Visible for:

logged in users only

Petra Schaaf, 07/23/2007 13:16 -- Created document.

Imprint / Impressum | Data Protection / Datenschutzhinweis