Campus Event Calendar

Event Entry

What and Who

Refresh Strategies and Online Change Estimation for Highly Dynamic Web Content

Roxana Gabriela Horincar on Friday, Nov 23, 2012
Universite Pierre et Marie Curie, Paris
Postdoc Application Talk
AG 5  
AG Audience

Date, Time and Location

Friday, 23 November 2012
60 Minutes
E1 4


Online web content has recently become increasingly diverse and
dynamic, driven by the rapidly growing number of sources and
devices connected to the Internet and by the growing success of
Web 2.0 services. To facilitate the efficient dissemination of
evolving and often short-lived information streams (news,
messages, announcements), many web applications publish their
most recent information items as RSS documents, which are then
collected and transformed by RSS aggregators such as Google
Reader or Yahoo! News.

This talk discusses the general problem of efficiently crawling
highly dynamic web data in the context of content-based feed
aggregation systems. More precisely, I introduce an optimal
best-effort refresh strategy that maximizes feed aggregation
quality (completeness and freshness) and takes the feed
saturation process into account. Furthermore, I analyze the
characteristics of a representative collection of real-world RSS
feeds, focusing on their temporal dimension. Finally, I study
different online change estimation techniques and show how they
integrate with the feed refresh strategy.


Gerhard Weikum

Petra Schaaf, 11/20/2012 09:33 -- Created document.