Campus Event Calendar

Event Entry

New for: D1, D2

What and Who

Information Retrieval on the World Wide Web

Dr. M. Henzinger
DIGITAL Systems Research Center
Informatik-Kolloquium
AG 1, AG 2  
AG Audience

Date, Time and Location

Saturday, 27 June 98
09:00
60 Minutes
45 - FB14
HS001
Saarbrücken

Abstract

Information retrieval is the process of helping users find, use, and understand information in a given document collection in order to satisfy an information need. On the web, conventional information retrieval techniques work less well than they do on conventional document collections, because the web is very large and document quality varies widely.

We discuss two information retrieval problems on the web and present novel solutions. First, we discuss the ranking problem, namely, the problem of ordering the web pages retrieved by a search engine in decreasing order of relevance to the user query. Second, we discuss the similarity problem, namely, the problem of finding web pages that are similar to a given page. For both problems we present new algorithms, which for the first time combine connectivity analysis with content analysis. Connectivity analysis, which uses information about the hyperlink structure of the web, is based on graph algorithms; content analysis, which uses information about the contents of web pages, is based on conventional information retrieval techniques. According to a user study, our ranking algorithm increases the precision at 10 (i.e., the number of relevant pages within the first 10 pages) by 45% over the algorithms currently in use.
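
To make the evaluation measure concrete, the following is a minimal sketch of how precision at 10 and a relative improvement between two rankings could be computed. It is not the speaker's evaluation code: the function names, the page identifiers, and the relevance judgments are hypothetical, and the numbers are toy values unrelated to the 45% result from the user study. The function returns the fraction of relevant pages among the first 10 (the count divided by 10); the relative improvement is the same whether the count or the fraction is used.

```python
def precision_at_k(ranked_pages, relevant_pages, k=10):
    """Fraction of the first k ranked pages that are judged relevant."""
    top_k = ranked_pages[:k]
    hits = sum(1 for page in top_k if page in relevant_pages)
    return hits / k


def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline` (0.5 means +50%)."""
    return (improved - baseline) / baseline


# Hypothetical relevance judgments and two hypothetical rankings.
relevant = {"p1", "p3", "p4", "p7", "p8", "p9"}
baseline_ranking = ["p1", "p2", "p3", "p5", "p6", "p4", "p10", "p11", "p12", "p7"]
improved_ranking = ["p1", "p3", "p4", "p7", "p2", "p8", "p9", "p5", "p6", "p10"]

p_base = precision_at_k(baseline_ranking, relevant)  # 4 relevant pages in the top 10
p_new = precision_at_k(improved_ranking, relevant)   # 6 relevant pages in the top 10
gain = relative_improvement(p_base, p_new)           # roughly 0.5, i.e. about +50%
print(f"P@10 baseline={p_base:.2f}, improved={p_new:.2f}, gain={gain:.0%}")
```

In a user study such as the one mentioned above, the measure would typically be averaged over many queries before the two algorithms are compared.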

Contact

Christa Schaefer

Tags, Category, Keywords and additional notes

Invitation to the Colloquium of the Department of Computer Science


On Saturday, 27 June 1998, at 9:00 in Lecture Hall 001, Building 45,

Dr. (Princeton University) M. Henzinger
DIGITAL Systems Research Center
http://www.research.digital.com/SRC/personal/Monika_Henzinger/home.html


will speak on the topic:
Information Retrieval on the World Wide Web







All interested parties are cordially invited to the talk.

The faculty of the Department of Computer Science

Colloquium announcements can also be read at http://www.cs.uni-sb.de/kolloquien/.