Campus Event Calendar: Silke Trißl (08/03/2007 in E1 4/433 (Rotunda 4th floor))


Event Entry

What and Who

Scoring Search Results in the Presence of Overlapping Data Sources

Silke Trißl

Humboldt-Universität zu Berlin

Talk

http://www2.informatik.hu-berlin.de/~trissl/

AG 1, AG 3, AG 5, RG2, AG 2, AG 4, RG1, SWS

AG Audience

English


Date, Time and Location

Friday, 3 August 2007

14:00

90 Minutes

E1 4

433 (Rotunda 4th floor)

Saarbrücken

Abstract

Data integration projects in the life sciences often gather data on a particular subject from multiple sources. Some of these sources overlap to a certain degree. Therefore, an integrated search result may be supported by one, a few, or all data sources. To reflect these differences, the results of a query should be ranked according to the number of data sources that support them.

In my talk I will discuss what such a ranking scheme should look like, as this is not clear per se. Results supported by only a few sources could be ranked high, because the information is potentially new, or ranked low, because the evidence supporting them is limited. We defined a surprisingness score, which prefers results supported by few sources, and a confidence score, which prefers frequently encountered information. Unlike many other scoring schemes, our proposal is purely data-driven and does not require users to specify preferences among sources. Both scores take the concrete overlaps of data sources into account and do not presume statistical independence. I will also present some results obtained with the Columba database. Columba is an integrated database on protein structures from the PDB and their annotations, such as fold, function, or sequence.
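To make the two ranking directions concrete, here is a minimal sketch that ranks results purely by the number of sources supporting them. It is not the surprisingness or confidence score from the talk, which additionally account for the concrete overlaps between sources; the function names, the `prefer` parameter, and the input layout (source name mapped to a set of result identifiers) are assumptions made for illustration only.

```python
from collections import Counter


def support_counts(sources):
    """Count, for each result, how many sources contain it.

    `sources` maps a source name to an iterable of result identifiers
    (hypothetical layout, chosen for this sketch).
    """
    counts = Counter()
    for results in sources.values():
        # set() ensures a source is counted at most once per result
        counts.update(set(results))
    return counts


def rank(sources, prefer="confidence"):
    """Rank results by naive support count.

    prefer="confidence": results found in many sources come first.
    prefer="surprise":   results found in few sources come first.
    Ties are broken by result identifier to keep the order deterministic.
    """
    counts = support_counts(sources)
    sign = -1 if prefer == "confidence" else 1
    return sorted(counts, key=lambda r: (sign * counts[r], r))
```

Calling `rank(sources, prefer="surprise")` on the same input simply reverses the preference, which mirrors the two opposite readings of "few supporting sources" described in the abstract.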

Contact

Ralf Schenkel

+49 681 9325 504


Ralf Schenkel, 07/26/2007 12:55 -- Created document.