Campus Event Calendar

Event Entry

What and Who

Breast cancer prediction on mammography images

Claudia Perlich
IBM Research
Talk

Claudia Perlich received her M.Sc. in Computer Science from the University of Colorado at Boulder, her Diplom in Computer Science from Technische Universität Darmstadt, and her Ph.D. in Information Systems from the Stern School of Business, New York University. Her Ph.D. thesis concentrated on probability estimation in multi-relational domains, which capture information about multiple entity types and the relationships between them. Her dissertation was recognized as an additional winner of the International SAP Doctoral Support Award Competition, and her submission placed second in the annual data mining competition in 2003 (KDD-Cup 03).
Claudia joined the Data Analytics Research group as a Research Staff Member in October 2004. Her research interests are in machine learning for complex real-world domains including marketing, finance and medicine. She and her team have been very successful in data mining competitions. Her recent wins include KDD CUP 2007, 2008 and 2009.
AG 1, AG 4, RG1, MMCI, AG 3, AG 5, SWS  
AG Audience
English

Date, Time and Location

Tuesday, 23 June 2009
12:30
60 Minutes
E1 4
024
Saarbrücken

Abstract

The KDD CUP 2008 was organized by Siemens Medical Solutions ( http://www.kddcup2008.com/ ), which provided mammography-based data for around 1700 patients. Siemens used proprietary software to extract candidate regions from the original digital image data and to characterize each region in terms of 117 normalized numeric features with unknown interpretation. Task 1 was the identification of malignant candidate regions in mammography images, scored with a ranking-based evaluation measure similar to ROC. Task 2 required submitting the longest possible list of healthy patients; any submission with even one false negative was disqualified. Our winning submissions to both tasks exploited a) the properties of the evaluation metrics to improve the model scores of a linear SVM and b) some form of data leakage that resulted in predictive information in the patient identifiers.
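
The Task 2 rule can be illustrated with a minimal Python sketch. It is not the speakers' method: it assumes, purely for illustration, that each patient's score is the maximum model score over that patient's candidate regions, and the function names and toy data are hypothetical. Given known labels, it shows how long a list of healthy patients can grow before the first false negative would disqualify it.

```python
from collections import defaultdict


def patient_scores(candidates):
    """Aggregate (patient_id, region_score) pairs into one score per patient.

    Assumption for illustration: a patient is as suspicious as the most
    suspicious candidate region, so the maximum region score is used.
    """
    scores = defaultdict(lambda: float("-inf"))
    for patient_id, score in candidates:
        scores[patient_id] = max(scores[patient_id], score)
    return dict(scores)


def longest_healthy_list(scores, malignant_patients):
    """Return the longest list of least-suspicious patients that contains
    no malignant patient (i.e. no false negative)."""
    # Rank patients from least to most suspicious; ties broken by id.
    ranked = sorted(scores, key=lambda p: (scores[p], p))
    healthy = []
    for patient in ranked:
        if patient in malignant_patients:
            break  # including this patient would be a false negative
        healthy.append(patient)
    return healthy


# Hypothetical toy data: (patient id, model score of one candidate region).
candidates = [
    ("p1", 0.10), ("p1", 0.30),
    ("p2", 0.05),
    ("p3", 0.90),
    ("p4", 0.20),
    ("p5", 0.60),
]
malignant = {"p3", "p5"}

result = longest_healthy_list(patient_scores(candidates), malignant)
print(result)
```

On real competition data the labels of the test patients are unknown, so the cut-off has to be chosen from the model scores alone; the sketch only shows what the evaluation rewards.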

Contact

Alice McHardy

Tags, Category, Keywords and additional notes

Cancer, Image Processing

Uwe Brahm, 06/22/2009 11:24
Uwe Brahm, 06/22/2009 11:19 -- Created document.