Campus Event Calendar

Event Entry

What and Who

Information-Theoretic Feature Selection for Continuous Data

Panagiotis Mandros
International Max Planck Research School for Computer Science - IMPRS
PhD Application Talk
AG 1, AG 2, AG 3, AG 4, AG 5, SWS, RG1, MMCI  
Public Audience
English

Date, Time and Location

Monday, 26 October 2015
11:20
90 Minutes
E1 4
024
Saarbrücken

Abstract

Feature selection is a dimensionality reduction process that judiciously selects a subset of features from which to build a prediction model. In particular, it alleviates computational and high-dimensionality problems by reducing the number of dimensions one has to consider. Recognizing the importance of feature selection to data analysis, in this thesis we aim to contribute an Exploratory Data Analysis approach, that is, one following the discipline of working directly with empirical data without making unnecessary assumptions (e.g. assumptions on data distributions).
With this goal in mind, we propose a parameter-free and assumption-free supervised information-theoretic feature selection method for data with continuous attributes and a discrete/nominal class label. The novelty of our method lies in working directly with continuous data by employing Cumulative Entropy, a new entropy measure designed specifically for continuous random variables. While our objective is to work directly with continuous data, we will show that some computations require discretization, a process that converts continuous data to discrete data. In such cases, we discretize our data optimally, non-parametrically, and efficiently with dynamic programming, completely avoiding the naive discretizations commonly used. Our method aims at identifying higher-order dependencies in the data, which differentiates it from methods that search only for pairwise dependencies. To evaluate the benefits of our method, we perform extensive experiments on both synthetic and real-world data sets. The results show that our method compares favorably to state-of-the-art information-theoretic feature selection techniques.
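
For readers unfamiliar with Cumulative Entropy, the following is a minimal sketch of how it can be estimated from a sample without discretization. It assumes the standard definition from the literature, h(X) = -∫ P(X ≤ x) log P(X ≤ x) dx, and plugs in the empirical distribution function. The abstract does not state the estimator used in the thesis, so the function name and the details here are illustrative assumptions, not the speaker's implementation.

```python
import math


def empirical_cumulative_entropy(samples):
    """Plug-in estimate of -integral F(x) log F(x) dx using the empirical CDF.

    Assumption: this follows the textbook definition of cumulative entropy;
    it is not taken from the talk itself.
    """
    xs = sorted(samples)
    n = len(xs)
    if n < 2:
        return 0.0  # a single point has no spread to measure
    total = 0.0
    for i in range(1, n):
        p = i / n  # empirical CDF value on the interval [xs[i-1], xs[i])
        gap = xs[i] - xs[i - 1]
        total -= gap * p * math.log(p)
    return total


if __name__ == "__main__":
    print(empirical_cumulative_entropy([0.1, 0.4, 0.35, 0.8, 0.9]))
```

Because the estimate is a weighted sum of gaps between consecutive sorted values, it needs no binning or density estimation; it is zero for a constant sample and scales linearly with the data.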

Contact

Andrea Ruffing

Andrea Ruffing, 10/23/2015 18:57 -- Created document.