Max-Planck-Institut für Informatik

MPI-INF or MPI-SWS or Local Campus Event Calendar

What and Who
Title:Information-Theoretic Feature Selection for Continuous Data
Speaker:Panagiotis Mandros
coming from:International Max Planck Research School for Computer Science - IMPRS
Speakers Bio:
Event Type:PhD Application Talk
Visibility:D1, D2, D3, D4, D5, SWS, RG1, MMCI
Level:Public Audience
Language:English
Date, Time and Location
Date:Monday, 26 October 2015
Time:11:20
Duration:90 Minutes
Location:Saarbrücken
Building:E1 4
Room:024
Abstract
Feature selection is a dimensionality reduction process that judiciously selects a subset of features from which to build a prediction model. In particular, it alleviates computational cost and the problems of high dimensionality by reducing the number of dimensions one has to consider. Recognizing the importance of feature selection to data analysis, in this thesis we aim to contribute an approach in the spirit of Exploratory Data Analysis, the discipline of working directly with empirical data without making unnecessary assumptions (e.g., assumptions on data distributions).
With this goal in mind, we propose a parameter-free and assumption-free supervised information-theoretic feature selection method for data with continuous attributes and a discrete/nominal class label. The novelty of our method lies in working directly with continuous data by employing Cumulative Entropy, a new entropy measure designed specifically for continuous random variables. While our objective is to work directly with continuous data, we will show that some computations require discretization, a process that converts continuous data to discrete data. In such cases, we discretize our data optimally, non-parametrically, and efficiently with dynamic programming, completely avoiding the naive discretizations commonly used. Our method aims at identifying higher-order dependencies in the data, which differentiates it from methods that search only for pairwise dependencies. To evaluate the benefits of our method, we perform extensive experiments on both synthetic and real-world data sets. The results show that our method compares favorably to state-of-the-art information-theoretic feature selection techniques.
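As an illustration of the Cumulative Entropy idea mentioned above, the following is a minimal sketch and not the speaker's implementation. It assumes the commonly used definition CE(X) = -∫ F(x) log F(x) dx, where F is the cumulative distribution function, and estimates it from a sorted sample using the empirical CDF. The function names `cumulative_entropy` and `conditional_cumulative_entropy` are hypothetical, and weighting the per-class values by class frequency is likewise an assumption about how a nominal class label might be taken into account.

```python
import math
from collections import defaultdict


def cumulative_entropy(values):
    """Empirical cumulative entropy of a 1-D sample (assumed estimator).

    Uses -sum_{i=1}^{n-1} (x_(i+1) - x_(i)) * (i/n) * log(i/n)
    over the sorted sample x_(1) <= ... <= x_(n).
    """
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        return 0.0
    total = 0.0
    for i in range(1, n):
        p = i / n  # empirical CDF on the interval [x_(i), x_(i+1))
        total -= (xs[i] - xs[i - 1]) * p * math.log(p)
    return total


def conditional_cumulative_entropy(values, labels):
    """Class-frequency-weighted cumulative entropy of values within each class.

    Hypothetical helper: a feature whose values are tightly grouped
    within each class gets a small conditional value.
    """
    groups = defaultdict(list)
    for v, c in zip(values, labels):
        groups[c].append(v)
    n = len(values)
    return sum(len(g) / n * cumulative_entropy(g) for g in groups.values())


if __name__ == "__main__":
    feature = [0.0, 1.0, 10.0, 11.0]
    classes = ["a", "a", "b", "b"]
    print(cumulative_entropy(feature))
    print(conditional_cumulative_entropy(feature, classes))
```

No binning of the feature is needed here: the estimator works on the raw continuous values, and only the class label is discrete. A large drop from the unconditional to the conditional value suggests that the class label explains much of the spread of the feature.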

Contact
Name(s):Andrea Ruffing
Video Broadcast
Video Broadcast:No
To Location:
Tags, Category, Keywords and additional notes
Note:
Attachments, File(s):

Created by:Andrea Ruffing/MPI-INF, 10/23/2015 06:55 PM
Last modified by:Uwe Brahm/MPII/DE, 11/24/2016 04:13 PM
  • Andrea Ruffing, 10/23/2015 06:57 PM -- Created document.