Campus Event Calendar

Event Entry

What and Who

Audiovisual Data Processing for Robust Human-Machine-Communication and Media Retrieval

Björn Schuller
Lehrstuhl für Mensch-Maschine-Kommunikation, TU München, Germany
Talk

Björn Schuller received his diploma (1999) and his doctoral degree (2006) in electrical engineering and information technology for his work in Automatic Speech and Emotion Recognition from TUM (Munich University of Technology), one of Germany's first three Excellence Universities, where he currently works as a senior researcher and lecturer in Pattern Recognition and Speech Processing and is an officially acknowledged candidate for the PD Dr.-Ing. habil. degree. He is a member of the ACM, IEEE, and ISCA, and has authored and co-authored more than 100 publications in books, journals, and peer-reviewed conference proceedings in the fields of signal processing and machine learning. He is best known for his work advancing Speech Processing, Affective Computing, and Music Information Retrieval. He has served as a reviewer for several scientific journals, and as an invited speaker, session and challenge organizer, chairman, and programme committee member of numerous international conferences. His project steering board activity and involvement in current and past research projects include SEMAINE, funded by the European Community's Seventh Framework Programme, the HUMAINE CEICES initiative, and projects funded by companies such as BMW, Continental, Daimler, Siemens, Toyota, and VDO. His advisory board activities comprise membership as an invited expert in the W3C Emotion Incubator and Emotion Markup Language Incubator Groups, and his election to the Executive Committee of the HUMAINE Association, where he chairs the Special Interest Group on Emotion Recognition from Speech.
AG 4  
AG Audience
English

Date, Time and Location

Tuesday, 17 March 2009
13:00
30 Minutes
E1 4
019
Saarbrücken

Abstract

Audiovisual signal processing approaches are widely agreed to be superior to their unimodal counterparts with respect to robustness, fail-safety, and user comfort in a multiplicity of Human-Computer Interaction and Multimedia Retrieval tasks. Typical application scenarios comprise both synergistic and concurrent multimodality. The main integration problem is usually the asynchrony of the audio and video cues, or of additional textual information. This talk therefore provides a short introduction to early, late, and hybrid integration strategies. Emphasis is placed on preserving as much of the available knowledge as possible during the synchronization and integration of the streams. To this end, diverse machine learning approaches are discussed, comprising Graphical Models, Multidimensional Dynamic Time Warping, and Meta-Classification. Insight into their effectiveness is given through a number of recent application scenarios, such as multimodal Emotion and Behaviour Recognition, Meeting Segmentation, and Music Retrieval, selected to cover the named types.
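As an illustration of the late (decision-level) integration strategy mentioned in the abstract, the sketch below fuses per-modality class posteriors by a weighted sum before a final decision. This is a generic textbook formulation, not the speaker's specific method; the function name and the weighting scheme are illustrative assumptions.

```python
import numpy as np

def late_fusion(audio_probs, video_probs, w_audio=0.5):
    """Weighted late fusion of per-modality class posteriors.

    audio_probs, video_probs: class-posterior vectors from independent
    unimodal classifiers (illustrative; any per-stream model would do).
    w_audio: assumed relative trust in the audio stream.
    Returns the fused class index and the fused posterior vector.
    """
    audio_probs = np.asarray(audio_probs, dtype=float)
    video_probs = np.asarray(video_probs, dtype=float)
    # Decision-level combination: each stream is classified separately,
    # and only the posteriors are merged.
    fused = w_audio * audio_probs + (1.0 - w_audio) * video_probs
    return int(np.argmax(fused)), fused

# Example: audio weakly favours class 0, video strongly favours class 1.
label, fused = late_fusion([0.6, 0.4], [0.2, 0.8], w_audio=0.4)
```

Because each stream is classified on its own timeline, late fusion sidesteps the audio/video asynchrony problem entirely, at the cost of discarding cross-modal correlations that early or hybrid integration could exploit.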

Contact

Meinard Müller
+49 681 9325 405
--email hidden

Thorsten Thormählen, 02/24/2009 11:27 -- Created document.