Proceedings Article, Paper
@InProceedings
Contribution to conference proceedings, Workshop


Author, Editor

Author(s):

Li, Zhao
Herfet, Thorsten
Grochulla, Martin
Thormählen, Thorsten


Not MPG Author(s):

Li, Zhao
Herfet, Thorsten

Editor(s):





BibTeX cite key*:

Grochulla2012a

Title, Booktitle

Title*:

Multiple active speaker localization based on audio-visual fusion in two stages

Booktitle*:

2012 IEEE Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)

Event, URLs

URL of the conference:

http://mfi-2012.informatik.uni-hamburg.de/

URL for downloading the paper:


Event Address*:

Hamburg, Germany

Language:

English

Event Date*
(no longer used):


Organization:


Event Start Date:

13 September 2012

Event End Date:

15 September 2012

Publisher

Name*:

IEEE

URL:

http://www.ieee.org

Address*:

Piscataway, NJ

Type:


Vol, No, Year, pp.

Series:


Volume:


Number:


Month:

November

Pages:

262-268

Year*:

2012

VG Wort Pages:


ISBN/ISSN:

978-1-4673-2510-3

Sequence Number:


DOI:

10.1109/MFI.2012.6343015



Note, Abstract, ©


(LaTeX) Abstract:

Localization of multiple active speakers in natural environments with only two microphones is a challenging problem. Reverberation degrades the performance of speaker localization based exclusively on directional cues. The audio modality alone suffers from limited localization accuracy, while the video modality alone suffers from false speaker activity detections. This paper presents an approach based on audio-visual fusion in two stages. In the first stage, speaker activity is detected using audio-visual fusion, which can handle false lip movements. In the second stage, a Gaussian fusion method is proposed to integrate the estimates of both modalities. As a consequence, localization accuracy and robustness are significantly increased compared to using the audio or video modality alone. Experimental results in various scenarios confirmed the improved performance of the proposed system.
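
The abstract does not spell out the Gaussian fusion step. As a rough illustration only, the sketch below assumes that each modality reports a one-dimensional position estimate (for example an azimuth angle in degrees) with a Gaussian uncertainty, and that the two estimates are independent. Under those assumptions, multiplying the two densities gives an inverse-variance weighted mean. The function name `fuse_gaussian` and all numbers are hypothetical and are not taken from the paper.

```python
def fuse_gaussian(mu_a, var_a, mu_v, var_v):
    """Fuse two independent 1-D Gaussian estimates by multiplying their densities.

    mu_a, var_a: mean and variance of the audio estimate (assumed inputs).
    mu_v, var_v: mean and variance of the video estimate (assumed inputs).
    Returns the mean and variance of the normalized product Gaussian.
    """
    if var_a <= 0 or var_v <= 0:
        raise ValueError("variances must be positive")
    # Precision (inverse variance) acts as the weight of each modality.
    w_a = 1.0 / var_a
    w_v = 1.0 / var_v
    var = 1.0 / (w_a + w_v)
    mu = var * (w_a * mu_a + w_v * mu_v)
    return mu, var


# Hypothetical example: a coarse audio estimate and a sharper video estimate.
mu, var = fuse_gaussian(mu_a=20.0, var_a=16.0, mu_v=12.0, var_v=4.0)
print(mu, var)
```

In this sketch the fused variance is always smaller than either input variance, and the fused mean is pulled toward the more certain modality, which matches the intuition of combining a less accurate audio estimate with a more accurate video estimate.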



Download
Access Level:

Internal

Correlation

MPG Unit:

Max-Planck-Institut für Informatik



MPG Subunit:

Computer Graphics Group

Audience:

experts only

Appearance:

MPII WWW Server, MPII FTP Server, MPG publications list, university publications list, working group publication list, Fachbeirat, VG Wort



BibTeX Entry:

@INPROCEEDINGS{Grochulla2012a,
AUTHOR = {Li, Zhao and Herfet, Thorsten and Grochulla, Martin and Thorm{\"a}hlen, Thorsten},
TITLE = {Multiple active speaker localization based on audio-visual fusion in two stages},
BOOKTITLE = {2012 IEEE Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)},
PUBLISHER = {IEEE},
YEAR = {2012},
PAGES = {262--268},
ADDRESS = {Hamburg, Germany},
MONTH = {November},
ISBN = {978-1-4673-2510-3},
DOI = {10.1109/MFI.2012.6343015},
}


Entry last modified by Oliver Klehm, 02/12/2013
Edit History:

Created by Martin Peter Grochulla, 02/08/2013 05:05:48 PM

Revision 1: Oliver Klehm, 02/12/2013 07:13:04 PM
Revision 0: Martin Peter Grochulla, 02/08/2013 05:05:48 PM