Campus Event Calendar

Event Entry

What and Who

Articulated People Detection and Pose Estimation in Challenging Real World Environments

Leonid Pishchulin
Max-Planck-Institut für Informatik - D2
Promotionskolloquium
AG 1, AG 2, AG 3, AG 4, AG 5, RG1, SWS, MMCI  
Public Audience
English

Date, Time and Location

Tuesday, 31 May 2016
16:00
60 Minutes
E1 4
021
Saarbrücken

Abstract

In this thesis we address the problems of articulated people detection and pose
estimation, which are key ingredients for understanding visual scenes containing
people. Although extensive efforts are being made to address these problems, we
identify three promising directions that, we believe, have not received sufficient
attention recently.
First, we investigate how statistical 3D human shape models from computer
graphics can be leveraged to ease training data generation. We propose a range
of automatic data generation techniques that allow relevant variations to be
represented directly in the training data. Sampling from both the underlying human shape
distribution and a large dataset of human poses allows us to generate novel samples
with controllable shape and pose variations that are relevant for the task at hand.
Furthermore, we improve the state-of-the-art 3D human shape model itself by rebuilding
it from a large commercially available dataset of 3D bodies.
Second, we develop expressive spatial and strong appearance models for 2D
single- and multi-person pose estimation. We propose an expressive single person
model that incorporates higher order part dependencies while remaining efficient.
We augment this model with various types of strong appearance representations
aiming to substantially improve the body part hypotheses. Finally, we propose
an expressive model for joint pose estimation of multiple people. To that end, we
develop strong deep learning based body part detectors and an expressive fully
connected spatial model. The proposed approach treats multi-person pose estimation
as a joint partitioning and labeling problem of a set of body part hypotheses: it infers
the number of persons in a scene, identifies occluded body parts and disambiguates
body parts between people in close proximity to each other.
Third, we perform thorough evaluation and performance analysis of leading
human pose estimation and activity recognition methods. To that end we introduce a
novel benchmark that makes a significant advance in terms of diversity and difficulty,
compared to the previous datasets, and includes over 40,000 annotated body poses
and over 1.5M frames. Furthermore, we provide a rich set of labels which are used to
perform a detailed analysis of competing approaches, gaining insights into the successes
and failures of these methods.
In summary, this thesis presents a novel approach to articulated people detection
and pose estimation. Thorough experimental evaluation on standard benchmarks
demonstrates significant improvements due to the proposed data augmentation techniques
and novel body models, while detailed performance analysis of competing
approaches on our newly introduced large-scale benchmark allows us to identify the
most promising directions of improvement.

Contact

Connie Balzert
0681 9325-2000

Connie Balzert, 04/11/2016 12:29 -- Created document.