Campus Event Calendar

Event Entry

What and Who

Towards Holistic Machines: From Visual Recognition To Question Answering About Real-World Images

Mateusz Malinowski
Max-Planck-Institut für Informatik - D2
Promotionskolloquium
AG 1, AG 2, AG 3, AG 4, AG 5, RG1, SWS, MMCI  
Public Audience
English

Date, Time and Location

Tuesday, 20 June 2017
17:00
60 Minutes
E1 4
024
Saarbrücken

Abstract

Computer vision has undergone major changes over the past five years.
With advances in deep learning and the creation of large-scale datasets,
progress has been particularly strong on the image classification task.
We therefore investigate whether the performance of such architectures
generalizes to more complex tasks that require a more holistic approach
to scene comprehension.

The presented work focuses on learning spatial and multi-modal
representations, and on the foundations of a Visual Turing Test, in which
scene understanding is tested by a series of questions about the scene's
content. In our studies, we propose DAQUAR, the first dataset for
‘question answering about real-world images’, together with two methods
that address the problem: a symbolic-based and a neural-based visual
question answering architecture. The symbolic-based method relies on a
semantic parser, a database of visual facts, and a Bayesian formulation
that accounts for various interpretations of the visual scene.
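
As a rough illustration of the Bayesian idea behind the symbolic-based method, the sketch below marginalizes the answer over several candidate interpretations of one scene. The facts, probabilities, and function names are hypothetical and chosen only for illustration. This is not the implementation presented in the talk, and the semantic parser is replaced by a direct lookup.

```python
from collections import defaultdict

# Hypothetical candidate interpretations ("worlds") of one scene.
# Each world is a set of visual facts plus the probability that a
# perception system assigns to it; the probabilities sum to 1.
WORLDS = [
    ({"color(chair)": "red", "on(table)": "lamp"}, 0.6),
    ({"color(chair)": "brown", "on(table)": "lamp"}, 0.3),
    ({"color(chair)": "red", "on(table)": "monitor"}, 0.1),
]


def answer_distribution(query, worlds):
    """Return P(answer | query) = sum_W P(answer | query, W) * P(W).

    P(answer | query, W) is deterministic here: the answer is whatever
    fact the world stores for the query (a stand-in for executing a
    parsed logical form against a database of visual facts).
    """
    posterior = defaultdict(float)
    for facts, prob in worlds:
        if query in facts:
            posterior[facts[query]] += prob
    return dict(posterior)


def best_answer(query, worlds):
    """Pick the answer with the highest marginal probability."""
    dist = answer_distribution(query, worlds)
    return max(dist, key=dist.get)
```

Calling `best_answer("on(table)", WORLDS)` picks the answer supported by the most probable set of interpretations taken together, rather than committing to a single interpretation of the scene up front.
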
The neural-based method is an end-to-end architecture composed of a
question encoder, an image encoder, a multimodal embedding, and an answer
decoder. This architecture has proven effective at capturing
language-based biases. It has also become a standard component of other
visual question answering architectures.
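
The four components named above can be sketched as a toy pipeline. Everything here is an assumption made for illustration: the vocabulary, the dimensions, the bag-of-words question encoder, the linear image encoder, concatenation as the multimodal embedding, and a single-word softmax classifier standing in for the answer decoder. The weights are random and untrained, so the sketch shows only how data flows through the stages, not the architecture presented in the talk.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible

VOCAB = ["what", "is", "on", "the", "table", "color", "chair"]  # hypothetical
ANSWERS = ["lamp", "red", "monitor"]                             # hypothetical
EMBED_DIM, IMAGE_DIM = 4, 6


def random_matrix(rows, cols):
    """Untrained toy weights; a real system would learn these end to end."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


WORD_EMB = {w: [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for w in VOCAB}
W_IMAGE = random_matrix(EMBED_DIM, IMAGE_DIM)
W_ANSWER = random_matrix(len(ANSWERS), 2 * EMBED_DIM)


def encode_question(question):
    """Question encoder: average of the embeddings of known words."""
    words = question.lower().replace("?", "").split()
    vectors = [WORD_EMB[w] for w in words if w in WORD_EMB]
    if not vectors:
        return [0.0] * EMBED_DIM
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def encode_image(features):
    """Image encoder: linear projection of precomputed image features."""
    return matvec(W_IMAGE, features)


def answer_probabilities(question, image_features):
    """Multimodal embedding (concatenation) followed by a softmax decoder."""
    joint = encode_question(question) + encode_image(image_features)
    scores = matvec(W_ANSWER, joint)
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return {a: e / total for a, e in zip(ANSWERS, exps)}
```

For example, `answer_probabilities("What is on the table?", [0.5] * IMAGE_DIM)` returns a probability for each candidate answer. Because the question encoder feeds the decoder directly, a model of this shape can answer partly from the question alone, which is one way language-based biases can be picked up.
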
Along with the methods, we also investigate various evaluation metrics
that embrace uncertainty in word meaning and the various interpretations
of the scene and the question.
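
One way to picture a metric that tolerates uncertainty in word meaning is a soft accuracy, where a prediction earns partial credit according to how similar it is to the reference answer. The sketch below assumes a hypothetical hand-written similarity table; the scores and function names are illustrative only and do not reproduce the metrics investigated in the talk.

```python
# Hypothetical word-similarity table with scores in [0, 1]; a real metric
# would derive these from a lexical resource or from learned embeddings.
SIMILARITY = {
    frozenset(["cup", "mug"]): 0.9,
    frozenset(["sofa", "couch"]): 1.0,
    frozenset(["cup", "chair"]): 0.1,
}


def word_similarity(a, b):
    """Symmetric similarity; identical words score 1, unknown pairs 0."""
    if a == b:
        return 1.0
    return SIMILARITY.get(frozenset([a, b]), 0.0)


def soft_accuracy(predictions, references):
    """Mean similarity between each predicted word and its reference.

    With a 0/1 similarity this reduces to plain exact-match accuracy.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must align")
    if not predictions:
        return 0.0
    scores = [word_similarity(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)
```

Under this sketch, answering "mug" where the reference is "cup" is penalized far less than answering "chair", whereas exact-match accuracy would treat both as equally wrong.
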

Contact

Connie Balzert
9325-2000

Connie Balzert, 06/08/2017 10:07 -- Created document.