Max-Planck-Institut für Informatik

MPI-INF or MPI-SWS or Local Campus Event Calendar

What and Who
Title:Towards Holistic Machines: From Visual Recognition To Question Answering About Real-World Images
Speaker:Mateusz Malinowski
coming from:Max-Planck-Institut für Informatik - D2
Speakers Bio:
Event Type:Promotionskolloquium
Visibility:D1, D2, D3, D4, D5, RG1, SWS, MMCI
Level:Public Audience
Date, Time and Location
Date:Tuesday, 20 June 2017
Duration:60 Minutes
Building:E1 4
Computer Vision has undergone major changes over the past five years. With advances in Deep Learning and the creation of large-scale datasets, progress has been particularly strong on the image classification task. We therefore investigate whether the performance of such architectures generalizes to more complex tasks that require a more holistic approach to scene comprehension.

The presented work focuses on learning spatial and multi-modal representations, and on the foundations of a Visual Turing Test, in which scene understanding is tested by a series of questions about the scene's content.
In our studies, we propose DAQUAR, the first ‘question answering about real-world images’ dataset, together with two methods that address the problem: a symbolic-based and a neural-based visual question answering architecture. The symbolic-based method relies on a semantic parser, a database of visual facts, and a Bayesian formulation that accounts for various interpretations of the visual scene. The neural-based method is an end-to-end architecture composed of a question encoder, an image encoder, a multimodal embedding, and an answer decoder. This architecture has proven effective in capturing language-based biases, and it has become a standard component of other visual question answering architectures. Along with the methods, we also investigate various evaluation metrics that embrace uncertainty in word meaning and the various interpretations of the scene and the question.
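
The four-stage layout of the neural-based method described above (question encoder, image encoder, multimodal embedding, answer decoder) can be illustrated with a small toy sketch. This is not the speaker's implementation: the vocabulary, answer list, dimensions, random weights, and all function names are hypothetical, a mean of word embeddings stands in for a learned question encoder, and a linear projection of precomputed features stands in for a learned image encoder.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible

DIM = 8                                    # hypothetical embedding size
IMG_DIM = 16                               # hypothetical image feature size
VOCAB = ["what", "is", "on", "the", "table", "chair", "color"]
ANSWERS = ["lamp", "book", "red", "two"]   # hypothetical answer vocabulary

def rand_matrix(rows, cols):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

WORD_EMB = {w: [random.uniform(-0.1, 0.1) for _ in range(DIM)] for w in VOCAB}
W_IMG = rand_matrix(DIM, IMG_DIM)      # image features -> DIM
W_JOINT = rand_matrix(DIM, 2 * DIM)    # fused question and image codes -> DIM
W_OUT = rand_matrix(len(ANSWERS), DIM) # joint code -> answer scores

def encode_question(question):
    """Question encoder: mean of word embeddings (assumed stand-in)."""
    vecs = [WORD_EMB[w] for w in question.lower().split() if w in WORD_EMB]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def encode_image(features):
    """Image encoder: linear projection of precomputed features (assumed stand-in)."""
    return matvec(W_IMG, features)

def embed_multimodal(q_code, img_code):
    """Multimodal embedding: concatenate both codes, then apply a tanh layer."""
    return [math.tanh(x) for x in matvec(W_JOINT, q_code + img_code)]

def decode_answer(joint):
    """Answer decoder: softmax over the fixed answer vocabulary."""
    logits = matvec(W_OUT, joint)
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return {a: e / total for a, e in zip(ANSWERS, exps)}

def answer_question(question, image_features):
    """Run the full pipeline and return the top answer with all probabilities."""
    joint = embed_multimodal(encode_question(question), encode_image(image_features))
    probs = decode_answer(joint)
    return max(probs, key=probs.get), probs
```

Calling `answer_question("what is on the table", [0.5] * IMG_DIM)` returns one entry of `ANSWERS` together with a probability for each candidate. With untrained random weights the prediction is meaningless; the sketch only shows how the four components connect end to end.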

Name(s):Connie Balzert
Video Broadcast
Video Broadcast:No
To Location:
Tags, Category, Keywords and additional notes
Attachments, File(s):

Connie Balzert/MPI-INF, 06/08/2017 10:07 AM
Last modified:
halma/MPII/DE, 10/29/2017 12:00 AM
  • Connie Balzert, 06/08/2017 10:07 AM -- Created document.