MPI-INF D2 Publications: Proceedings Article: Generating Visual Explanations

Proceedings Article, Paper
@InProceedings
Contribution to conference proceedings, workshop

Author, Editor

Author(s):

Hendricks, Lisa Anne
Akata, Zeynep
Rohrbach, Marcus
Donahue, Jeff
Schiele, Bernt
Darrell, Trevor

Not MPG Author(s):

Hendricks, Lisa Anne
Rohrbach, Marcus
Donahue, Jeff
Darrell, Trevor

Editor(s):

BibTeX cite key*:

Akata2016d

Title, Booktitle

Title*:	Generating Visual Explanations
Booktitle*:	The 14th European Conference on Computer Vision (ECCV)

Event, URLs

Conference URL:	http://www.eccv2016.org/
Downloading URL:	https://www.mpi-inf.mpg.de/fileadmin/inf/d2/akata/generating-visual-explanations.pdf
Event Address*:	Amsterdam, The Netherlands	Language:	English
Event Date* (no longer used):		Organization:
Event Start Date:	8 October 2016	Event End Date:	16 October 2016

Publisher

Name*:	Springer	URL:	http://www.springer.com/de/shop?wt_mc=PPC.Google%20AdWords.3.EPR436.DAL_Brand_Springer&gclid=CM3LnJCskc4CFQoo0wodJUIPgg
Address*:	Tiergartenstraße 17, 69121 Heidelberg	Type:

Vol, No, Year, pp.

Series:

Volume:		Number:
Month:		Pages:
Year*:	2016	VG Wort Pages:
ISBN/ISSN:		Sequence Number:
DOI:

Note, Abstract, ©


(LaTeX) Abstract:	Clearly explaining a rationale for a classification decision to an end user can be as important as the decision itself. Existing approaches for deep visual recognition are generally opaque and do not output any justification text; contemporary vision-language models can describe image content but fail to take into account class-discriminative image aspects which justify visual predictions. We propose a new model that focuses on the discriminating properties of the visible object, jointly predicts a class label, and explains why the predicted label is appropriate for the image. Through a novel loss function based on sampling and reinforcement learning, our model learns to generate sentences that realize a global sentence property, such as class specificity. Our results on the CUB dataset show that our model is able to generate explanations which are not only consistent with an image but also more discriminative than descriptions produced by existing captioning methods.


Download Access Level:	Internal

Correlation

MPG Unit:	Max-Planck-Institut für Informatik

MPG Subunit:	Computer Vision and Multimodal Computing
Audience:	experts only
Appearance:	MPII WWW Server, MPII FTP Server, MPG publications list, university publications list, working group publication list, Fachbeirat, VG Wort

BibTeX Entry:

@INPROCEEDINGS{Akata2016d,
AUTHOR = {Hendricks, Lisa Anne and Akata, Zeynep and Rohrbach, Marcus and Donahue, Jeff and Schiele, Bernt and Darrell, Trevor},
TITLE = {Generating Visual Explanations},
BOOKTITLE = {The 14th European Conference on Computer Vision (ECCV)},
PUBLISHER = {Springer},
YEAR = {2016},
ADDRESS = {Amsterdam, The Netherlands},
}

Entry last modified by Zeynep Akata, 07/26/2016

Edit History

	Editor(s) Zeynep Akata	Created 07/26/2016 16:33:03

Revision	Editor	Edit Date
1.	Zeynep Akata	07/26/2016 04:33:58 PM
0.	Zeynep Akata	07/26/2016 04:33:03 PM