Title: | From CAD models to neural networks: Learning mid-level image representations for visual recognition |
---|---|
Speaker: | Josef Sivic |
coming from: | Inria Paris, Departement d'Informatique, Ecole Normale Superieure |
Speakers Bio: | Josef Sivic received a degree from the Czech Technical University, Prague, in 2002 and a PhD from the University of Oxford in |
Event Type: | Talk |
Visibility: | D2, D4, MMCI |
Level: | AG Audience |
Language: | English |
Date: | Monday, 28 July 2014 |
---|---|
Time: | 13:00 |
Duration: | 60 Minutes |
Location: | Saarbrücken |
Building: | E1 4 |
Room: | 024 |
In this talk, I will describe our recent work on developing learnable mid-level representations for instance-level and category-level visual recognition. First, I will review our recently developed representation of 3D scenes where an entire architectural site is summarized by a set of scene parts learnt in a discriminative fashion from rendered views of its 3D model. We demonstrate recognizing 3D scene instances in challenging historical and non-photographic imagery, such as paintings and drawings, where standard local invariant features fail. Second, using a similar approach we show that an object category can be non-parametrically modeled by a large collection of 3D CAD models explicitly representing the variation in style and viewpoint. Object detection in images is posed as a type of 2D-to-3D alignment accomplished by matching mid-level object parts learnt from synthesized views. We demonstrate detection and alignment of "chairs" in challenging Pascal VOC 2012 images using a reference library of 1,394 CAD models downloaded from the Internet. Finally, we investigate learning and transferring mid-level image representations using convolutional neural networks. We demonstrate that an image representation learnt on a task with a large amount of fully labelled imagery can significantly improve visual recognition performance on related tasks where supervision is scarce. The proposed model achieves state-of-the-art results on the Pascal VOC image classification and action recognition challenge.
The talk is based on recent papers:
- M. Aubry, B. Russell and J. Sivic, Painting-to-3D Model Alignment Via Discriminative Visual Elements, ACM Transactions on Graphics, 2014
- M. Aubry, D. Maturana, A. Efros, B. Russell and J. Sivic, Seeing 3D chairs: exemplar part-based 2D-3D alignment using a large dataset of CAD models, CVPR 2014
- M. Oquab, L. Bottou, I. Laptev and J. Sivic, Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks, CVPR 2014 |
Name(s): | Michael Stark |
---|
Video Broadcast: | No | To Location: |
---|