Campus Event Calendar

Event Entry

What and Who

Generalization bounds for rational self-supervised learning algorithms

Boaz Barak
Harvard University
INF Distinguished Lecture Series
AG 1, AG 2, AG 3, INET, AG 4, AG 5, SWS, RG1, MMCI  
MPI Audience
English

Date, Time and Location

Tuesday, 27 October 2020
16:00
60 Minutes
E1 4
024
Saarbrücken

Abstract

The generalization gap of a learning algorithm is the expected difference between its performance on the training data and its performance on fresh, unseen test samples. Modern deep learning algorithms typically have large generalization gaps, as they use more parameters than they have training samples. Moreover, the best known rigorous bounds on their generalization gap are often vacuous.
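In symbols (an illustrative formalization; the notation below is not taken from the abstract): for a classifier f trained on n labeled samples,

    \[
    \mathrm{gap}(f) \;=\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\!\big[\ell(f(x),y)\big]
    \;-\; \frac{1}{n}\sum_{i=1}^{n} \ell\big(f(x_i),y_i\big),
    \]

where the training set {(x_i, y_i)} is drawn i.i.d. from the data distribution D and ℓ is a loss such as the 0-1 classification error.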

In this talk we will see a new upper bound on the generalization gap of classifiers obtained by first using self-supervision to learn a complex representation of the (label-free) training data, and then fitting a simple (e.g., linear) classifier to the labels. Such classifiers have become increasingly popular in recent years, as they offer several practical advantages and have been shown to approach state-of-the-art results.
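A minimal sketch of this two-stage pipeline, under assumed placeholders (synthetic data instead of real images, a fixed random projection standing in for a self-supervised encoder such as SimCLR, and scikit-learn's LogisticRegression as the simple classifier); none of these choices come from the talk itself:

    # Stage 1: a frozen, label-free representation r(x); Stage 2: a simple
    # linear classifier g fit to the labels on top of r(x).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Placeholder data standing in for images; a real run would use CIFAR-10 etc.
    n, d, n_classes = 2000, 256, 10
    X = rng.normal(size=(n, d))
    teacher = rng.normal(size=(d, n_classes))
    y = (X @ teacher).argmax(axis=1)          # synthetic ground-truth labels

    # Stage 1 (self-supervision stand-in): a frozen encoder r(x).
    # Here it is a fixed random projection + ReLU.
    W = rng.normal(size=(d, 128)) / np.sqrt(d)
    def represent(x):
        return np.maximum(x @ W, 0.0)

    # Stage 2: fit the simple (linear) classifier on the representation.
    split = n // 2
    Z_train, Z_test = represent(X[:split]), represent(X[split:])
    clf = LogisticRegression(max_iter=1000).fit(Z_train, y[:split])

    # Empirical generalization gap: train accuracy minus test accuracy.
    gap = clf.score(Z_train, y[:split]) - clf.score(Z_test, y[split:])
    print(f"empirical generalization gap: {gap:.3f}")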

We show that (under the assumptions described below) the generalization gap of such classifiers tends to zero as long as the complexity of the simple classifier is asymptotically smaller than the number of training samples. We stress that our bound is independent of the complexity of the representation, which can use an arbitrarily large number of parameters. Our bound holds assuming that the learning algorithm satisfies certain noise-robustness (adding a small amount of label noise causes only a small degradation in performance) and rationality (getting the wrong label is not better than getting no label at all) properties. These conditions widely (and sometimes provably) hold across many standard architectures. We complement this result with an empirical study demonstrating that our bound is non-vacuous for many popular representation-learning-based classifiers on CIFAR-10 and ImageNet, including SimCLR, AMDIM and BigBiGAN.
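As an informal illustration of the noise-robustness condition only (not the formal definition used in the result), one could check empirically that corrupting a small fraction of training labels barely degrades clean test accuracy. A hedged sketch, continuing and reusing the synthetic setup above:

    # Flip a small fraction of the training labels, refit the simple classifier,
    # and compare clean test accuracy with and without the label noise.
    noise_rate = 0.05
    y_train_noisy = y[:split].copy()
    flip = rng.random(split) < noise_rate
    y_train_noisy[flip] = rng.integers(0, n_classes, size=int(flip.sum()))

    clf_noisy = LogisticRegression(max_iter=1000).fit(Z_train, y_train_noisy)
    drop = clf.score(Z_test, y[split:]) - clf_noisy.score(Z_test, y[split:])
    print(f"test-accuracy drop from {noise_rate:.0%} label noise: {drop:.3f}")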

The talk will not assume any specific background in machine learning, and should be accessible to a general mathematical audience. Joint work with Yamini Bansal and Gal Kaplun.

Contact

Kurt Mehlhorn
+49 681 9325 1000

Video Broadcast

Yes
Zoom Meeting, see link below
Passcode visible to logged-in users only

Christina Fries, 10/23/2020 09:00
Kurt Mehlhorn, 10/01/2020 09:19
Kurt Mehlhorn, 10/01/2020 09:18 -- Created document.