Our analysis centers on engaging LLMs in two specific types of cognitive tasks: first, syntactically-rich (semantically-poor) tasks such as recognizing formal grammars, and second, semantically-rich (syntactically-poor) tasks such as answering factual knowledge questions about real-world entities. Using carefully designed experimental frameworks, we attempt to answer the following foundational questions:
(a) how can we estimate what latent skills and knowledge a (pre-trained) LLM possesses?
(b) (how) can we distinguish whether an LLM has learnt its training data by rote or with understanding?
(c) what is the minimum amount of training data (and cost) needed for a (pre-trained) LLM to acquire a new skill or new knowledge?
(d) when solving a task, is training on task examples better than, worse than, or comparable to providing them as in-context demonstrations?
I will present some initial empirical results from experimenting with a number of large open-source language models and argue that our findings have important implications for the privacy of training data (including the potential for memorization), the reliability of generated outputs (including the potential for hallucinations), and the robustness of LLM-based applications (including our podcast assistant for science communication).