Machine Learning (ML) models are now integral to many critical systems, from self-driving cars to aviation, where their reliability and safety are crucial. Validating that these models perform their intended functions without failure is essential to prevent catastrophic outcomes. This thesis introduces novel tools and approaches, inspired by software testing, for specifying and fuzz-testing ML models for functional correctness. By leveraging fuzzing and metamorphic testing techniques, we address the challenges of generating test inputs and defining test oracles for ML models. We begin with sequential decision-making problems, developing techniques to test action policies for reliability. Our PI-fuzz framework identifies bugs by generating diverse test states and applying test oracles based on metamorphic relations. We then formalize metamorphic relations as hyperproperties and show that they generalize across diverse domains and ML models. Building on this formalization, we develop NOMOS, a declarative, domain-agnostic specification language for expressing and testing these hyperproperties, and show that NOMOS effectively identifies property violations across a variety of ML domains. Finally, we extend NOMOS to support code translation models and evaluate several state-of-the-art models against a range of hyperproperties, uncovering numerous violations. Overall, this work contributes a comprehensive framework for assessing the reliability and safety of ML models across applications.
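
To give a concrete flavor of the metamorphic test oracles discussed above, the sketch below pairs a simple fuzzing loop with one such relation for an action policy: a state transformation assumed to preserve the optimal action should not change the policy's decision. All names (`toy_policy`, `irrelevant_noise`) are hypothetical illustrations under that assumption, not the actual PI-fuzz or NOMOS interface.

```python
# Minimal sketch of a metamorphic test oracle for an action policy.
# The policy, the transformation, and all names are illustrative
# assumptions, not the actual PI-fuzz or NOMOS API.
import random

def toy_policy(state):
    # Stand-in for an ML action policy: decides from the first two
    # state features only; the third feature is (by assumption) unused.
    return 0 if state[0] + state[1] > 0 else 1

def irrelevant_noise(state):
    # Metamorphic transformation assumed to preserve the optimal action:
    # perturb only the feature the policy should not depend on.
    return (state[0], state[1], state[2] + random.uniform(-1.0, 1.0))

def metamorphic_oracle(policy, state, transform):
    # The relation: policy(state) == policy(transform(state)).
    # A mismatch is flagged as a potential bug.
    return policy(state) == policy(transform(state))

if __name__ == "__main__":
    random.seed(0)
    # Fuzzing loop: generate diverse random states and check the relation.
    violations = [
        s for s in (
            tuple(random.uniform(-1, 1) for _ in range(3))
            for _ in range(1000)
        )
        if not metamorphic_oracle(toy_policy, s, irrelevant_noise)
    ]
    print(f"{len(violations)} metamorphic violations found")
```

In this toy setting the relation holds by construction; in practice, the same structure (input generator plus relation-based oracle) is what allows property violations to be surfaced without a ground-truth label for any individual input.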