Campus Event Calendar

Event Entry

What and Who

Scanning Trojaned Models Using Out-of-Distribution Samples

Bahar Dibaei Nia
Sharif University of Technology
PhD Application Talk
AG 1, AG 2, AG 3, INET, AG 4, AG 5, D6, SWS, RG1, MMCI  
AG Audience
English

Date, Time and Location

Tuesday, 4 February 2025
13:30
30 Minutes
Virtual talk
Zoom

Abstract

Detecting trojans (backdoors) in deep neural networks is critical due to their real-world implications. Current methods often rely on assumptions about the attack type and struggle with trojaned classifiers trained using adversarial techniques. To address these challenges, we introduce TRODO (TROjan scanning by Detection of adversarial shifts in Out-of-distribution samples), a novel attack-agnostic method that identifies "blind spots" where trojaned classifiers misclassify out-of-distribution (OOD) samples as in-distribution (ID). By adversarially shifting OOD samples toward ID, TRODO detects trojans without prior knowledge of attack types or reliance on training data. The method is robust, adaptable, and effective across diverse scenarios, including adversarially trained classifiers, making it a promising approach to trojan scanning.
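
The core idea in the abstract, adversarially shifting OOD samples toward ID and measuring how easily a classifier gives in, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the talk: the classifier is a toy linear-softmax model in NumPy, the "ID-ness" score is the maximum softmax probability, the shift is a PGD-style sign-gradient step under an L-infinity budget, and the function names (`id_score`, `shift_toward_id`, `shift_signature`) are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def id_score(W, b, x):
    # Assumed stand-in for "how in-distribution does the model think x is":
    # the maximum softmax probability of a linear classifier x @ W + b.
    return softmax(x @ W + b).max(axis=1)

def shift_toward_id(W, b, x, eps=0.5, steps=10):
    # PGD-style ascent on log p[top class], kept within an L-inf ball
    # of radius eps around the original samples.
    step = eps / steps
    x_adv = x.copy()
    for _ in range(steps):
        p = softmax(x_adv @ W + b)
        onehot = np.eye(W.shape[1])[p.argmax(axis=1)]
        # Closed-form gradient of log p[top] w.r.t. x for a linear-softmax model.
        grad = (onehot - p) @ W.T
        x_adv = x_adv + step * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

def shift_signature(W, b, x_ood, eps=0.5, steps=10):
    # Mean increase in ID score after the adversarial shift. A larger value
    # means the OOD samples were easier to push toward ID.
    before = id_score(W, b, x_ood)
    after = id_score(W, b, shift_toward_id(W, b, x_ood, eps, steps))
    return float(np.mean(after - before))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 2)) * 0.1   # toy 2-class linear classifier
    b = np.zeros(2)
    x_ood = rng.normal(size=(8, 4))     # stand-in for OOD samples
    print(shift_signature(W, b, x_ood))
```

In a scanner built along these lines, the signature of a suspect model would be compared against some reference or threshold. The abstract does not say how that decision is made, so the sketch stops at computing the signature.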

Contact

Ina Geisler
+49 681 9325 1802

Virtual Meeting Details

Zoom

Ina Geisler, 01/27/2025 09:41 -- Created document.