New for: D1, D2, D3, INET, D4, D5, D6
Recently, the mechanistic interpretation of this reasoning process has gained significant attention. This line of work is concerned with understanding the *internal reasoning process* of a network, including what information particular neurons respond to and how these neurons are organized into larger circuits. In this talk, I will (1) gently introduce the topic of mechanistic interpretability in machine learning, (2) show how to discover mechanistic circuits within a neural network, (3) discuss the relevance of mechanistic interpretability in real-world applications, and (4) discuss what is still missing in the field.