Patrick Pérez is CEO of Kyutai, a non-profit open-science AI lab based in Paris. Prior to this, Patrick was at Valeo as VP of AI and Scientific Director of valeo.ai (2018-2023), and with Technicolor (2009-2018), Inria (1993-2000, 2004-2009) and Microsoft Research Cambridge (2000-2004) as a research scientist. Throughout his academic and corporate research journey, Patrick has explored various fields of application (medical, cinema and media, automotive) with all sorts of sensors (cameras, MRI, microphones, radar, laser scanners), and has built and led several long-term research teams. His research interests lie in reliable multimodal AI for the benefit of all.
Modern AI is reshaping our digital lives and many industries (including century-old ones). This comes with a variety of key challenges such as scalability, efficiency, reliability, safety and interpretability. I will touch upon some of these challenges, based on research projects across different fields of application, from content creation to autonomous driving. How to tap into raw unannotated data, how to leverage foundation models or world knowledge, and how to predict and improve the robustness of models for the open world are some of the questions that will be discussed, in the context of visual models. I will conclude with some research directions toward more capable and practical foundation models.