Campus Event Calendar

Event Entry

What and Who

Automatically Detecting and Mitigating Issues in Program Analyzers

Numair Mansur
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Defense
SWS  
AG Audience
English

Date, Time and Location

Thursday, 30 March 2023
15:30
60 Minutes
G26
111
Saarbrücken

Abstract

In recent years, the formal methods community has made significant progress towards the development of industrial-strength static analysis tools that can check properties of real-world production code. Such tools can help developers detect potential bugs and security vulnerabilities in critical software before deployment. While the potential benefits of static analysis tools are clear, their usability and effectiveness in mainstream software development workflows often come into question, which can prevent software developers from using these tools to their full potential. In this dissertation, we focus on two major challenges that can prevent these tools from being incorporated into software development workflows.

The first challenge is unintentional unsoundness. Static program analyzers are complicated tools, implementing sophisticated algorithms and performance heuristics. This complexity makes them highly susceptible to undetected, unintentional soundness issues. Soundness issues in program analyzers can cause false negatives and have disastrous consequences, e.g., when analyzing safety-critical software. In this dissertation, we present novel techniques to detect unintentional unsoundness bugs in two foundational program analysis tools, namely SMT solvers and Datalog engines. These tools are used extensively by the formal methods community, for instance, in software verification, systematic testing, and program synthesis. We implemented these techniques as easy-to-use open-source tools that are publicly available on GitHub. With the proposed techniques, we were able to detect more than 55 unique and confirmed critical soundness bugs in popular and widely used SMT solvers and Datalog engines in only a few months of testing.
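To make the idea of detecting soundness bugs concrete, here is a minimal, self-contained sketch of differential testing, one common way to expose such bugs. Two deliberately simple propositional "solvers" stand in for real SMT solvers; every name and the injected bug are illustrative, not the dissertation's actual tools.

```python
import itertools

def brute_force_sat(formula, num_vars):
    """Reference solver: exhaustively try all variable assignments."""
    return any(formula(*bits)
               for bits in itertools.product([False, True], repeat=num_vars))

def buggy_sat(formula, num_vars):
    """'Solver under test' with an injected soundness bug: it only
    samples assignments in which all variables take the same value,
    so it can wrongly report satisfiable formulas as unsatisfiable."""
    return any(formula(*([b] * num_vars)) for b in (False, True))

def differential_test(formulas, num_vars):
    """Return the indices of formulas on which the two solvers disagree.
    Any disagreement witnesses a bug in at least one of the solvers."""
    return [i for i, f in enumerate(formulas)
            if brute_force_sat(f, num_vars) != buggy_sat(f, num_vars)]

formulas = [
    lambda a, b: a and not b,   # satisfiable, but buggy_sat misses it
    lambda a, b: a and b,       # satisfiable, found by both
    lambda a, b: a and not a,   # unsatisfiable, both agree
]
disagreements = differential_test(formulas, 2)  # → [0]
```

Real soundness fuzzers apply the same cross-checking principle to industrial solvers (e.g., comparing several SMT solvers on mutated seed formulas), where a sat/unsat disagreement flags a critical bug.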

The second challenge is finding the right balance between soundness, precision, and performance. In an ideal world, a static analyzer should be as precise as possible while maintaining soundness and being sufficiently fast. However, to overcome undecidability and performance issues, these tools have to employ a variety of techniques to be practical, for example, compromising on the soundness of the analysis or approximating code behavior. Moreover, these tools typically do not scale to large industrial code bases containing millions of lines of code. All of these factors make it extremely challenging to get the most out of these analyzers and integrate them into everyday development activities, especially for average software development teams with little to no knowledge of advanced static analysis techniques. In this dissertation, we present an approach to automatically tailor a static analyzer to the code under analysis and any given resource constraints. We implemented our technique as an open-source framework, which is publicly available on GitHub. Our second contribution in this challenge area is a technique to horizontally scale analysis tools in cloud-based static analysis platforms by splitting the input to the analyzer into partitions and analyzing the partitions independently. The technique was developed in collaboration with Amazon Web Services and is now being used in production in their CodeGuru service.
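The partitioning idea described above can be sketched as follows. This is a hypothetical illustration, not the production system: the input code base is split into fixed-size partitions, each partition is analyzed by an independent analyzer invocation (which in a cloud setting would run on its own worker), and the findings are merged. The toy "analyzer" simply flags lines containing `eval(`.

```python
def partition(files, size):
    """Split the list of input files into chunks of at most `size`."""
    return [files[i:i + size] for i in range(0, len(files), size)]

def analyze(files_with_contents):
    """Toy per-partition analysis: report (file, line_no) of calls to eval()."""
    findings = []
    for name, text in files_with_contents:
        for line_no, line in enumerate(text.splitlines(), start=1):
            if "eval(" in line:
                findings.append((name, line_no))
    return findings

def scaled_analysis(files_with_contents, partition_size):
    """Analyze each partition independently and merge the results.
    Because partitions share no state, the per-partition calls could
    run in parallel on separate machines."""
    results = []
    for part in partition(files_with_contents, partition_size):
        results.extend(analyze(part))
    return results

codebase = [
    ("a.py", "x = 1\ny = eval(input())\n"),
    ("b.py", "print('ok')\n"),
    ("c.py", "z = eval('2+2')\n"),
]
findings = scaled_analysis(codebase, partition_size=2)  # → [("a.py", 2), ("c.py", 1)]
```

The key design question in a real system, which the toy version sidesteps, is choosing partitions so that an analysis needing cross-file context still produces useful results on each partition in isolation.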

Contact

Gretchen Gravelle
+49 681 9303 9102

Virtual Meeting Details

Zoom

Gretchen Gravelle, 09/05/2023 11:45 -- Created document.