The first challenge is unintentional unsoundness. Static program analyzers are complicated tools, implementing sophisticated algorithms and performance heuristics. This complexity makes them highly susceptible to undetected unintentional soundness issues. Soundness issues in program analyzers can cause false negatives and have disastrous consequences, e.g. when analysing safety-critical software. In this dissertation, we present novel techniques to detect unintentional unsoundness bugs in two foundational program analysis tools, namely SMT solvers and Datalog engines. These tools are used extensively by the formal methods community, for instance, in software verification, systematic testing, and program synthesis. We implemented these techniques as easy-to-use open source tools that are publicly available on Github. With the proposed techniques, we were able to detect more than 55 unique and confirmed critical soundness bugs in popular and widely used SMT solvers and Datalog engines in only a few months of testing.
The second challenge is finding the right balance between soundness, precision, and performance. In an ideal world, a static analyzer should be as precise as possible while maintaining soundness and being sufficiently fast. However, to overcome undecidability and performance issues, these tools have to employ a variety of techniques to be practical, for example, compromising on the soundness of the analysis or approximating code behaviour. Moreover, these tools typically don’t scale to large industrial code bases containing millions of lines of code. All of these factors make it extremely challenging to get the most out of these analyzers and integrate them into everyday development activities, especially for average software development teams with little to no knowledge or understanding of advanced static analysis techniques. In this dissertation we present an approach to automatically tailor a static analyzer to the code under analysis and any given resource constraints. We implemented our technique as an open source framework, which is publicly available on Github. The second contribution of this dissertation in this challenge area is a technique to horizontally scale analysis tools in cloud-based static analysis platforms by splitting the input to the analyzer into partitions and analyzing the partitions independently. The technique was developed in collaboration with Amazon Web Services and is now being used in production in their CodeGuru service.