WHO describes antimicrobial resistance (AMR) as one of the top global public health and development threats, while the World Bank estimates its additional healthcare burden as US$ 1 trillion by 2050. One strategy to combat AMR is significantly speeding up and reducing the cost of discovering new antibiotics, particularly those existing in nature but evading all current attempts to find them. Surprisingly, computer scientists play an increasingly important role in this endeavor. Modern biotechnology allows us to amass vast amounts of data on natural antibiotics and their tiny producers. However, specialized software and algorithms are the key to unlocking this wealth of information and turning it into medically important discoveries.
In my talk, I will demonstrate how computer science methods, from classical graph and string algorithms to deep learning, help transform antibiotic discovery into a high-throughput technology and realize the promise of already collected and rapidly growing biological datasets. The particular focus will be on analyzing sequencing (strings over the DNA alphabet of {A, C, G, T} letters) and mass spectrometry data (two-dimensional arrays of floating point values). In both cases, we deal with noisy experimental data, often rely on heuristics to make the processing time reasonable, and finally, get biologically relevant findings suitable for verification by our wet-lab collaborators.