Metagenome studies have retrieved vast amounts of sequence data from a variety of environments, leading to novel discoveries and insights into the uncultured microbial world. Except for very simple communities, the encountered diversity has made fragment assembly and the subsequent analysis a challenging problem. A taxonomic characterization of these sequences is required for a deeper understanding of the underlying microbial communities, but success has mostly been limited to sequences containing phylogenetic marker genes. We present PhyloPythia, a composition-based classifier which combines higher-level generic clades from a set of 340 completed genomes with sample-derived population models. Extensive analyses on synthetic and real metagenome data sets showed that PhyloPythia allows the accurate classification of most sequence fragments across all considered taxonomic ranks, even for unknown organisms. The method requires no more than 100 kb of training sequence to create accurate models of sample-specific populations and can assign fragments of at least 1 kb with high specificity. PhyloPythia has already been applied to create more detailed process-level annotations for several real metagenome data sets, such as a microbial community inhabiting the hindgut of higher termites, which is of considerable interest with respect to current efforts in biofuel generation.
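To make "composition-based" concrete, the following is a minimal sketch of one common way to turn a sequence fragment into a composition profile: normalized k-mer (oligonucleotide) frequencies. The function name `kmer_frequencies`, the choice of k = 4, and the handling of ambiguous bases are assumptions made for illustration; the abstract does not specify which features or which learning method PhyloPythia actually uses.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return normalized k-mer frequencies over the ACGT alphabet.

    Hypothetical helper for illustration only; not PhyloPythia's code.
    Windows containing ambiguous bases (e.g. N) are ignored.
    """
    seq = sequence.upper()
    # All 4**k possible k-mers, so every profile has the same keys.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k in the fragment.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Normalize only over windows made of unambiguous bases.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


if __name__ == "__main__":
    profile = kmer_frequencies("ACGTACGTTTGACCA", k=2)
    # Show the three most frequent dinucleotides in this toy fragment.
    for kmer, freq in sorted(profile.items(), key=lambda kv: -kv[1])[:3]:
        print(kmer, round(freq, 3))
```

A profile of this kind could serve as the input vector for any standard classifier trained on fragments of known taxonomic origin. Short fragments yield sparse, noisy profiles, which is one plausible reason a minimum fragment length matters for specificity.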
In the second part of the talk, I will present the group's research on influenza evolution and discuss preliminary results. Endemic influenza is estimated to cause 100,000 deaths each year, in particular among young children and the elderly. Immunity can be acquired through either infection or vaccination, but it is not permanent because the virus evolves rapidly. Accordingly, the composition of influenza vaccines is updated annually, following the recommendations of the World Health Organization. Although a successful match to the dominant circulating strain is achieved in the majority of cases, there is room for improvement. Using statistical modeling and theoretical simulations with a stochastic epidemiological model, we are investigating the short-term evolutionary dynamics of the virus and determining how much information various genotype-related properties carry for vaccine strain selection.