We provide a method for the systematic identification of atypical genes within virus families. Genes that were recently acquired by horizontal gene transfer are assumed to display the statistical features of their genome of origin rather than those of the genome in which they are observed. We present a one-class support vector machine approach that detects atypical genes within a virus family on the basis of their statistical signatures.
Evaluation of the approach on simulated data demonstrates that it robustly identifies alien genes. Tests on real data and comparison with the literature confirm its value in practice. Because it ranks the genes of a virus family from atypical to typical, the algorithm provides a useful tool for identifying the most promising candidates for horizontal gene transfer.
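The idea of ranking genes from atypical to typical by their statistical signatures can be illustrated with a minimal sketch. It is not the method presented here: the signature is assumed to be a vector of relative 3-mer frequencies (the text does not specify the signature), and a plain distance to the family centroid stands in for the decision function of the one-class support vector machine. The names `signature`, `rank_atypical`, `K`, and `toy_genes`, as well as the toy sequences, are hypothetical.

```python
from collections import Counter
from itertools import product
import math

K = 3  # assumed word length; the signature definition is an assumption
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def signature(seq, k=K):
    """Relative frequency of each DNA k-mer in seq, as a fixed-length vector."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers over A, C, G, T are counted; windows with other symbols are ignored.
    total = sum(counts[m] for m in KMERS) or 1
    return [counts[m] / total for m in KMERS]


def rank_atypical(genes):
    """Return gene names sorted from most atypical to most typical.

    Atypicality is the Euclidean distance between a gene's signature and
    the mean signature of all genes (a stand-in for a one-class SVM score).
    """
    sigs = {name: signature(seq) for name, seq in genes.items()}
    n = len(sigs)
    centroid = [sum(s[j] for s in sigs.values()) / n for j in range(len(KMERS))]
    score = {name: math.dist(s, centroid) for name, s in sigs.items()}
    return sorted(score, key=score.get, reverse=True)


# Toy data (invented for illustration): three AT-rich genes and one GC-rich gene.
toy_genes = {
    "gene_a": "AATTATAT" * 6,
    "gene_b": "ATTATATA" * 6,
    "gene_c": "TTATATAA" * 6,
    "alien": "GCGGCCGC" * 6,
}

if __name__ == "__main__":
    print(rank_atypical(toy_genes))
```

On the toy data, the GC-rich sequence is ranked first because its signature lies far from the centroid of the family. In practice the centroid distance would be replaced by the trained one-class model, and the ranking would be read as a list of candidates for closer inspection rather than as a verdict.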