Because of its high rates of replication and mutation, HIV is able to escape
from drug pressure by developing drug resistance. Mutational patterns causing
resistance have been identified by several machine learning methods, but how
resistance-associated mutations accumulate is not well studied. Characterizing
this accumulation will help us understand the virus's evolutionary
process.
A new model, the mixture of mutagenetic trees, based on an EM-like
algorithm, has been proposed to tackle this problem. In order to improve
the results, we will incorporate a regularization framework into the
mixture model, which can also be used for model selection.
Furthermore, we will propose a distance measure between two models to
validate the stability of the mixture of trees and of our algorithm.