In my talk, I will discuss recent results from our group on situations in which the "future" data distribution differs from the distribution of the available training data. In particular, I will highlight new theoretical guarantees we obtained for the setting of multi-task learning, and I will show how such theoretical results can give rise to new and principled machine learning algorithms.