People are often a central element of visual scenes, particularly in real-world street scenes. Thus it has been a long-standing goal in Computer Vision to develop methods aiming at analyzing humans in visual data. Due to the complexity of real-world scenes, visual understanding of people remains challenging for machine perception. In this defense, I will discuss a number of diverse tasks that aim to enable vision systems to analyze people in realistic images and videos. In particular, I will present several novel models and algorithms which push the boundary of state-of-the-arts and result in superior performance.