The goal of the field of deep learning-based image synthesis is to achieve perfect visual realism and to let users precisely control the content of synthetic images. Until recently, generative adversarial networks (GANs) were the most popular image synthesis framework due to their unrivaled image quality. Yet there is still much room for improvement in both synthesis quality and precise control over the image content. This thesis therefore introduces methods that improve both the synthesis quality and the controllability of GANs. Specifically, we address the following subproblems. First, we propose segmentation-based discriminator networks and segmentation-based regularizations for GANs. This new approach improves the quality of conditional and unconditional image synthesis. Second, we show that the approach is naturally well suited for semantic image synthesis. Centered around the idea of segmentation-based discriminators, this thesis introduces techniques that strongly improve image quality and multi-modality. Additionally, the methods result in better modeling of long-tailed data and enable new possibilities for global and local image editing. Finally, the improvements in multi-modality and image editing in semantic image synthesis open the door to controlling the image content via the latent space of the GAN generator. This thesis therefore introduces a method for finding interpretable directions in the latent space of semantic image synthesis GANs, enabling an additional form of control over the image content alongside the semantic layouts.