Selecting relevant features for classification is one of the most fundamental problems in machine learning. A major motivation for constructing a learning rule from a subset of the features is the interest in sparse and interpretable rules, which, it is hoped, lead to a better understanding of the underlying problem structure. While feature selection in supervised learning scenarios has been studied widely in the literature, the combined problem of unsupervised clustering and feature selection remains challenging because no class labels are available to guide the search for relevant information. A novel approach to combining clustering and feature selection is presented. It implements a wrapper strategy for feature selection, in the sense that the features are selected directly by optimizing the discriminative power of the partitioning algorithm used. On the technical side, an efficient optimization algorithm with guaranteed local convergence is presented. Experiments on real-world problems from cancer diagnostics and image analysis demonstrate that the method is able to infer both meaningful partitions and meaningful subsets of features.