Determining the number of clusters (or regressors) that best fits a dataset is a classic problem. Recently, much attention has been drawn to Dirichlet Processes (DPs). The Dirichlet distribution is the conjugate prior to the multinomial, the natural descriptor of a priori cluster likelihood. A DP is a distribution over probability distributions whose finite-dimensional marginals are Dirichlet, and using a DP as the prior over cluster assignments gives a nonparametric Bayesian specification in which the number of clusters is selected according to its support in the data. The advantage is that the approach makes minimal assumptions about the nature and structure of the data, and is thus less likely to be led astray when such assumptions are violated; contrast this with selection based on, say, the AIC or BIC. Dirichlet process models are more colloquially referred to as Chinese Restaurant Process (CRP) models, as their marginal conditional likelihood can be elegantly illustrated by analogy to table-sharing, a reputed practice in Chinese restaurants.
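
The table-sharing analogy can be made concrete with a short simulation. The sketch below is illustrative only: the function name `sample_crp` and the concentration parameter `alpha` are assumptions introduced here, not names from the text. Each arriving customer joins an occupied table with probability proportional to the number of people already seated there, or opens a new table with probability proportional to `alpha`, so the number of tables (clusters) is not fixed in advance but grows with the data.

```python
import random


def sample_crp(n_customers, alpha, seed=None):
    """Draw one seating arrangement from a Chinese Restaurant Process.

    alpha is a hypothetical name for the concentration parameter (> 0);
    larger values make new tables more likely.
    """
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[i] = table index chosen by customer i

    for i in range(n_customers):
        # With i customers already seated, customer i joins table k with
        # probability n_k / (i + alpha) and opens a new table with
        # probability alpha / (i + alpha). rng.choices normalizes weights.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)      # open a new table
        else:
            table_sizes[k] += 1        # join an existing table
        assignments.append(k)

    return assignments, table_sizes


if __name__ == "__main__":
    assignments, table_sizes = sample_crp(100, alpha=1.0, seed=0)
    print("number of tables:", len(table_sizes))
    print("table sizes:", table_sizes)
```

This draws from the prior over partitions only. In a full DP mixture model these seating probabilities would be multiplied by the likelihood of each observation under each table's parameters, which is how the data come to inform the number of clusters.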