Complex statistical models are increasingly used, or considered for use, in high-stakes decision-making pipelines in domains such as financial services, health care, criminal justice and human services. These models are often investigated as possible improvements over more classical tools such as regression models or human judgement. While the modeling approach may be new, the practice of using some form of risk assessment to inform decisions is not. When determining whether a new model should be adopted, it is essential to compare the proposed model to the existing approach across a range of task-relevant accuracy and fairness metrics. In this talk I will describe a subgroup analysis approach for characterizing how models differ in terms of fairness-related quantities such as racial or gender disparities. I will also discuss an ongoing collaboration with the Allegheny County Department of Health and Human Services on developing and implementing a risk assessment tool for use in child welfare referral screening.