Kaggle Competition: A Biased Metric and the Unexpected Power of XGBoost

2025-02-23

The author entered a Kaggle competition to predict patients' chances of survival after a bone marrow transplant. The competition is scored with a stratified concordance score, designed to discourage overly disparate predictions across racial groups. The metric has a flaw, however: improving the score for one group does not always improve the overall score, and can even lower it. Working with XGBoost, the author found that simple ensembles of decision trees outperformed more complex statistical models, and explored the differences between the statistical and machine learning approaches. Finally, the author found that adjusting the scale parameter of the accelerated failure time (AFT) distribution significantly affected model accuracy, and closed with several open questions about improving the model.
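To make the metric's flaw concrete, here is a minimal sketch. It assumes the stratified score is the mean of the per-group concordance indices minus their standard deviation; the summary above does not spell out the formula, so treat that aggregation as an assumption. The function names and the group scores are hypothetical, and the concordance index is a simplified version of Harrell's C-index that skips tied times.

```python
from itertools import combinations
from statistics import mean, pstdev


def concordance_index(times, events, risks):
    """Simplified Harrell's C-index: share of comparable pairs ranked correctly."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times, and pairs whose earlier observation is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher predicted risk failed first
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    return concordant / comparable


def stratified_score(group_scores):
    """Assumed aggregation: mean of per-group C-indices minus their std."""
    return mean(group_scores) - pstdev(group_scores)


# Hypothetical per-group C-indices before and after improving one group only.
before = [0.70, 0.70, 0.70]
after = [0.70, 0.70, 0.76]

print(stratified_score(before))
print(stratified_score(after))
```

Under this assumed formula, raising the third group's concordance from 0.70 to 0.76 lifts the mean to 0.72 but adds a standard deviation of about 0.028, so the overall score falls from 0.70 to roughly 0.69. That is the behavior described above: a gain for one group can lower the total.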
