One-Click Regression Benchmarking with SHAP Explainability: An Integrated Python Pipeline for Linear, Regularized, and Tree-Boosting Models
Keywords:
regression benchmarking, model comparison, explainable AI, machine learning pipeline

Abstract
Applied researchers often need a fast, reproducible way to (a) compare multiple regression algorithms under a consistent preprocessing and evaluation protocol and (b) interpret model behavior beyond scalar accuracy metrics. This paper presents a turnkey Python pipeline that benchmarks five widely used regressors (Ordinary Least Squares, Ridge, Random Forests, XGBoost, and LightGBM) while natively integrating SHAP-based explainability. The system accepts mixed-type datasets, performs robust preprocessing (median imputation and standardization for numeric predictors; most-frequent imputation and one-hot encoding for categorical predictors), and evaluates models on a hold-out set using R², MSE, RMSE, and MAE. Results are exported to a clean, analysis-ready Excel workbook to facilitate immediate reuse in empirical reports. To move beyond aggregate metrics, the pipeline automatically generates SHAP global importance summaries (bar and beeswarm plots) and feature-dependence plots with interaction highlighting, providing multi-level insight into main effects and potential interactions. The implementation is designed for portability and minimal configuration: users specify a data file and target column, and optional flags control the test split, random seed, and number of visualizations. When no data are provided, a synthetic mixed-type dataset is generated to demonstrate the full workflow end-to-end. By combining standardized benchmarking with model-agnostic interpretability, the proposed tool lowers the barrier to rigorous, transparent model comparison and accelerates the translation of machine-learning methods into substantive research across domains.
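As a concrete illustration of the benchmarking protocol summarized above, the sketch below assembles a comparable workflow with scikit-learn, XGBoost, and LightGBM. It is a minimal sketch under stated assumptions: the function name run_benchmark, the default hyperparameters, and the output file name are illustrative, not the released implementation.

```python
# Illustrative sketch of the benchmarking protocol (not the paper's exact code).
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor


def run_benchmark(df: pd.DataFrame, target: str,
                  test_size: float = 0.2, seed: int = 42) -> pd.DataFrame:
    """Fit several regressors under one preprocessing protocol; return hold-out metrics."""
    X, y = df.drop(columns=[target]), df[target]
    num_cols = X.select_dtypes(include="number").columns
    cat_cols = X.columns.difference(num_cols)

    # Numeric: median imputation + standardization; categorical: mode imputation + one-hot.
    preprocess = ColumnTransformer([
        ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                          ("scale", StandardScaler())]), num_cols),
        ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                          ("onehot", OneHotEncoder(handle_unknown="ignore"))]), cat_cols),
    ])

    models = {
        "OLS": LinearRegression(),
        "Ridge": Ridge(alpha=1.0),
        "RandomForest": RandomForestRegressor(random_state=seed),
        "XGBoost": XGBRegressor(random_state=seed),
        "LightGBM": LGBMRegressor(random_state=seed),
    }

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=test_size, random_state=seed)
    rows = []
    for name, model in models.items():
        pipe = Pipeline([("prep", preprocess), ("model", model)]).fit(X_tr, y_tr)
        pred = pipe.predict(X_te)
        mse = mean_squared_error(y_te, pred)
        rows.append({"model": name,
                     "R2": r2_score(y_te, pred),
                     "MSE": mse,
                     "RMSE": np.sqrt(mse),
                     "MAE": mean_absolute_error(y_te, pred)})

    results = pd.DataFrame(rows).sort_values("R2", ascending=False)
    results.to_excel("benchmark_results.xlsx", index=False)  # analysis-ready workbook
    return results
```

The explainability step described in the abstract could follow the same pattern: after fitting, a SHAP explainer (e.g., shap.TreeExplainer for the tree-based models) would be applied to the preprocessed hold-out features to produce the bar, beeswarm, and dependence plots.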

