Can a Random Forest–based machine learning model accurately predict NBA players’ Points Per Game, and does it outperform traditional linear approaches?

Faisal Abu El Afieh
23/05/2026

This research develops a machine learning model that estimates NBA players’ Points Per Game (PPG) from a combination of statistical, categorical, and performance-based features. Importantly, the study frames the task as within-season reconstruction rather than true out-of-sample forecasting, since all predictor variables are drawn from the same season as the target. Linear Regression, Random Forest regression, and neural networks are applied to examine the predictive power of basketball statistics such as shooting efficiency, playing time, and position. With RandomForestRegressor as the final model, the study achieved superior predictive accuracy, with an R² of 0.83 and an RMSE of 3.51. The results indicate that field goal attempts, minutes played, and true shooting percentage are the most important predictors of scoring output. Feature importance rankings were derived from impurity-based measures, which should be interpreted with caution given the presence of correlated predictors. The study demonstrates how machine learning can provide meaningful insights into player performance and may be used for scouting, coaching, and fantasy analytics.
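
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how R² and RMSE are computed from actual and predicted PPG values. It uses only the Python standard library. The function names and the three sample values are hypothetical and chosen for illustration. They are not taken from the study's data or code.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target (here, PPG)."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, invented for illustration only (not from the study).
actual_ppg = [10.0, 20.0, 30.0]
predicted_ppg = [12.0, 20.0, 28.0]

print(f"RMSE: {rmse(actual_ppg, predicted_ppg):.2f}")
print(f"R^2:  {r_squared(actual_ppg, predicted_ppg):.2f}")
```

Because RMSE is expressed in the units of the target, the reported value of 3.51 can be read as a typical prediction error of about 3.5 points per game, while an R² of 0.83 means the model accounts for roughly 83% of the variance in PPG on the evaluation data.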

Wilmington, Delaware, 19801

ISSN: 3070-3875

DOI: 10.65161

The Oxford Journal of Student Scholarship (ISSN: 3070-3875) is an independent publication and is not affiliated with, endorsed by, or connected to the University of Oxford or any of its colleges, departments, or programs.

© 2025 by the Oxford Journal of Student Scholarship