
The Role of Information Theory in Asthma Prediction
Elaine Peng
20/04/2026
Asthma is a chronic condition whose early prediction is often hindered by complex interactions between genetic and environmental factors. Traditional feature selection methods usually rely on assumptions that may not fully capture the complexity of these interactions. This study investigates whether Information Gain (IG), a non-parametric, entropy-based metric, can improve asthma prediction accuracy compared with traditional statistical methods. It was hypothesized that IG would outperform correlation, ANOVA, and logistic regression by more effectively capturing the complex relationships involved in asthma development. Within a shared prediction framework, four feature selection protocols were compared: correlation matrices, ANOVA F-tests, multivariable logistic regression, and Information Gain. The models using ANOVA and logistic regression largely matched baseline performance (F1: 0.83/0.84, AUC: 0.90). In contrast, the Information Gain model achieved an F1 score of 0.87 and an AUC of 0.93. These results suggest that, within this experimental framework, entropy-based feature selection identifies informative predictors in complex biological datasets and may offer a more effective feature selection approach for asthma prediction.
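
As a minimal illustration of the entropy-based metric described above, the sketch below computes Information Gain, IG(Y; X) = H(Y) - H(Y | X), for discrete features and ranks them. It is not the study's implementation: the feature names (`smoker_in_home`, `pet_owner`) and the toy data are hypothetical, and continuous predictors would first need to be discretized or scored with a mutual information estimator.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(Y), in bits, of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for a discrete feature X."""
    n = len(labels)
    # Group the outcome labels by the value the feature takes.
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # H(Y | X): entropy within each group, weighted by group size.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data (not from the study): 1 = asthma, 0 = no asthma.
asthma = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "smoker_in_home": [1, 1, 1, 0, 0, 0, 0, 1],  # partially informative
    "pet_owner": [1, 0, 1, 0, 1, 0, 1, 0],       # independent of the outcome
}

# Score every feature and rank from most to least informative.
scores = {name: information_gain(values, asthma) for name, values in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.4f} bits")
```

Because IG is computed from label distributions alone, it makes no assumption that the relationship between a predictor and the outcome is linear, which is the property the hypothesis relies on.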