Tree-based models using random grid search optimization for disease classification based on environmental factors : a case study on asthma hospitalizations

Prathayne Nanthakumaran, Liwan Liyanage

Research output: Chapter in Book / Conference Paper, Conference Paper, peer-review

1 Citation (Scopus)

Abstract

Understanding exposure to the environmental factors that aggravate the global disease burden can aid in mitigating it. Generalized linear models and generalized additive models are commonly used to predict disease burden, whereas tree-based models are underused. The objective of this paper is to evaluate the performance of different tree-based models, namely decision tree, random forest, gradient boosted trees and stochastic gradient boosted trees, in predicting asthma attacks based on short-term exposure to environmental factors, and to examine the environmental factors that trigger asthma attacks. A sample of patients from different parts of Victoria during 2013-2015 was considered. During the study period, the study area had reasonably good air quality and a relatively humid environment. The tree-based models were tuned using random grid search optimization with bootstrapping to address over-fitting. All models considered performed well in predicting asthma attacks, with an area under the receiver operating characteristic curve (ROC AUC) above 0.82. All the gradient boosted trees (accuracy = 76%; recall = 63%; F2-score = 64%) showed better overall prediction, whereas the decision tree (accuracy = 71%; recall = 75%; F2-score = 71%) outperformed the other models in identifying the positive cases. The tree-based models revealed that O3 exposure consistently influences asthma. Further, the decision tree indicated that asthma was influenced by O3 exposure < 13 ppb, or by higher O3 exposure (>= 13 ppb) combined with SO2 exposure < 0.5 ppb and maximum wind speed > 5.4 km/hr. In addition, relative humidity and exposure to CO were detected by the other tree-based models as relevant predictors triggering asthma attacks.
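
The abstract reports F2-scores alongside recall but does not spell out the metric. As a minimal sketch, assuming the standard F-beta definition (beta = 2 weights recall more heavily than precision), the score can be computed from confusion-matrix counts as follows. The function name `f_beta` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score from confusion-matrix counts.

    Assumes the standard definition:
        F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    where P is precision and R is recall. With beta = 2 this is the F2-score.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only (not results from the paper).
tp, fp, fn = 75, 50, 25
print(f"F1 = {f_beta(tp, fp, fn, beta=1.0):.3f}")
print(f"F2 = {f_beta(tp, fp, fn, beta=2.0):.3f}")
```

Because missed asthma cases (false negatives) are typically costlier than false alarms in this setting, a recall-weighted score such as F2 is a natural companion to accuracy when comparing the models.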

Original language: English
Title of host publication: Proceedings of 2021 IEEE 2nd International Conference on Pattern Recognition and Machine Learning (PRML 2021), Chengdu, China, July 16-18, 2021
Publisher: IEEE
Pages: 136-142
Number of pages: 7
ISBN (Print): 9781665443838
Publication status: Published - 16 Jul 2021
Event: International Conference on Pattern Recognition and Machine Learning
Duration: 16 Jul 2021 → …

Bibliographical note

Publisher Copyright:
© 2021 IEEE.
