Regional flood frequency analysis using generalized additive models, random forest, and extreme gradient boosting for South-East Australia

Xiao Pan, Gokhan Yildirim, Ataur Rahman, Taha B.M.J. Ouarda

Research output: Contribution to journal › Article › peer-review

Abstract

This study develops a new regional flood frequency analysis (RFFA) model using Generalized Additive Models (GAM), Random Forest (RF), and XGBoost (XG) within the Peaks Over Threshold (POT) modelling framework. These machine learning techniques attempt to overcome the limitations of traditional linear regression-based RFFA models by better capturing the complexity of the non-linear rainfall-runoff process. Analysing data from 145 catchments in south-east Australia, we assess the ability of each of the three models to predict flood quantiles across various return periods. GAM is found to be the most accurate, with a median absolute relative error of 33%, compared to 37% for RF and 40% for XG. Spatial analysis shows GAM's robustness: it significantly reduces errors in regions with high stream densities, where the RF and XG models tend to overestimate flood quantiles. This research demonstrates that integrating advanced machine learning methods within the POT framework significantly enhances the accuracy of flood quantile estimation, supporting more resilient flood risk management and infrastructure planning in flood-affected regions. The findings of this study will assist in upgrading Australian Rainfall and Runoff (ARR), the national guideline. Unlike prior POT-RFFA studies based on linear/regularised regressions (and AM/GEV-focused GAM/ML work), we provide the first comprehensive comparison of GAM, RF, and XGBoost in a POT framework across 12EY–10ARI, with consistent cross-validation and spatial error diagnostics for SE Australia.
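
The abstract reports accuracy as a median absolute relative error but does not give the formula. A minimal sketch, assuming the common definition (the median over catchments of |predicted − observed| / observed, expressed as a percentage), might look like the following. The function name and the sample quantiles are hypothetical and are not taken from the study.

```python
from statistics import median


def median_absolute_relative_error(observed, predicted):
    """Median of 100 * |predicted - observed| / observed over all sites.

    Assumed definition; the paper itself may define the metric differently.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one pair of values is required")
    if any(o <= 0 for o in observed):
        raise ValueError("observed flood quantiles must be positive")
    errors = [100.0 * abs(p - o) / o for o, p in zip(observed, predicted)]
    return median(errors)


# Hypothetical flood quantiles (m^3/s) for three catchments.
observed = [100.0, 200.0, 400.0]
predicted = [130.0, 180.0, 600.0]
print(median_absolute_relative_error(observed, predicted))
```

Using the median rather than the mean keeps a few catchments with very large relative errors from dominating the summary.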

Original language: English
Article number: 67
Number of pages: 22
Journal: Environmental Earth Sciences
Volume: 85
Issue number: 2
DOIs
Publication status: Published - Jan 2026

Keywords

  • ARR
  • Generalized additive model (GAM)
  • Machine learning
  • Peaks over threshold (POT)
  • Random forest
  • XGBoost
