Efficient approximation of time-variant outputs from high-fidelity numerical models is essential for sustainable groundwater management in coastal aquifers. Surrogate models are increasingly used to avoid the computational cost of simulation-optimization (S-O) under heterogeneity, but their performance often degrades in high-dimensional input spaces. This study proposes an ensemble clustering framework integrated with a random forest (RF) surrogate model to optimize pumping strategies across extensive well networks. The framework uses MODFLOW and SEAWAT to generate a foundational dataset of hydraulic drawdown and saltwater intrusion (SI) distributions. Its main innovation is clustering-based dimensionality reduction, which reduces 52 physical pumping wells to 10 representative proxy wells, lowering input dimensionality while still identifying near-optimal pumping patterns. To train the RF model, targeted SEAWAT simulations were then run to generate 5200 training samples, each with 100 realizations of the hydraulic conductivity field. Results indicate that the integrated clustering-RF approach achieves 95% computational savings over traditional surrogate-numerical hybrids. This efficiency comes from the reduction in input variables via well-field classification and from focused sampling near optimal extraction patterns. The resulting scalable framework gives decision-makers a tool for managing complex, saltwater-intruded aquifer systems.
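The well-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the study's method: plain k-means on well coordinates stands in for the ensemble clustering, the well locations and pumping rates are synthetic, and the names `kmeans` and `expand_rates` are hypothetical. It also assumes every physical well in a cluster takes the rate of its proxy well, which the abstract does not state. The RF training step is omitted.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on 2-D points (a stand-in for ensemble clustering).

    Returns (labels, centroids): one cluster index per point and k centroids.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise at k distinct data points
    labels = [-1] * len(points)
    for _ in range(iters):
        # Assignment step: each well joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centroids


def expand_rates(proxy_rates, labels):
    """Map one rate per proxy well back onto every physical well (assumed rule)."""
    return [proxy_rates[lab] for lab in labels]


# Synthetic layout: 52 wells scattered over a hypothetical 5 km x 3 km field.
rng = random.Random(42)
wells = [(rng.uniform(0, 5000), rng.uniform(0, 3000)) for _ in range(52)]

# Reduce 52 decision variables to 10 proxy wells.
labels, centroids = kmeans(wells, k=10)

# A surrogate or optimizer would now work with 10 inputs instead of 52.
proxy_rates = [rng.uniform(100, 500) for _ in range(10)]  # illustrative rates
full_rates = expand_rates(proxy_rates, labels)
```

Under these assumptions, a candidate pumping strategy is a vector of 10 proxy rates, and `expand_rates` recovers the 52 per-well rates that a numerical model would need as input.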
A clustering‑based surrogate model leveraging random forest for rapid pumping optimization in saltwater‑intruded heterogeneous aquifers / Cherubini, C.. - In: ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL. - ISSN 1614-7499. - (2026), pp. 1-24.
A clustering‑based surrogate model leveraging random forest for rapid pumping optimization in saltwater‑intruded heterogeneous aquifers
Claudia Cherubini
2026-01-01


