Opportunities and Challenges in Unsupervised Learning: The Case of Aqueous Electrolyte Solutions

Alex Rodriguez
2025-01-01

Abstract

Machine learning has emerged as a powerful tool in atomistic simulations, enabling the identification of complex patterns in molecular systems while limiting human intervention and bias. However, the practical implementation of these methods presents significant technical challenges, particularly in the selection of hyperparameters and in the physical interpretability of machine-learned descriptors. In this work, we systematically investigate these challenges by applying an unsupervised learning protocol to a fundamental problem in physical chemistry, namely, how ions perturb the local structure of water. Using Smooth Overlap of Atomic Positions (SOAP) descriptors, we demonstrate how the intrinsic dimension (ID) serves as a guide for selecting hyperparameters and interpreting structural complexity. Furthermore, we construct a high-dimensional free-energy landscape encompassing all water environments surrounding different ions. This analysis reveals how the physical properties of ions are reflected in their hydration shells, shaping the landscape through specific connections between different minima. Our findings highlight the difficulty of balancing algorithmic automation with the need for physical and chemical intuition, particularly in the construction of meaningful descriptors and in the interpretation of final results. By critically assessing the methodological hurdles associated with unsupervised learning, we provide a roadmap for researchers looking to harness these techniques to study electrolyte solutions and aqueous solutions in general.
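
To make the notion of intrinsic dimension more concrete, the following is a minimal sketch of one way an ID can be estimated from a set of descriptor vectors. The abstract does not state which estimator is used, so this is an assumption: the sketch uses a two-nearest-neighbor estimator in its maximum-likelihood form, where the ratio of second to first neighbor distances is treated as Pareto distributed with exponent equal to the ID. The function name and the synthetic data (standing in for SOAP vectors) are hypothetical.

```python
import numpy as np

def twonn_intrinsic_dimension(X):
    """Estimate the intrinsic dimension of the rows of X.

    Assumption: two-nearest-neighbor estimator, maximum-likelihood form,
    d = N / sum(log(r2 / r1)), with r1 and r2 the distances from each
    point to its first and second nearest neighbors.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances via the Gram matrix.
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)    # clip small negative round-off
    np.fill_diagonal(d2, np.inf)   # exclude each point from its own neighbors
    # Two smallest distances per row: column 0 is r1, column 1 is r2.
    nearest = np.sqrt(np.partition(d2, 1, axis=1)[:, :2])
    r1, r2 = nearest[:, 0], nearest[:, 1]
    keep = r1 > 0                  # drop exact duplicates (undefined ratio)
    mu = r2[keep] / r1[keep]
    return keep.sum() / np.sum(np.log(mu))

# Hypothetical stand-in for descriptor vectors: points on a 2D plane
# embedded in a 10-dimensional space.
rng = np.random.default_rng(0)
plane = rng.uniform(size=(1000, 2))
descriptors = np.hstack([plane, np.zeros((1000, 8))])
print(twonn_intrinsic_dimension(descriptors))
```

In a hyperparameter scan, one would recompute such an estimate for each descriptor setting and compare the resulting values; how the paper itself uses the ID for this purpose is not detailed in the abstract.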

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/3117672