Opportunities and Challenges in Unsupervised Learning: The Case of Aqueous Electrolyte Solutions

Sormani, Giulia; Rodriguez, Alex; Hassanali, Ali

doi:10.1021/acs.jctc.5c00449

Machine learning has emerged as a powerful tool in atomistic simulations, enabling the identification of complex patterns in molecular systems while limiting human intervention and bias. However, the practical implementation of these methods presents significant technical challenges, particularly in the selection of hyperparameters and in the physical interpretability of machine-learned descriptors. In this work, we systematically investigate these challenges by applying an unsupervised learning protocol to a fundamental problem in physical chemistry, namely, how ions perturb the local structure of water. Using Smooth Overlap of Atomic Positions (SOAP) descriptors, we demonstrate how the intrinsic dimension (ID) serves as a guide for selecting hyperparameters and interpreting structural complexity. Furthermore, we construct a high-dimensional free-energy landscape encompassing all water environments surrounding different ions. This analysis reveals how the physical properties of ions are reflected in their hydration shells, shaping the landscape through specific connections between different minima. Our findings highlight the difficulty of balancing algorithmic automation with the need to employ both physical and chemical intuition, particularly for the construction of meaningful descriptors and for the interpretation of final results. By critically assessing the methodological hurdles associated with unsupervised learning, we provide a roadmap for researchers looking to harness these techniques to study electrolytes and aqueous solutions in general.
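
As an illustration of the intrinsic dimension (ID) mentioned in the abstract, the following is a minimal sketch of one possible estimator: a maximum-likelihood variant of a two-nearest-neighbor approach. The abstract does not say which ID estimator the authors used, so the choice of estimator, the function name `twonn_intrinsic_dimension`, and the synthetic test data are all assumptions made for illustration only. In the setting of the paper the input points would be SOAP vectors of water environments; here a two-dimensional plane embedded in five dimensions stands in for them.

```python
import math
import random


def twonn_intrinsic_dimension(points):
    """Estimate the intrinsic dimension of a point cloud (illustrative sketch).

    For each point, take the ratio mu = r2 / r1 of the distances to its
    second and first nearest neighbors. Assuming locally uniform density,
    mu follows a Pareto law with exponent equal to the ID, whose
    maximum-likelihood estimate is N / sum(log(mu)).
    Brute-force O(N^2) neighbor search; only suitable for small data sets.
    """
    log_mu_sum = 0.0
    used = 0
    for i, p in enumerate(points):
        r1 = r2 = math.inf
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if d < r1:
                r1, r2 = d, r1  # new nearest; old nearest becomes second
            elif d < r2:
                r2 = d
        # Skip duplicated points (r1 == 0), for which mu is undefined.
        if r1 > 0 and math.isfinite(r2):
            log_mu_sum += math.log(r2 / r1)
            used += 1
    return used / log_mu_sum


# Hypothetical stand-in for descriptor vectors: a 2D plane embedded in 5D.
rng = random.Random(0)
plane = []
for _ in range(500):
    u, v = rng.random(), rng.random()
    plane.append((u, v, u + v, u - v, 2.0 * u))

estimate = twonn_intrinsic_dimension(plane)
print(f"estimated ID: {estimate:.2f}")
```

For this synthetic plane the estimate should come out close to 2, well below the embedding dimension of 5. One conceivable way to use such an estimate as a guide, again as an assumption and not a description of the authors' protocol, would be to repeat it for descriptors built with different hyperparameter settings and to compare how the ID changes.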

Opportunities and Challenges in Unsupervised Learning: The Case of Aqueous Electrolyte Solutions / Sormani, G., Rodriguez, A., Hassanali, A. - In: JOURNAL OF CHEMICAL THEORY AND COMPUTATION. - ISSN 1549-9626. - 21:16(2025), pp. 8060-8072. [10.1021/acs.jctc.5c00449]