The prediction-explanation fallacy: A pervasive problem in scientific applications of machine learning.
Marco Del Giudice
2024-01-01
Abstract
I highlight a problem that has become ubiquitous in scientific applications of machine learning and can lead to seriously distorted inferences. I call it the Prediction-Explanation Fallacy. The fallacy occurs when researchers use prediction-optimized models for explanatory purposes, without considering the relevant tradeoffs. This is a problem for at least two reasons. First, prediction-optimized models are often deliberately biased and unrealistic in order to prevent overfitting. In other cases, they have an exceedingly complex structure that is hard or impossible to interpret. Second, different predictive models trained on the same or similar data can be biased in different ways, so that they may predict equally well but suggest conflicting explanations. Here I introduce the tradeoffs between prediction and explanation in a non-technical fashion, present illustrative examples from neuroscience, and end by discussing some mitigating factors and methods that can be used to limit the problem.

File | Access | Description | Type | License | Size | Format
---|---|---|---|---|---|---
11235-Article-117561-2-10-20240322.pdf | Open access | Article | Publisher's version | Creative Commons | 825.39 kB | Adobe PDF