Automated Pixel-Scale Classification of Scanning Electron Microscopy Images via Vision Transformers
Tommaso Rodani (co-first author)
2025-01-01
Abstract
Large-scale analysis of Scanning Electron Microscopy (SEM) images is often limited by unreliable scale information, because proprietary file formats hide the metadata and Optical Character Recognition (OCR) of the scale bar is error-prone. We address this by fine-tuning a Vision Transformer (ViT) to classify the image pixel size (pico, nano, or micro) directly from the pixel data. Fine-tuned on a dataset of 17,700 SEM images, the model achieves 90.6% precision on a held-out test set. Notably, most misclassifications occur at the transitional nano-micro boundary, which suggests that the model learns physically meaningful feature representations. Our method provides a robust, automated tool for stratifying SEM archives by scale, enabling new large-scale studies in materials science and nanotechnology.
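As a minimal sketch of how the two evaluation claims above could be checked, the snippet below computes per-class precision and counts confusions between class pairs for the three labels. The abstract does not state whether the reported precision is macro- or micro-averaged, so the macro average here is an assumption. The function names and the toy label lists are hypothetical illustrations, not the paper's code or data.

```python
from collections import Counter

# The three pixel-size classes named in the abstract.
CLASSES = ("pico", "nano", "micro")


def confusion_counts(y_true, y_pred):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def per_class_precision(y_true, y_pred, classes=CLASSES):
    """Precision for class c = correct predictions of c / all predictions of c."""
    counts = confusion_counts(y_true, y_pred)
    precision = {}
    for c in classes:
        predicted_c = sum(counts[(t, c)] for t in classes)
        precision[c] = counts[(c, c)] / predicted_c if predicted_c else 0.0
    return precision


def macro_precision(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of the per-class precisions (assumed averaging scheme)."""
    p = per_class_precision(y_true, y_pred, classes)
    return sum(p.values()) / len(classes)


# Toy labels for illustration only; these are NOT results from the paper.
y_true = ["pico", "pico", "nano", "nano", "nano", "micro", "micro", "micro"]
y_pred = ["pico", "pico", "nano", "nano", "micro", "micro", "micro", "nano"]

if __name__ == "__main__":
    print(per_class_precision(y_true, y_pred))
    print(macro_precision(y_true, y_pred))
    # Off-diagonal pairs show where errors concentrate, e.g. nano vs. micro.
    print(confusion_counts(y_true, y_pred))
```

In this toy data every error falls on a nano/micro pair, which is the pattern the abstract describes; on real predictions the same off-diagonal counts would show whether errors cluster at that boundary.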