Repeated double cross-validation applied to the PCA-LDA classification of SERS spectra: a case study with serum samples from hepatocellular carcinoma patients

Gurian E. (Conceptualization); Di Silvestre A.; Mitri E.; Pascut D.; Tiribelli C.; Croce L. S.; Sergo V. (Writing – Review & Editing); Bonifacio A. (Writing – Original Draft Preparation)
2021

Abstract

Intense label-free surface-enhanced Raman scattering (SERS) spectra of serum samples were rapidly obtained on Ag plasmonic paper substrates using 785 nm excitation. Spectra from hepatocellular carcinoma (HCC) patients differed consistently from those of the control group. In particular, uric acid was relatively more abundant in patients, while hypoxanthine, ergothioneine, and glutathione were relatively more abundant in the control group. A repeated double cross-validation (RDCV) strategy was applied to optimize and validate principal component analysis-linear discriminant analysis (PCA-LDA) models. Analysis of the RDCV results indicated that a PCA-LDA model using up to the first four principal components achieves good classification performance (average accuracy of 81%). The analysis also allowed confidence intervals to be calculated for the figures of merit and the principal components used by the LDA to be interpreted in terms of metabolites, confirming that bands of uric acid, hypoxanthine, ergothioneine, and glutathione were indeed used by the PCA-LDA algorithm to classify the spectra.
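The structure of a repeated double cross-validation can be illustrated with a minimal, model-agnostic sketch. It is not the authors' implementation: the function names (`k_folds`, `rdcv`, `percentile_interval`), the fold and repeat counts, the tie-breaking rule, and the percentile-based interval are all assumptions made for illustration, and the `score` callback is a hypothetical stand-in for fitting a PCA-LDA model with a given number of principal components and returning its accuracy on held-out spectra.

```python
import random
from collections import Counter
from statistics import mean


def k_folds(indices, k, rng):
    """Shuffle the indices and split them into k nearly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def rdcv(n_samples, candidates, score, n_repeats=10, k_outer=5, k_inner=4, seed=0):
    """Repeated double cross-validation skeleton (illustrative only).

    score(param, train_idx, test_idx) is a hypothetical callback that fits
    a model with `param` (e.g. number of principal components) on train_idx
    and returns its accuracy on test_idx.
    """
    rng = random.Random(seed)
    outer_scores, chosen = [], []
    for _ in range(n_repeats):
        outer = k_folds(range(n_samples), k_outer, rng)
        for i, test_idx in enumerate(outer):
            # Calibration set: every outer fold except the current test fold.
            calib = [j for f, fold in enumerate(outer) if f != i for j in fold]
            inner = k_folds(calib, k_inner, rng)

            def inner_score(param):
                # Mean validation accuracy over the inner folds.
                return mean(
                    score(param,
                          [j for g, fold in enumerate(inner) if g != h for j in fold],
                          val_idx)
                    for h, val_idx in enumerate(inner)
                )

            # Ties go to the earliest candidate, so list simpler models first.
            best = max(candidates, key=inner_score)
            chosen.append(best)
            # The test fold is used only once, after the parameter is fixed.
            outer_scores.append(score(best, calib, test_idx))
    return outer_scores, Counter(chosen)


def percentile_interval(values, level=0.95):
    """Crude empirical interval from floor-indexed order statistics."""
    s = sorted(values)
    lo = s[int((1 - level) / 2 * (len(s) - 1))]
    hi = s[int((1 + level) / 2 * (len(s) - 1))]
    return lo, hi


def toy_score(param, train_idx, test_idx):
    """Stand-in scorer with made-up values; replace with a real model fit."""
    return 0.8 if param >= 3 else 0.6


scores, chosen = rdcv(n_samples=40, candidates=[1, 2, 3, 4], score=toy_score)
interval = percentile_interval(scores)
```

Because the outer test fold never takes part in choosing the parameter, the spread of `scores` across repeats gives an estimate of performance that is not biased by model selection, and the counts in `chosen` show how often each candidate was selected.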
Published
https://link.springer.com/article/10.1007/s00216-020-03093-7
Files in this item:
File: Gurian2021_Article_RepeatedDoubleCross-validation.pdf
Access: open access
Type: Publisher's version
License: Creative Commons
Size: 1.57 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11368/2993032
Citations
  • PubMed Central 1
  • Scopus 9
  • Web of Science 10