Can we predict firms’ innovativeness? The identification of innovation performers in an Italian region through a supervised learning approach

Gandin, Ilaria; Cozza, Claudio
2019

Abstract

The study shows that it is feasible to predict firms' expenditures in innovation, as reported in the Community Innovation Survey, by applying a supervised machine-learning approach to a sample of Italian firms. Using an integrated dataset of administrative records and balance sheet data, designed to include the variables that are informative about innovation yet readily available for most of the cohort, a random forest algorithm is implemented to obtain a classification model that identifies firms that are potential innovation performers. The performance of the classifier, estimated in terms of AUC, is 0.794. Although innovation investments do not always result in patenting, the model is able to identify 71.92% of firms with patents. More encouraging results emerge from an analysis of the model's inner workings: the predictors identified as most important (such as firm size, sector membership and investment in intangible assets) confirm previous findings in the literature, though within a completely different framework. The outcomes of this study are relevant both for economic analysts, because they demonstrate the potential of data-driven models for understanding the nature of innovation behaviour, and for practitioners, such as policymakers or venture capitalists, who can benefit from evidence-based tools in the decision-making process.
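The two figures quoted in the abstract, an AUC and the share of patent-holding firms identified, can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy data and the 0.5 cut-off are all assumptions made for illustration, and the numbers it prints have no relation to the study's results. The AUC is computed here in its rank interpretation, as the probability that a randomly chosen innovating firm receives a higher score than a randomly chosen non-innovating one.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def share_identified(flags: Sequence[int], predicted: Sequence[int]) -> float:
    """Share of flagged firms (e.g. patent holders) predicted as innovators."""
    hits = [p for f, p in zip(flags, predicted) if f == 1]
    if not hits:
        raise ValueError("no flagged firms in the sample")
    return sum(hits) / len(hits)


# Hypothetical toy data, NOT taken from the study.
is_innovator = [1, 1, 1, 0, 0, 0, 1, 0]              # survey-based label
score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]     # classifier score per firm
has_patent = [1, 0, 1, 0, 0, 0, 1, 0]                # external patent flag

THRESHOLD = 0.5  # assumed cut-off; the abstract does not state one
predicted = [1 if s >= THRESHOLD else 0 for s in score]

print(f"AUC = {roc_auc(is_innovator, score):.3f}")
print(f"patent holders identified = {share_identified(has_patent, predicted):.2%}")
```

The pairwise loop is quadratic in the number of firms, which is fine for a sketch; on a real sample a rank-based or library implementation would be the usual choice.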
Published
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0218175
Files in this item:
journal.pone.0218175.pdf
Access: open access
Type: Publisher's version
License: Creative Commons
Size: 1.1 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: http://hdl.handle.net/11368/2971753