Effect of training set selection when predicting defaulter SMEs with unbalanced data
TORELLI, Nicola
2011-01-01
Abstract
We focus on credit scoring methods for separating defaulting small and medium enterprises (SMEs) from non-defaulting ones. In this setting, the proportion of defaulting firms is typically very close to zero, which leads to a class imbalance problem. A form of selection bias may also affect the classification: models are usually built on balance sheet items of large corporations, which are not randomly selected. We investigate how different sample selection criteria affect classification accuracy and how strongly this problem is related to the imbalance of the classes.
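As a minimal illustration of why class imbalance complicates the evaluation of a scoring model, the sketch below is not taken from the paper; the portfolio size, the 2% default rate, and the helper `evaluate` are all assumptions chosen for the example. It shows that a trivial rule labelling every firm as a non-defaulter reaches high accuracy while identifying no defaulters at all.

```python
def evaluate(y_true, y_pred):
    """Return (accuracy, recall on the defaulter class, coded as 1)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    positives = sum(y_true)
    accuracy = correct / len(y_true)
    recall = true_pos / positives if positives else 0.0
    return accuracy, recall


# Hypothetical portfolio: 1,000 firms, 2% defaulters (illustrative numbers only).
n_firms, n_defaulters = 1000, 20
y_true = [1] * n_defaulters + [0] * (n_firms - n_defaulters)

# Trivial rule: predict every firm as a non-defaulter.
y_pred = [0] * n_firms

accuracy, recall = evaluate(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, recall={recall:.3f}")
# prints "accuracy=0.980, recall=0.000"
```

Under these assumed numbers, overall accuracy alone says little about how well defaulters are detected, so it is usually read alongside a class-specific measure such as recall.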