
On the Similarity between Hidden Layers of Pruned and Unpruned Convolutional Neural Networks

Zullich, Marco; Pellegrino, Felice; Medvet, Eric
2020-01-01

Abstract

Over the last few decades, artificial neural networks (ANNs) have achieved enormous success in regression and classification tasks. This empirical success has not been matched by an equally strong theoretical understanding of such models, as some of their working principles (training dynamics, generalization properties, and the structure of inner representations) remain largely unknown. It is, for example, particularly difficult to explain the well-known fact that ANNs achieve remarkable levels of generalization even under severe over-parametrization. In our work, we explore a recent network compression technique, called Iterative Magnitude Pruning (IMP), and apply it to convolutional neural networks (CNNs). The pruned and unpruned models are compared layer-wise with Canonical Correlation Analysis (CCA). Our results show a high similarity between the layers of pruned and unpruned CNNs in the first convolutional layers and in the fully-connected layer, while for the intermediate convolutional layers the similarity is significantly lower. This suggests that, although the representations in the intermediate layers of pruned and unpruned networks are markedly different, in the last part of the network the fully-connected layers act as pivots, producing not only similar performances but also similar representations of the data, despite the large difference in the number of parameters involved.
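The abstract names two ingredients: magnitude-based pruning (IMP repeatedly removes the smallest-magnitude weights and retrains the survivors) and layer-wise CCA between the activations of the pruned and unpruned networks. The sketch below is a minimal, hypothetical illustration of both in NumPy, not the authors' implementation: the function names, the `prune_fraction` parameter, and the synthetic data are assumptions made for the example, and a full IMP pipeline would additionally retrain the network after each pruning round and iterate.

```python
# Hypothetical sketch (not the paper's code) of one magnitude-pruning step
# and of CCA similarity between two layers' activations.
import numpy as np

def magnitude_mask(weights: np.ndarray, prune_fraction: float) -> np.ndarray:
    """Return a 0/1 mask that zeroes the `prune_fraction` smallest-magnitude weights."""
    flat = np.abs(weights).ravel()
    k = int(prune_fraction * flat.size)
    if k == 0:
        return np.ones_like(weights)
    threshold = np.partition(flat, k - 1)[k - 1]
    return (np.abs(weights) > threshold).astype(weights.dtype)

def cca_similarity(acts_a: np.ndarray, acts_b: np.ndarray) -> float:
    """Mean canonical correlation between two activation matrices of shape
    (n_examples, n_neurons); values close to 1 indicate similar representations."""
    a = acts_a - acts_a.mean(axis=0)
    b = acts_b - acts_b.mean(axis=0)
    # Orthonormal bases via QR; the singular values of Qa^T Qb are the
    # canonical correlations (Bjorck-Golub algorithm).
    qa, _ = np.linalg.qr(a)
    qb, _ = np.linalg.qr(b)
    corrs = np.linalg.svd(qa.T @ qb, compute_uv=False)
    return float(corrs.mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 128))
    mask = magnitude_mask(w, prune_fraction=0.8)  # keep the 20% largest weights
    print("surviving weights:", int(mask.sum()), "/", mask.size)

    acts_unpruned = rng.normal(size=(500, 64))
    acts_pruned = acts_unpruned * 0.9 + rng.normal(size=(500, 64)) * 0.3  # toy "pruned" layer
    print("mean CCA similarity:", round(cca_similarity(acts_unpruned, acts_pruned), 3))
```

In the paper's setting, `acts_unpruned` and `acts_pruned` would be the activations of corresponding layers of the unpruned and IMP-pruned CNNs on the same set of inputs, and the mean canonical correlation would be computed for each layer pair.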
Files in this item:

ICPRAM_2020_42.pdf
Access: closed
Description: Main article
Type: Publisher's version
License: Publisher copyright
Size: 358.16 kB
Format: Adobe PDF

2020_ICPRAM_PrunedLayersSimilarity.pdf
Access: open
Description: Main article
Type: Final post-peer-review draft (post-print)
License: Creative Commons
Size: 178.63 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/2961030
Citations
  • Scopus: 5
  • ISI (Web of Science): 2