Web-Based Data Collection and Quality Issues in Co-Authorship Network Analysis

In this contribution we discuss data quality issues that arise when web scraping techniques are applied to the Cineca IRIS platform to derive co-authorship data among Italian university scholars. First, a semi-automatic tool is adopted to retrieve metadata from the platform; then, a network-based approach is considered to deal with author name disambiguation. This combined procedure is used to derive the co-authorship relations among Italian academic statisticians on the basis of the publications they entered into the IRIS system up to 2017.

Web-Based Data Collection and Quality Issues in Co-Authorship Network Analysis / De Stefano, D.; Fuccella, V.; Zaccarin, S. - Electronic. - (2019), pp. 811-815. (Smart Statistics for Smart Applications, Milano, June 18-21, 2019).

Domenico De Stefano; Vittorio Fuccella; Susanna Zaccarin

2019-01-01

Year: 2019
ISBN: 9788891915108
URL: https://it.pearson.com/content/dam/region-core/italy/pearson-italy/pdf/Dirigenti e istituzioni/ISTITUZIONI-HE-PDF-sis2019_V4.pdf
Appears in types: 4.1 Unpublished conference/congress contribution

Files in this item:

File	Size	Format
Zaccarin_Web-Based Data Collection and Quality Issues.pdf (Closed access. Description: contribution with title page and table of contents of the volume. Type: publisher's version. License: publisher copyright.)	1.53 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/2946992
