Detection of Human, Legitimate Bot, and Malicious Bot in Online Social Networks Based on Wavelets

Social interactions take place in environments that influence people's behaviours and perceptions. Nowadays, the users of Online Social Networks (OSNs) generate a massive amount of content based on social interactions. However, the wide popularity and ease of access of OSNs have created a perfect scenario for malicious activities, compromising their reliability. To detect automated information broadcasting in OSNs, we developed a wavelet-based model that classifies users as human, legitimate robot, or malicious robot based on spectral patterns obtained from users' textual content. We create the feature vector from the Discrete Wavelet Transform along with a weighting scheme called Lexicon-based Coefficient Attenuation. In particular, we induce a classification model using the Random Forest algorithm over two real Twitter datasets. The results show that the developed model achieved an average accuracy of 94.47% across two different scenarios: a single-theme one and a miscellaneous one.
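As a rough illustration of how spectral features can be derived from text with a Discrete Wavelet Transform, the sketch below applies a Haar DWT to a numeric signal built from a short string and summarizes each sub-band by its energy. The character-code encoding, the choice of the Haar wavelet, the band-energy features, and all function names are assumptions made for illustration. This record does not describe the paper's text-to-signal mapping or the Lexicon-based Coefficient Attenuation weighting, so neither is reproduced here.

```python
import math


def text_to_signal(text, length=8):
    """Hypothetical encoding: character code points, truncated or zero-padded to `length`."""
    codes = [float(ord(c)) for c in text[:length]]
    return codes + [0.0] * (length - len(codes))


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal):
    """Full Haar decomposition of a signal whose length is a power of two.

    Returns a list of bands: the final approximation first, then the detail
    bands ordered from coarsest to finest.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    bands = []
    approx = list(signal)
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    bands.reverse()
    return bands


def band_energies(signal):
    """Illustrative feature vector: sum of squared coefficients in each band."""
    return [sum(c * c for c in band) for band in haar_dwt(signal)]


if __name__ == "__main__":
    features = band_energies(text_to_signal("hello bot"))
    print(features)
```

Feature vectors of this kind, one per user, could then be passed to a Random Forest classifier with the three classes named in the abstract; that step is omitted here to keep the sketch dependency-free.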

Barbon Junior S;Campos GFC;Tavares GM;Igawa RA;Proenca ML;Guido RC

2018-01-01

Year: 2018
Publication status: Published
Journal: ACM Transactions on Multimedia Computing, Communications and Applications
DOI: https://dx.doi.org/10.1145/3183506
URL: https://dl.acm.org/doi/10.1145/3183506
Appears in types: 1.1 Journal Article

Files in this item:

File	Access	Type	License	Size	Format
jr2018.pdf	Closed access	Publisher's version	Publisher copyright	3.99 MB	Adobe PDF
3004459_jr2018-Post_print.pdf	Closed access	Final post-review draft (post-print)	Digital Rights Management not defined	4.51 MB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/3004459
