RE.PUBLIC@POLIMI: Research Publications of the Politecnico di Milano

Improving Cybersecurity Awareness: Tweet Classification using Multilingual Sentence Embeddings and Contextual Features

Cotov A.;Bono C.;Cappiello C.;Pernici B.

2023-01-01

Abstract

The presence of accounts managed by cybersecurity experts, professionals, and organizations makes social media a valuable source for computer security awareness. By regularly capturing and analyzing the posts on emerging cyber threats, individuals and organizations can understand potential dangers in a timely manner and effectively implement mitigation strategies. However, retrieving relevant and informative posts from a social network is challenging due to the high percentage of posts containing uninformative content. This paper proposes a novel approach based on supervised classifiers for selecting relevant social media posts and categorizing them according to different types of vulnerabilities. To accomplish this task, we designed a pipeline combining text classifiers in cascade, training them on manually labelled data. We analyzed various neural network techniques, leveraging language-agnostic sentence-level embeddings and past user activity, validating these techniques in a cross-validation setup. With an achieved accuracy of 87%, our approach offers effective filtering and classification of social media posts, empowering cybersecurity professionals to stay informed and take appropriate measures.
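
The cascade described in the abstract (a relevance filter followed by a vulnerability-type classifier) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy bag-of-words `embed` function stands in for the language-agnostic sentence embeddings, the nearest-centroid `CentroidClassifier` stands in for the neural classifiers, and the example posts and the labels `relevant`, `irrelevant`, `injection`, and `memory` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def embed(text):
    """Toy stand-in for a sentence embedding: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CentroidClassifier:
    """Nearest-centroid classifier (an assumed substitute for a neural model)."""

    def fit(self, texts, labels):
        # Sum the toy embeddings of all training texts that share a label.
        self.centroids = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.centroids[label].update(embed(text))
        return self

    def predict(self, text):
        vec = embed(text)
        return max(self.centroids, key=lambda lab: cosine(vec, self.centroids[lab]))


def classify_post(text, relevance_clf, category_clf):
    """Cascade: stage 1 filters out irrelevant posts, stage 2 assigns a category."""
    if relevance_clf.predict(text) != "relevant":
        return None  # filtered out, never reaches the second classifier
    return category_clf.predict(text)


# Hypothetical, hand-written training posts (not data from the paper).
posts = [
    ("new sql injection flaw found in login form", "relevant", "injection"),
    ("patch released for sql injection in web app", "relevant", "injection"),
    ("buffer overflow exploit released for router firmware", "relevant", "memory"),
    ("great coffee at the conference this morning", "irrelevant", None),
    ("happy friday everyone enjoy the weekend", "irrelevant", None),
]

# Stage 1 is trained on all posts; stage 2 only on the relevant ones.
relevance_clf = CentroidClassifier().fit(
    [p[0] for p in posts], [p[1] for p in posts]
)
category_clf = CentroidClassifier().fit(
    [p[0] for p in posts if p[2]], [p[2] for p in posts if p[2]]
)

result = classify_post("buffer overflow in router firmware", relevance_clf, category_clf)
```

The second classifier is trained only on posts that pass the first stage, which is the defining property of a cascade. A real setup along the lines of the abstract would replace `embed` with a multilingual sentence encoder and evaluate both stages with cross-validation.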

Year of publication: 2023

Book title: Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023

Keywords: machine learning; posts classification; security vulnerabilities; social media analysis

Appears in types: 04.1 Contribution in Conference Proceedings

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1261163
