RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

EFSG: Evolutionary Fooling Sentences Generator

Di Giovanni M.;Brambilla M.

2021-01-01

Abstract

Large pre-trained language representation models (LMs) have recently achieved strong results on many NLP tasks. In 2018, BERT and, later, its successors (e.g. RoBERTa) obtained state-of-the-art results on classical benchmarks such as GLUE. Work on adversarial attacks has since been published to test their generalization properties and robustness. In this study, we propose the Evolutionary Fooling Sentences Generator (EFSG), a black-box, task-agnostic adversarial attack algorithm designed in an evolutionary fashion to generate false-positive sentences for binary classification tasks. We successfully apply EFSG to single-sentence (CoLA) and sentence-pair (MRPC) classification tasks on BERT and RoBERTa. The results show the presence of weak spots in state-of-the-art LMs. To complete the analysis, we perform transferability tests and an ablation study. Finally, adversarial training serves as a data-augmentation defence against EFSG, yielding more robust models with no loss of accuracy.
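
The abstract does not describe EFSG's operators, so the following is only a generic sketch of an evolutionary black-box search for high-scoring inputs, not the authors' algorithm. Every name in it (`evolve_fooling_sentences`, `toy_positive_probability`, `VOCAB`, and all parameters) is hypothetical, and the toy scorer is an assumed stand-in for a real classifier's positive-class probability.

```python
import random
from typing import Callable, List, Sequence

Sentence = List[str]

# Hypothetical vocabulary; a real attack would draw on the model's own vocabulary.
VOCAB = ["the", "cat", "is", "good", "blue", "runs", "of", "tree", "and", "slow"]


def toy_positive_probability(sentence: Sentence) -> float:
    """Assumed stand-in for a black-box model: only its output score is visible."""
    trigger = {"the", "is", "good"}
    return sum(tok in trigger for tok in sentence) / len(sentence)


def evolve_fooling_sentences(
    score: Callable[[Sentence], float],
    vocab: Sequence[str],
    length: int = 6,
    pop_size: int = 30,
    generations: int = 40,
    elite: int = 10,
    mutation_rate: float = 0.2,
    seed: int = 0,
) -> List[Sentence]:
    """Generic elitist evolutionary search that maximises a black-box score."""
    rng = random.Random(seed)
    # Start from random token sequences, which are unlikely to be real sentences.
    population = [[rng.choice(vocab) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the sequences the model scores highest.
        parents = sorted(population, key=score, reverse=True)[:elite]
        children = []
        while len(children) < pop_size - elite:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally replace a token with a random one.
            child = [rng.choice(vocab) if rng.random() < mutation_rate else tok
                     for tok in child]
            children.append(child)
        population = parents + children
    return sorted(population, key=score, reverse=True)


ranked = evolve_fooling_sentences(toy_positive_probability, VOCAB)
print(" ".join(ranked[0]), toy_positive_probability(ranked[0]))
```

Because the search only calls `score`, it needs no gradients or model internals, which is what makes such an attack black-box. Sequences that the model scores as positive despite being meaningless would count as false positives in the sense used above.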

Publication year: 2021
Book title: Proceedings - 2021 IEEE 15th International Conference on Semantic Computing, ICSC 2021
Series title: PROCEEDINGS IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING
ISBN (International Standard Book Number): 978-1-7281-8899-7
Appears in types: 04.1 Contribution in conference proceedings

Files in this record:

File	Access	Description	Size	Format
09364637.pdf	Restricted access	Main article: Publisher’s version	448.87 kB	Adobe PDF
11311-1169928_Di Giovanni.pdf	Open access	Post-Print (DRAFT or Author’s Accepted Manuscript, AAM)	459.69 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1169928
