EFSG: Evolutionary Fooling Sentences Generator
Di Giovanni M.; Brambilla M.
2021-01-01
Abstract
Large pre-trained language representation models (LMs) have recently achieved strong results on many NLP tasks. In 2018 BERT, and later its successors (e.g. RoBERTa), obtained state-of-the-art results on classical benchmarks such as GLUE. Several works on adversarial attacks have since been published to test their generalization properties and robustness. In this study, we propose the Evolutionary Fooling Sentences Generator (EFSG), a black-box, task-agnostic adversarial attack algorithm designed in an evolutionary fashion to generate false-positive sentences for binary classification tasks. We successfully apply EFSG to single-sentence (CoLA) and sentence-pair (MRPC) classification tasks on BERT and RoBERTa. The results reveal weak spots in state-of-the-art LMs. To complete the analysis, we perform transferability tests and an ablation study. Finally, adversarial training, used as a data augmentation defence against EFSG, yields more robust models with no loss of accuracy.
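The abstract does not spell out EFSG's operators, so the following is only a minimal sketch of a generic evolutionary black-box search, not the authors' algorithm. It assumes a population of fixed-length word sequences, one-point crossover, random word replacement as mutation, and elitist selection. The names `evolve_fooling_sentences`, `toy_score`, and `vocab` are hypothetical, and `toy_score` is a stand-in for the positive-class probability that a real classifier such as BERT would return.

```python
import random
from typing import Callable, List, Sequence

Sentence = List[str]


def evolve_fooling_sentences(
    score: Callable[[Sentence], float],
    vocab: Sequence[str],
    length: int = 6,
    pop_size: int = 20,
    generations: int = 30,
    mutation_rate: float = 0.2,
    seed: int = 0,
) -> List[Sentence]:
    """Generic evolutionary search (illustrative assumption, not EFSG itself).

    Returns the final population sorted by score, best first.
    """
    rng = random.Random(seed)
    # Start from random word sequences drawn from the vocabulary.
    population = [
        [rng.choice(vocab) for _ in range(length)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Black-box step: only the classifier's output score is consulted.
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[: pop_size // 2]  # elitist selection keeps the top half
        children: List[Sentence] = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: replace each word with a random one at a small rate.
            child = [
                rng.choice(vocab) if rng.random() < mutation_rate else w
                for w in child
            ]
            children.append(child)
        population = parents + children
    return sorted(population, key=score, reverse=True)


def toy_score(sentence: Sentence) -> float:
    # Hypothetical stand-in for a model's positive-class probability:
    # the fraction of words that belong to a small "trigger" set.
    trigger = {"the", "is", "good"}
    return sum(w in trigger for w in sentence) / len(sentence)


vocab = ["the", "is", "good", "blue", "run", "of", "cat", "seven"]
final_population = evolve_fooling_sentences(toy_score, vocab)
best = final_population[0]
print(" ".join(best), toy_score(best))
```

Because the search only calls `score`, the same loop could in principle be pointed at any binary classifier that exposes a confidence for the positive class, which is what makes an attack of this kind black-box and task-agnostic.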