
Towards Minimally Perturbed Adversarial Images with l0 Approximation

Villani, Francesco
2023/2024

Abstract

In recent years, machine learning has become the de facto standard for a wide range of human and computer tasks, spanning pattern recognition, language understanding, cyber-threat detection, and many other disciplines. Although these models often provide the best results in the field, it has been shown that inputs formed by applying small but deliberately worst-case perturbations can lead a model to output an incorrect answer. One way of crafting these minimally perturbed adversarial examples is to use gradient-based optimization algorithms coupled with distance metrics (e.g., lp norms) that enforce sparsity in the optimal solution. The metric best suited to this purpose is the l0 norm, for which, however, optimization is NP-hard. In this work we try to bridge this gap and show that an approximation of the l0 norm can be exploited to craft powerful adversarial examples with minimal perturbations. We empirically demonstrate the effectiveness and suitability of the resulting attacks on two state-of-the-art deep neural networks (ResNet18 and VGG16) trained on two different vision datasets (CIFAR10 and GTSRB). Finally, we compare our attack with the state of the art, showing that it offers a good trade-off between attack speed and effectiveness.
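The core idea the abstract describes, replacing the exact but NP-hard l0 norm with a differentiable surrogate so that gradient-based optimizers can enforce sparsity, can be illustrated with a short sketch. A common smooth surrogate is sum_i d_i^2 / (d_i^2 + sigma), whose terms approach 1 for entries much larger than sigma and 0 as entries vanish, so it approximates the count of nonzero entries as sigma goes to 0. The PyTorch sketch below is a minimal illustration under assumed names and settings (l0_surrogate, sparse_attack, the untargeted loss, and the hyperparameters lam, sigma, steps, lr are all illustrative assumptions, not the thesis' actual method):

    # Minimal sketch: gradient-based untargeted attack that penalizes a
    # smooth l0 surrogate instead of the exact (NP-hard) l0 norm.
    # All names and hyperparameters here are illustrative assumptions.
    import torch

    def l0_surrogate(delta, sigma=1e-3):
        # Smooth approximation of ||delta||_0: each term tends to 1 when
        # |delta_i| >> sqrt(sigma) and to 0 as delta_i -> 0.
        return (delta**2 / (delta**2 + sigma)).sum()

    def sparse_attack(model, x, y, steps=200, lr=0.01, lam=0.05, sigma=1e-3):
        delta = torch.zeros_like(x, requires_grad=True)
        opt = torch.optim.Adam([delta], lr=lr)
        loss_fn = torch.nn.CrossEntropyLoss()
        for _ in range(steps):
            opt.zero_grad()
            # Keep the perturbed image in the valid pixel range [0, 1].
            logits = model(torch.clamp(x + delta, 0.0, 1.0))
            # Maximize classification loss while keeping delta sparse.
            loss = -loss_fn(logits, y) + lam * l0_surrogate(delta, sigma)
            loss.backward()
            opt.step()
        return torch.clamp(x + delta.detach(), 0.0, 1.0)

In a sketch like this, lam trades off sparsity against attack strength, and after optimization the near-zero entries of delta can be hard-thresholded to zero to obtain a perturbation that is genuinely sparse under the exact l0 count.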
2023-03-16
Files in this item:
File: 867944-1270095.pdf (Open Access since 23/05/2024)
Type: Other attached material
Size: 3.4 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14247/14207