Loss Landscape Response to Adversarial Perturbation Is Architecture-Dependent: Comparing ViT and ResNet

Shivam Dubey, Jason Hoelscher-Obermaier

We investigate how adversarial perturbations alter the loss landscape geometry of a Vision Transformer (ViT-B/16) and ResNet-18 by measuring the curvature of the loss with respect to both inputs and model parameters (Hutchinson’s trace estimator, N=500 ImageNet samples, seven attack algorithms). We find strong architectural differences. ViT reacts similarly to all attack types, showing strongly negative curvature in both input space and parameter space, and the two curvatures remain correlated (r=0.38–0.63). ResNet-18 responds differently depending on attack type: boundary-seeking attacks (DeepFool, C&W) increase curvature in both spaces, whereas iterative loss-maximizing attacks (PGD, BIM, AutoAttack) reduce input-space curvature to near zero while parameter-space curvature remains large but bifurcates into positive and negative values (46–47% negative). Because input-space and parameter-space curvature diverge in this way, analyzing only one of them is not sufficient to characterize the adversarial response of ResNets.
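For readers unfamiliar with Hutchinson’s trace estimator, the following is a minimal sketch of the idea, not the authors’ implementation. It estimates the trace of a Hessian as the average of v^T H v over random probe vectors v with independent +1/-1 entries. Everything here is an assumption made for illustration: the function names are hypothetical, the Hessian-vector product is approximated by central differences of the gradient (the abstract does not say how the products were computed, and an automatic-differentiation framework would be the more typical choice), and the loss is a toy quadratic rather than a network loss.

```python
import random


def rademacher(dim, rng):
    """Draw a probe vector with independent +1/-1 entries."""
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]


def finite_diff_hvp(grad_fn, x, v, eps=1e-4):
    """Approximate the Hessian-vector product H(x) v.

    Uses a central difference of the gradient along direction v.
    This is an illustrative stand-in for an autodiff-based product.
    """
    g_plus = grad_fn([xi + eps * vi for xi, vi in zip(x, v)])
    g_minus = grad_fn([xi - eps * vi for xi, vi in zip(x, v)])
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]


def hutchinson_trace(grad_fn, x, num_probes=100, seed=0):
    """Estimate trace(H(x)) as the mean of v^T H v over random probes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_probes):
        v = rademacher(len(x), rng)
        hv = finite_diff_hvp(grad_fn, x, v)
        total += sum(vi * hvi for vi, hvi in zip(v, hv))
    return total / num_probes


# Toy loss f(x) = 0.5 * x^T A x, whose Hessian is A everywhere.
# The diagonal sums to 2.0 - 1.0 + 4.0 = 5.0, which is the exact trace.
A = [
    [2.0, 0.5, 0.0],
    [0.5, -1.0, 0.3],
    [0.0, 0.3, 4.0],
]


def toy_grad(x):
    """Gradient of the toy quadratic: A x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


estimate = hutchinson_trace(toy_grad, [0.1, -0.2, 0.3], num_probes=2000)
print(f"estimated trace: {estimate:.3f} (exact: 5.0)")
```

The same estimator applies to either space discussed in the abstract: differentiating the loss with respect to the input gives an input-space curvature estimate, and differentiating with respect to the parameters gives a parameter-space one. A negative estimate indicates that negative Hessian eigenvalues outweigh the positive ones.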
