@wuxin1994 2017-09-07T16:00:36.000000Z

# Summary of Adversarial Attack Papers


## Attack

1. Classical optimization methods: gradient descent, Newton's method, BFGS, L-BFGS
2. Jacobian-based saliency map attack (JSMA): "The Limitations of Deep Learning in Adversarial Settings"
3. FGSM: "Explaining and Harnessing Adversarial Examples"; an iterative version of FGSM: "Adversarial Examples in the Physical World" (smaller perturbation)
4. RP2: "Robust Physical-World Attacks on Machine Learning Models"
5. Papernot's method: "Adversarial Perturbations Against Deep Neural Networks for Malware Classification"
6. Universal perturbations (an extension of DeepFool): "Analysis of Universal Adversarial Perturbations", "Universal Adversarial Perturbations" (Thought: following the universal-perturbation principle, could we add the same perturbation to every input of a network so that the model classifies better and yields higher-confidence results?)
7. DeepFool: "DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks", the first method to compute and apply the minimal perturbation necessary for misclassification under the L2 norm (the approximation is more accurate than FGSM and faster than JSMA, but still computationally expensive)
8. "Towards Evaluating the Robustness of Neural Networks" (the authors cast the formulation of Szegedy et al. into a more efficient optimization problem, which allows them to craft effective adversarial samples with low distortion; also very expensive)
9. Virtual adversarial examples: "Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning"
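The gradient-sign idea behind FGSM can be sketched in a few lines. Below is a minimal illustration on a toy logistic-regression classifier; the weights, input, and ε are invented for the example, while the paper applies the same one-step rule to deep networks:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One-step FGSM: x_adv = x + eps * sign(dJ/dx), for binary cross-entropy J."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w          # dJ/dx for logistic regression
    return x + eps * np.sign(grad_x)

# toy example: a point correctly classified as class 1
w = np.array([2.0, -1.0]); b = 0.0
x = np.array([1.0, 0.5]); y = 1.0
x_adv = fgsm(x, y, w, b, eps=0.6)

p_clean = sigmoid(w @ x + b)      # > 0.5: classified as class 1
p_adv = sigmoid(w @ x_adv + b)    # pushed below 0.5: misclassified
```

The iterative version cited above simply repeats this step with a smaller ε, clipping the result back into a valid range after each step.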

Application scenarios: (autonomous vehicles) "Concrete Problems for Autonomous Vehicle Safety: Advantages of Bayesian Deep Learning"; (malware classification) "Adversarial Perturbations Against Deep Neural Networks for Malware Classification"

"Adversarial Attacks on Image Recognition" mentions that the input data can be preprocessed with PCA for dimensionality reduction.

## Defence

Two directions for defence (mentioned as future work in "Adversarial Attacks on Neural Network Policies"):
1. Add adversarial examples to the training set; that is, generate adversarial examples manually and mix them into the training data (but generating adversarial examples is relatively expensive).
2. Add a module at test time that detects adversarial inputs, judging whether an input carries an adversarial attack.

Defence methods:

1. Adversarial training (augmenting the training data with perturbed examples): "Intriguing Properties of Neural Networks" (either feed the model both true and adversarial examples, or train it with the modified objective function
Ĵ(θ, x, y) = αJ(θ, x, y) + (1 − α)J(θ, x + Δx, y))
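A minimal sketch of this modified objective, assuming a binary logistic-regression model with cross-entropy loss J and an FGSM-style perturbation Δx (the model, values, and function names here are illustrative, not from the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, y):
    # binary cross-entropy J for a single example
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def adversarial_objective(w, b, x, y, eps=0.25, alpha=0.5):
    """J^(θ,x,y) = α·J(θ,x,y) + (1−α)·J(θ, x+Δx, y), with Δx = ε·sign(∇x J)."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w                 # dJ/dx for this model
    x_adv = x + eps * np.sign(grad_x)    # FGSM perturbation
    p_adv = sigmoid(w @ x_adv + b)
    return alpha * bce(p, y) + (1 - alpha) * bce(p_adv, y)

w = np.array([2.0, -1.0]); b = 0.0
x = np.array([1.0, 0.5]); y = 1.0
loss = adversarial_objective(w, b, x, y)
```

Minimizing this mixture pulls the decision boundary away from both the clean point and its perturbed neighbor, with α trading off the two terms.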

2. Defensive distillation: "Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks" hardens the model in two steps: first, a classification model is trained and its softmax layer is smoothed by division with a constant T; then, a second model is trained using the same inputs, but instead of feeding it the original labels, the probability vectors from the last layer of the first model are used as soft targets. ("Adversarial Perturbations of Deep Neural Networks" altered this method so that an attack can be constructed in a single step.)
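The softmax smoothing in step one is just division of the logits by the temperature T before exponentiation; larger T pushes the output distribution toward uniform. A small numpy sketch (the logits and T values are arbitrary examples):

```python
import numpy as np

def softmax_T(logits, T=1.0):
    """Softmax with temperature T; larger T yields smoother (softer) targets."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()            # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([4.0, 1.0, 0.5])
hard_targets = softmax_T(logits, T=1.0)   # nearly one-hot
soft_targets = softmax_T(logits, T=20.0)  # close to uniform; used as soft targets
```

The second model in the distillation pipeline is trained against vectors like `soft_targets` instead of one-hot labels.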

3. Feature squeezing: "Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks", "Feature Squeezing Mitigates and Detects Carlini/Wagner Adversarial Examples"
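One squeezer proposed in those papers is color bit-depth reduction; the detector then compares the model's prediction on the original and squeezed inputs and flags the example when they disagree too much. A sketch of the squeezer, with the comparison step only described in a comment (the array values are made up):

```python
import numpy as np

def squeeze_bit_depth(x, bits):
    """Reduce an image with values in [0, 1] to 2**bits gray levels."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

x = np.array([0.03, 0.49, 0.51, 0.97])
squeezed = squeeze_bit_depth(x, bits=1)   # collapses to {0.0, 1.0}
# Detection idea: for some model f, flag x as adversarial when
# ||f(x) - f(squeeze(x))||_1 exceeds a chosen threshold.
```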

4. Detection systems:
   - perform statistical tests: "On the (Statistical) Detection of Adversarial Examples"
   - use an additional model for detection: "Adversarial and Clean Data Are Not Twins", "On Detecting Adversarial Perturbations"
   - apply dropout at test time: "Detecting Adversarial Samples from Artifacts"

5. PCA whitening: "Early Methods for Detecting Adversarial Images"
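PCA whitening rotates the data into its principal components and rescales each component to unit variance, which can expose perturbation energy hiding in low-variance directions. A minimal numpy sketch on synthetic data (the mixing matrix is invented for the example):

```python
import numpy as np

def pca_whiten(X, eps=1e-8):
    """Rotate rows of X into principal components and rescale to unit variance."""
    Xc = X - X.mean(axis=0)                  # center the data
    cov = Xc.T @ Xc / len(Xc)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigendecomposition of covariance
    return (Xc @ eigvecs) / np.sqrt(eigvals + eps)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ np.array([[2.0, 1.0, 0.0],
                                          [0.0, 1.0, 0.0],
                                          [0.0, 0.0, 3.0]])
Xw = pca_whiten(X)
cov_w = Xw.T @ Xw / len(Xw)                  # ≈ identity after whitening
```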

## Research directions

1. Optimizing the various attack methods
2. Building defence strategies
3. Attack and defence in specific application scenarios
4. Using attack and defence strategies to refine existing model architectures and strengthen machine-learning models
5. Investigating the deeper principles of adversarial attacks and the mathematics behind them; in effect, this means understanding how deep neural networks work and why adversarial attacks succeed.

Thought: the adversarial-attack problem resembles a limitation on a neural network's generalization ability. Since a very small added perturbation can drive the model's error rate up sharply, the model's generalization is not strong enough, and its performance on other samples is also weak. Strengthening the generalization ability of neural network models is therefore another direction for defence strategies, especially against black-box attacks.

## Significance (citing from )

1. Able to handle massive volumes of data
2. Works at machine speed to thwart attacks
3. Does not rely on signatures
4. Can stop known and unknown malware
5. Stops malware pre-execution
6. Higher detection rates, lower false positives
