@wuxin1994 2017-11-16T14:10:41Z

吴帆1116

Study Notes 17


Discussed several questions about the papers with my advisor and considered possible contribution points.

Question 1: Adversarial examples crafted around the decision boundary succeed as evasion attacks by exploiting the limited generalization ability of neural network models, which is what lets them degrade classification/detection accuracy. However, a neural network can hardly capture the features of every individual sample, because the dataset may not follow any fully determined probability distribution in the first place: each sample may carry its own unique Gaussian noise. How to balance strengthening a model's generalization ability against the uniqueness of individual samples, so as to avoid the risk of adversarial attacks, is therefore itself an open question, and I have not yet seen research along these lines.

Question 2: Many papers note that attackers with different levels of knowledge of the target model have different attack capabilities. But I have not yet seen a study that isolates the influence of each such factor, for example, with everything else held equal, whether the attacker knows the model's feature selection. One could therefore use a controlled-variable approach to determine which factor influences attack effectiveness the most (a minimal experiment sketch follows this list).

Question 3: Constructing adversarial examples is usually posed as an optimization problem. Because the objective function is typically non-convex, it is hard to construct an optimal adversarial attack, and solving such non-convex optimization problems remains an open issue. "Towards Deep Learning Models Resistant to Adversarial Attacks" proposes a saddle-point formulation (reproduced after this list), but it is not a cure-all: its results on the CIFAR dataset were unsatisfactory.

Question 4: I compared several papers that apply adversarial attacks in real-world settings, e.g. "NO Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles", "Concrete Problems for Autonomous Vehicle Safety: Advantages of Bayesian Deep Learning", "Adversarial examples in the physical world", and "Standard detectors aren't (currently) fooled by physical adversarial stop signs". Current research has still not settled whether adversarial examples actually work in practical applications, i.e., whether a crafted adversarial example remains effective when it is printed out and fed to the model as a photograph. This leaves room for further experimental study.
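
Following up on Question 2, here is a minimal controlled-variable sketch. It is my own illustration rather than something taken from the papers above, and the synthetic dataset, logistic-regression models, and fixed FGSM-style perturbation budget are all assumptions. Everything is held constant except a single knowledge factor: whether the attacker knows which feature subset the target model uses.

```python
# Controlled-variable sketch (illustrative assumptions only): vary one knowledge
# factor (does the attacker know the target's feature selection?) while the data,
# model family, and perturbation budget stay fixed.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=10, random_state=0)
X_train, y_train = X[:1500], y[:1500]
X_test, y_test = X[1500:], y[1500:]

# The target model uses only the first 10 features (its "feature selection").
target_feats = np.arange(10)
target = LogisticRegression(max_iter=1000).fit(X_train[:, target_feats], y_train)

def attack_error(surrogate_feats, eps=0.5):
    """FGSM-style perturbation against a surrogate; measure transfer to the target."""
    surrogate = LogisticRegression(max_iter=1000).fit(
        X_train[:, surrogate_feats], y_train)
    # For logistic regression the input gradient of the loss is (p - y) * w,
    # so its sign is +sign(w) for y = 0 and -sign(w) for y = 1.
    direction = np.where(y_test == 0, 1.0, -1.0)[:, None] * np.sign(surrogate.coef_[0])
    X_adv = X_test.copy()
    X_adv[:, surrogate_feats] += eps * direction
    return 1.0 - target.score(X_adv[:, target_feats], y_test)  # error under attack

print("clean error:             ", 1.0 - target.score(X_test[:, target_feats], y_test))
print("knows feature selection: ", attack_error(target_feats))       # near white-box
print("guesses a wrong subset:  ", attack_error(np.arange(10, 20)))  # knowledge withheld
```

With the correct subset the surrogate is trained on exactly the same view as the target, so the attack is effectively white-box and the error rises sharply; with the wrong subset the perturbation only touches features the target never reads, so the error stays at the clean baseline. Repeating this for each knowledge factor in turn would show which one matters most.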
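
Following up on Question 3, the saddle-point formulation proposed in "Towards Deep Learning Models Resistant to Adversarial Attacks" casts attack and defense as a single min-max problem,

$$
\min_{\theta}\;\rho(\theta),\qquad
\rho(\theta)=\mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\max_{\delta\in\mathcal{S}}L(\theta,\,x+\delta,\,y)\Big]
$$

where $\mathcal{S}$ is the set of allowed perturbations (e.g. an $\ell_\infty$ ball of radius $\epsilon$). The inner maximization is the attacker's problem and the outer minimization the defender's; for deep networks both are non-convex, which is exactly the difficulty noted above. The paper approximates the inner maximization with projected gradient descent (PGD),

$$
x^{t+1}=\Pi_{x+\mathcal{S}}\Big(x^{t}+\alpha\,\mathrm{sign}\big(\nabla_{x}L(\theta,x^{t},y)\big)\Big)
$$

so the resulting defense is only as strong as this approximation, which may help explain the weaker CIFAR results mentioned above.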

Organized the complete outline; continuing to summarize and organize the papers.

1 Introduction
1.1 Machine learning
1.2 The security of machine learning
1.3 Definition of adversarial attack
1.3.1 Background and applications of machine learning and deep learning
1.3.2 Potential security risks of machine learning in practice
1.3.2.1 Machine learning models do not fully account for real-world conditions
1.3.2.2 Implementation vulnerabilities in deep learning frameworks
1.3.2.3 Malicious sample attacks by adversaries
1.3.3 Introduction of adversarial attacks and defensive policies
1.3.3.1 Why these security risks cannot be ignored
1.3.3.2 The process of making models more robust
1.3.3.3 The concept of adversarial attack and the difficulties of securing machine learning

2 Adversarial attack
2.1 Cause analysis and classification of adversarial attacks
2.1.1 Principles of adversarial attacks
2.1.1.1 Linear characteristics
2.1.1.2 Nonlinear characteristics
2.1.2 Black-box and white-box attacks
2.1.2.1 Criteria for distinguishing neural network models
2.1.2.2 Differences and connections between black-box and white-box attacks
2.1.3 Targeted and non-targeted attacks
2.1.4 Exploratory, evasion, and poisoning attacks
2.2 Classic adversarial attack algorithms by category
2.2.1 Exploratory attacks
2.2.1.1 Model inversion
2.2.1.2 Inferring useful information (inference attack)
2.2.1.3 Model extraction attacks using online APIs (black-box attack, API attack)
2.2.2 Evasion attacks
2.2.2.1 Gradient-based attacks
2.2.2.2 Universal perturbations
2.2.2.3 GAN-based attacks
2.2.3 Poisoning attacks
2.2.3.1 Poisoning attacks on early statistical machine learning models
2.2.3.2 Simple machine learning models
2.2.3.3 Deep learning models and neural networks
2.3 Summary of milestone research on adversarial attacks
2.3.1 FGSM
2.3.2 JSMA
2.3.3 DeepFool
2.3.4 Universal perturbation
2.3.5 RP2
2.3.6 C&W (Carlini & Wagner)
2.3.7 GAN-based attacks
2.3.8 Transferability of adversarial examples
...
2.4 Applications of adversarial attacks
2.4.1 Face recognition
2.4.2 Physical photo detection
2.4.3 Malware augmentation
2.4.4 Attacks on autonomous vehicles

3 Evaluation methods for the effectiveness of adversarial attacks
3.1 Biggio's evaluation method
3.2 Papernot's evaluation method
3.3 Moosavi-Dezfooli's evaluation method

4 Defensive policy
4.1 Defenses against evasion attacks
4.1.1 Feature reduction techniques
4.1.1.1 Wrapper-based adversarial feature selection (WAFS)
4.1.1.2 Domain-invariant feature extraction
4.1.1.3 Modeling the adversary as engaging in a sparse feature attack (sparse model)
4.1.1.4 Feature Cross-Substitution in Adversarial Classification
4.1.1.5 Binary particle swarm optimization (BPSO)
4.1.2 Multiple classifier systems
4.1.2.1 Multiple classifier systems (MCS) and nonlinear classifiers
4.1.2.2 One-and-a-Half-Class Multiple Classifier Systems for Secure Learning
4.1.3 Adversarial training
4.1.3.1 Adversarial training
4.1.3.2 Randomly generated malicious samples
4.1.3.3 Virtual adversarial training
4.1.3.4 Ensemble adversarial training
4.1.4 Classifier-specific defenses
4.1.4.1 SVM
4.1.4.2 Boosted tree models
4.1.4.3 DNNs
4.1.5 Detecting evasion attacks
4.1.5.1 An approach to the detection of malicious PDF files
4.1.5.2 Feature Squeezing
4.1.6 Others
4.1.6.1 Data Transformations
4.1.6.2 Deep Contractive Network
4.1.6.3 Gaussian data augmentation (GDA) + bounded ReLU (BReLU)
4.2 Defenses against poisoning attacks
4.2.1 Data sanitization
4.2.2 Improving algorithm robustness
4.2.2.1 Bootstrap Aggregating
4.2.2.2 RSM (Random Subspace Method)
4.2.2.3 Antidote

5 Survey of model robustness evaluation
5.1 Reactive and proactive arms races
5.1.1 Reactive arms race
5.1.2 Proactive arms race
5.2 Potential vulnerabilities of learning algorithms in adversarial environments
5.2.1 Support vector machines
5.2.2 Random forest classifiers
5.2.3 Neural networks
5.3 Frameworks for security evaluation
5.3.1 Modeling the Adversary
5.3.1.1 Adversary’s Goal
5.3.1.2 Adversary’s Knowledge
5.3.1.3 Adversary’s Capability
5.3.1.4 Attack Strategy
5.3.2 A Framework for Quantitative Security Analysis of Machine Learning

6 Promising directions in attack algorithms
6.1 Universal perturbation attacks
6.2 Combining adversarial attacks with GANs
6.3 Applications in constructing defensive policies
6.4 ……
