# Adversary Resistant Deep Neural Networks with an Application to Malware Detection

2019-9-19

Qinglong Wang, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, C. Lee Giles
KDD 2017 (CCF-A)

## Introduction

• Deep neural networks (DNNs) could help turn the tide in the war against malware infection.

• However, DNNs are vulnerable to adversarial samples.

• Past research on defense mechanisms relies on strong assumptions, which typically do not hold in many real-world scenarios. These techniques can also only be validated empirically and provide no theoretical guarantees. This is particularly disconcerting when they are applied to security-critical applications such as malware detection.

## Why It Works?

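• Introducing randomness makes it harder for attackers to find the DNN's "blind spots" (that is, adversarial examples, AEs).
• The adversary-resistant DNN requires only a small amount of extra work and maintains classification performance.
• In theory, the method proposed in the paper guarantees resistance to AEs.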

• Data Augmentation

Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014).

Towards deep neural network architectures robust to adversarial examples. arXiv:1412.5068 [cs] (2014).

Unifying Adversarial Training Algorithms with Flexible Deep Data Gradient Regularization. arXiv:1601.07213 [cs] (2016).

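Data augmentation mainly means training on potential AEs together with ordinary samples (adversarial training) to increase robustness to AEs. Adversarial training has been shown to be useful.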
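As a rough illustration of adversarial training, the sketch below generates adversarial samples with the fast gradient sign method (FGSM) from the first reference above and appends them to the training set. The logistic-regression model, the weights `w` and `b`, the step size `eps`, and the toy data are all assumptions made for illustration; none of them come from the paper.

```python
import numpy as np

def input_gradient(X, y, w, b):
    """Gradient of the per-sample cross-entropy loss w.r.t. each input row."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probability of class 1
    return (p - y)[:, None] * w[None, :]     # dL/dx = (p - y) * w for logistic regression

def fgsm(X, y, w, b, eps):
    """Move every sample by eps in the direction that increases its loss."""
    return X + eps * np.sign(input_gradient(X, y, w, b))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))                  # toy data: 6 samples, 4 features
y = (X[:, 0] > 0).astype(float)              # toy labels in {0, 1}
w = np.array([2.0, 0.0, 0.0, 0.0])           # hypothetical, already-trained weights
b = 0.0

X_adv = fgsm(X, y, w, b, eps=0.1)

# Adversarial training then fits the model on clean and adversarial samples together.
X_aug = np.vstack([X, X_adv])
y_aug = np.concatenate([y, y])
```

Each retraining round only covers the adversarial samples that were generated in that round, so the augmented set has to be rebuilt whenever new AEs appear.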

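Problems pointed out by the authors: the space of blind spots is too large, and an infinite space cannot be covered. Attackers can also attack the adversarially trained model itself (how?). Given the infinite space, the model has to be retrained every time new AEs are encountered, and this cycle repeats indefinitely.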

• Enhancing Model Complexity

These approaches increase the model's complexity.

Towards deep neural network architectures robust to adversarial examples. arXiv:1412.5068 [cs] (2014).

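Distillation as a defense to adversarial perturbations against deep neural networks. arXiv preprint arXiv:1511.04508 (2015). (Defensive distillation: the soft labels output by the first deep neural network are fed into a second network for training, which reduces the model's sensitivity to small perturbations. The entropy of the first model's soft labels encodes the relative differences between classes.)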
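The soft labels mentioned above can be sketched with a temperature-scaled softmax. The temperature `T` is part of the usual distillation recipe but is not mentioned in these notes, and the logits below are made-up values, so treat this as an illustration rather than the cited paper's exact setup.

```python
import numpy as np

def softmax_with_temperature(logits, T):
    """Softmax over the last axis after dividing the logits by temperature T."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)    # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

teacher_logits = np.array([[6.0, 2.0, 1.0]])  # hypothetical output of the first network

hard = softmax_with_temperature(teacher_logits, T=1.0)    # nearly one-hot
soft = softmax_with_temperature(teacher_logits, T=20.0)   # smoother soft labels

# In distillation, the second network would be trained against `soft` instead of one-hot labels.
```

A higher temperature spreads probability mass across classes, which is what lets the soft labels carry information about how similar the classes are.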

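Problem pointed out by the authors: an attacker can use two DNNs with similar performance to approximate the whole mechanism (the authors of that paper acknowledge that the mechanism is easy to approximate).

In addition, the mechanism is effectively a gradient-masking model and cannot withstand the JSMA attack.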

## Random Feature Nullification

We introduce random feature nullification to both the training and testing phases of DNN models, making the architectures non-deterministic.

## Model Description

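$X\in \mathbb{R}^{N\times M}$ ($N$ samples, each with $M$ features)

$\hat{I}_p\in \mathbb{R}^{N\times M}$ (mask matrix)

Nullification is the element-wise product of $X$ and $\hat{I}_p$.

$\lceil M\cdot p^i\rceil$: the number of zeros randomly placed in $I_{p^i}$, where $p^i$ is a single sample drawn from the Gaussian distribution $\mathcal{N}(\mu_p,\sigma^2_p)$.

(⊙ denotes the Hadamard product, a special kind of matrix multiplication between two matrices of the same dimensions: $c_{ij}=a_{ij}\cdot b_{ij}$)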
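The nullification step described above can be sketched in NumPy as follows. The function names, the toy sizes, and the values of `mu_p` and `sigma_p` are hypothetical. Clipping each sampled rate to $[0, 1]$ is also an assumption of this sketch, made so that the number of zeros never exceeds $M$.

```python
import math
import numpy as np

def nullification_mask(n_samples, n_features, mu_p, sigma_p, rng):
    """Build a binary mask whose row i contains ceil(M * p_i) randomly placed zeros."""
    mask = np.ones((n_samples, n_features))
    for i in range(n_samples):
        # p_i ~ N(mu_p, sigma_p^2); the clipping to [0, 1] is an assumption of this sketch
        p_i = float(np.clip(rng.normal(mu_p, sigma_p), 0.0, 1.0))
        n_zeros = math.ceil(n_features * p_i)
        zero_idx = rng.choice(n_features, size=n_zeros, replace=False)
        mask[i, zero_idx] = 0.0
    return mask

def nullify(X, mask):
    """Hadamard (element-wise) product of the input and the mask."""
    return X * mask

rng = np.random.default_rng(0)
X = rng.random((4, 10))                       # toy input: N = 4 samples, M = 10 features
mask = nullification_mask(4, 10, mu_p=0.3, sigma_p=0.05, rng=rng)
X_null = nullify(X, mask)                     # nullified input that would be fed to the DNN
```

Because a fresh mask is drawn on every call, the same input can be nullified differently at training time and at test time, which is what makes the model non-deterministic.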

## Analysis: Model Resistance to Adversaries

where $J_L(q)=\partial L(f(x_i,I_{p^i};\theta),\hat{y})/\partial q(\hat{x},I_p)$, and $I_p$ is the mask matrix used at test time.
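To see why the test-time mask matters to an attacker, here is a minimal sketch that assumes the nullification layer is $q(x, I_p) = x \odot I_p$, so that by the chain rule the gradient with respect to the input is $J_L(q)$ multiplied element-wise by $I_p$. The toy logistic loss, the weights `w`, and both masks are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def loss_grad_wrt_q(q, w, y):
    """J_L(q) for the toy loss L = log(1 + exp(-y * w.q)), with y in {-1, +1}."""
    margin = y * np.dot(w, q)
    return -y * w / (1.0 + np.exp(margin))

rng = np.random.default_rng(1)
M = 8
x = rng.random(M)                 # one toy sample with M features
w = rng.normal(size=M)            # hypothetical model weights
y = 1.0

mask_test = np.ones(M)
mask_test[[1, 4, 6]] = 0.0        # mask actually drawn at test time (hidden from the attacker)
mask_guess = np.ones(M)
mask_guess[[0, 2, 6]] = 0.0       # mask the attacker assumes instead

# Chain rule: dL/dx = J_L(q) * dq/dx, and dq/dx = I_p when q = x ⊙ I_p.
grad_true = loss_grad_wrt_q(x * mask_test, w, y) * mask_test
grad_guess = loss_grad_wrt_q(x * mask_guess, w, y) * mask_guess
```

In this sketch the true gradient is zero on every nullified feature, and the attacker's gradient differs from it wherever the two masks disagree, so a perturbation built from a guessed mask does not follow the true loss gradient.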