Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples

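When fake videos are exploited by malicious actors, public trust in the media keeps eroding, so detecting such videos has become a priority. Recently developed Deepfake detection systems are AI tools that use deep neural networks to tell real images from fake ones. We find that a small modification of existing methods is enough to quickly make a Deepfake detector classify a fake as real.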

tags: `adversarial`

Project website link

Paper link

Introduction

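Deepfake is a technique for replacing a person's face with someone else's. Face2Face (Thies et al., 2016), Neural Textures (Thies et al., 2019) and FaceSwap (Kowalski) are related face-manipulation techniques. These techniques can be harmless, but they can also be abused by malicious actors. Defenses against them have emerged in the form of Convolutional Neural Networks (CNNs) (Dolhansky et al., 2019; Rossler et al., 2019; Afchar et al., 2018; Amerini et al., 2019; Li & Lyu, 2019; Rahmouni et al., 2017). The most basic detection approach crops the face and uses a classifier to decide whether it is real or fake. Existing work, however, does not show that detection is reliable: detectors remain highly susceptible to adversarial examples. The method we present bypasses Deepfake detectors, and it works in the black-box setting as well as the white-box one. Finally, we offer the community some guidelines for detecting Deepfakes.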

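Background and Related Work

Generating Manipulated Videos

Several methods are in use:

FaceSwap (FS): a graphics-based approach. It locates the face from sparse facial landmarks, fits a projection onto the face by minimization, then blends the result back into the image and applies color correction.
Face2Face (F2F): transfers one person's facial expression onto the target person. The target's identity is not replaced; only the expression changes.
DeepFake: commonly used as a general term for deep-learning face-swapping methods, and also the name of one specific method. That method uses two auto-encoders with a shared encoder to reconstruct the source face and the target face. To create a fake image, the source face is fed into the decoder of the target face.
NeuralTextures (NT): a facial reenactment technique built on a GAN that learns a neural texture of the person.

Dataset: FaceForensics++ is the dataset created with the methods above. We also rely on this dataset for the face-manipulation models used here.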

Detecting Manipulated Videos

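Barni et al. (2018); Böhme & Kirchner (2013): detectors of forged images that are based on traditional methods (without deep learning) can be bypassed. Recent methods have shifted towards CNNs, for example to detect face swapping (Zhou et al., 2017), face morphing (Raghavendra et al., 2017) and splicing attacks (Bappy et al., 2017; 2019). Other work found that cues such as eye blinking or inconsistent head pose can also reveal a forged image, although these detectors have been broken as well.

The Deepfake detectors proposed in Rossler et al. (2019), Afchar et al. (2018) and Dolhansky et al. (2019) model Deepfake detection as a per-frame binary classification problem: the face is located in each frame, cropped, and fed to a classification model. The two best-performing models at present are XceptionNet and MesoNet.

XceptionNet (architecture from 2017, used as a Deepfake detector in 2019): the architecture was proposed in the original 2017 paper and combines depthwise separable convolutions with residual connections, which speeds up computation. Trained with transfer learning, it performs well (better than MesoNet).
MesoNet: comes in two variants, Meso-4 and MesoInception-4. Meso-4 is built from four convolutional layers, while MesoInception-4 replaces the first two layers with Inception modules. The results are shown in the figure below.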

Adversarial Examples

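Adversarial examples are inputs deliberately designed to make a machine learning model misclassify. They are mainly crafted to
(1) change the output value as much as possible.
(2) push the sample into a low-density region.

As a defense, some have proposed applying random operations to the input image, such as random cropping and JPEG compression, but these defenses have still been broken by adversarial attacks.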

Methodology

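We use a two-step pipeline:

(1) A face tracking model extracts the face region in every frame. Only the face is extracted because this takes less time than processing the whole frame and also performs better.
(2) The extracted face is modified and then fed to the CNN-based classification model.

Our goal is to make an input that is classified as fake be classified as real, using a change that goes unnoticed. Goodfellow et al. (2015) suggest the $L_\infty$ norm as the constraint on the perturbation, and it is also faster to compute than $L_2$ and $L_0$. Let $F(x)=\mathrm{softmax}(Z(x))=y$ and $C(x)=\arg\max_i F(x)_i$. For an input with $C(x_0)=Fake$, we want a perturbed $x_{adv}$ such that $C(x_{adv})=Real$ and $\|x_{adv}-x_0\|_\infty<\epsilon$.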
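The definitions above can be sketched in plain Python. This is only an illustration under stated assumptions: the label order `("Real", "Fake")`, the flat pixel lists and the helper names are hypothetical and not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax F(x) over a list of logits Z(x)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, labels=("Real", "Fake")):
    """C(x) = argmax_i F(x)_i, returned as a label (label order is assumed)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best]

def linf_distance(x_adv, x0):
    """L-infinity norm of the perturbation x_adv - x0 (flat pixel lists)."""
    return max(abs(a - b) for a, b in zip(x_adv, x0))

def is_successful(logits_adv, x_adv, x0, eps):
    """Attack goal: labelled Real while staying strictly inside the eps-ball."""
    return classify(logits_adv) == "Real" and linf_distance(x_adv, x0) < eps
```

With pixels scaled to $[0, 1]$, `is_successful(logits_adv, x_adv, x0, 16 / 255)` checks both conditions of the attack goal at once.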

Next we present the attack for each of the two settings, white box and black box.

White box attack

Iterative gradient sign method: we use this method to optimize the loss function, which is taken from Carlini & Wagner (2017) and chosen because it is robust against defensive distillation. As above, $Z(x)_y$ is the classifier output before the softmax. On uncompressed video the attack reaches 99.05%, but it is far less effective on video compressed with $MJPEG$, so another approach is needed.

Let $T$ be a set of input transformations, $x$ the input image and $y$ the target class. The transformations are:

(1) Gaussian blur: convolution of the original image with a Gaussian kernel $k$. $t(x)=k*x$
(2) Gaussian noise addition: add Gaussian noise $\theta \sim N(0,\sigma)$. $t(x) = x+\theta$
(3) Translation: pad the four sides with zeros and shift the pixels horizontally and vertically. $t(x) = x'[H,W,C]$ with $x'[i,j,c]=x[i+t_x,j+t_y,c]$, where $t_x$ and $t_y$ are the shifts along the x-axis and y-axis, respectively.
(4) Downsizing and upsizing: downsize by a factor $r$, then upsample by the same factor using bilinear resampling.

Because the attack has to succeed on the transformed $x$, a new loss function is needed: the expected loss over transformations drawn from $T$, which by the law of large numbers is approximated by an average over sampled transformations.
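A minimal sketch of the iterative gradient sign attack with a transformation-averaged loss follows. Several things here are assumptions rather than the paper's setup: the loss is taken to be the hinge $\max(Z(x)_{Fake} - Z(x)_{Real}, 0)$ in the style of Carlini & Wagner, the detector is a toy linear model so that the gradient can be written by hand, and only the Gaussian-noise transformation is sampled. All function names are hypothetical.

```python
import random

def logits(w, b, x):
    """Toy linear detector: Z(x)_k = w[k] . x + b[k], k = 0 (Real), 1 (Fake)."""
    return [sum(wi * xi for wi, xi in zip(w[k], x)) + b[k] for k in range(2)]

def hinge_loss(z):
    """max(Z_Fake - Z_Real, 0): zero once the input is classified Real."""
    return max(z[1] - z[0], 0.0)

def robust_grad(w, b, x, sigma, n, rng):
    """Gradient of the loss averaged over n noise transforms t(x) = x + theta."""
    grad = [0.0] * len(x)
    for _ in range(n):
        tx = [xi + rng.gauss(0.0, sigma) for xi in x]
        if hinge_loss(logits(w, b, tx)) > 0.0:  # hinge is active for this sample
            for j in range(len(x)):
                grad[j] += (w[1][j] - w[0][j]) / n
    return grad

def sign(v):
    return (v > 0) - (v < 0)

def attack(w, b, x0, eps=16 / 255, alpha=1 / 255, max_iter=100,
           sigma=0.01, n=12, seed=0):
    """Step against the sign of the averaged gradient, then project."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(max_iter):
        g = robust_grad(w, b, x, sigma, n, rng)
        if not any(g):  # loss is zero on every sampled transform
            break
        x = [xi - alpha * sign(gi) for xi, gi in zip(x, g)]
        # Project back into the eps-ball around x0 and the pixel range [0, 1].
        x = [min(max(xi, x0i - eps, 0.0), x0i + eps, 1.0)
             for xi, x0i in zip(x, x0)]
    return x
```

Setting `sigma=0.0` reduces this to the plain (non-robust) attack, since every sampled transformation is then the identity. With a real CNN the hand-written gradient would be replaced by automatic differentiation.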

Black box attack

In the black-box setting only the final probabilities for real and fake are available, so the algorithm has to change. We use Natural Evolutionary Strategies (NES): for an input $x$ and target label $y$, the goal is to maximize the expected probability of $y$ under the search distribution $\pi(\theta|x)$.

Maximize: $E_{\pi(\theta|x)}[F(\theta)_y]$

As in the original paper, the search distribution is random Gaussian noise around the current image $x$: $\theta = x + \sigma\delta$, where $\delta \sim N(0,I)$. The gradient is estimated from $n$ samples. The $\delta_i$ are drawn with antithetic sampling, $\delta_i=-\delta_{n-i+1}$, which helps the optimization.
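A hedged reconstruction of the $n$-sample gradient estimate, assuming the standard NES estimator for a Gaussian search distribution (this is an assumption; the exact expression is not preserved in these notes):

$$
\nabla_x\, E_{\pi(\theta|x)}[F(\theta)_y] \approx \frac{1}{\sigma n}\sum_{i=1}^{n}\delta_i\, F(x+\sigma\delta_i)_y
$$

Under antithetic sampling each pair $(\delta_i, -\delta_i)$ contributes $\delta_i\,\big(F(x+\sigma\delta_i)_y - F(x-\sigma\delta_i)_y\big)$, so the estimate behaves like a finite difference taken along random directions.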

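The basic procedure, Algorithm 1, is shown in the figure below; its transformation set contains only the identity function, $T=\{I(x)\}$. The gradient returned by Algorithm 1 is then plugged into the update rule. As in the white-box case, compressed images need extra handling, and the estimated gradient is again applied through Equation 3. The attack stops when it reaches the maximum number of iterations or succeeds earlier.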
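The loop described above can be sketched as follows, using only the identity transformation. This is an illustration under assumptions: `prob_real` stands for any black-box function returning the detector's probability of the Real class, the gradient uses the usual NES estimate with antithetic pairs, and the sign-step update with projection is assumed rather than copied from the paper. All names and default values other than $\epsilon$, $\alpha$ and the iteration cap are hypothetical.

```python
import random

def nes_gradient(prob_real, x, sigma, n, rng):
    """Estimate the gradient of prob_real at x from n queries (n even),
    drawn as antithetic pairs (delta, -delta)."""
    grad = [0.0] * len(x)
    for _ in range(n // 2):
        delta = [rng.gauss(0.0, 1.0) for _ in x]
        for s in (1.0, -1.0):
            theta = [xi + s * sigma * di for xi, di in zip(x, delta)]
            p = prob_real(theta)  # one black-box query
            for j, dj in enumerate(delta):
                grad[j] += s * dj * p / (n * sigma)
    return grad

def black_box_attack(prob_real, x0, eps=16 / 255, alpha=1 / 255,
                     max_iter=100, sigma=0.001, n=20, seed=0):
    """Ascend the estimated gradient until the frame is labelled Real
    or the iteration budget runs out."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(max_iter):
        if prob_real(x) > 0.5:  # attack already succeeded
            break
        g = nes_gradient(prob_real, x, sigma, n, rng)
        x = [xi + alpha * ((gi > 0) - (gi < 0)) for xi, gi in zip(x, g)]
        # Project back into the eps-ball around x0 and the pixel range [0, 1].
        x = [min(max(xi, x0i - eps, 0.0), x0i + eps, 1.0)
             for xi, x0i in zip(x, x0)]
    return x
```

Each iteration costs `n` queries for the gradient plus one for the success check, which is why query count matters much more here than in the white-box case.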

Experiment

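Dataset: FaceForensics++, 70 videos (29,764 frames in total)

Attacked models: XceptionNet and MesoNet

The detection accuracy of the models is reported for three input formats:

(1) Uncompressed: each video frame is saved as an image
(2) Compressed: the frames are compressed with JPEG
(3) Compressed: the video is saved as mp4, which also compresses along the time axis

Success Rate (SR): the rate at which attacked frames end up classified as real; SR-U for uncompressed, SR-C for compressed

Mean distortion ($L_{\infty}$): the average distortion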
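The two metrics can be computed as in the sketch below. The per-frame records are hypothetical, not numbers from the paper, and averaging the distortion over all attacked frames (rather than only the successful ones) is an assumption.

```python
def success_rate(results):
    """Fraction of attacked fake frames that the detector now labels Real."""
    return sum(r["label"] == "Real" for r in results) / len(results)

def mean_linf(results):
    """Mean L-infinity distortion over all attacked frames (an assumption)."""
    return sum(r["linf"] for r in results) / len(results)

# Hypothetical per-frame outcomes, for illustration only.
results = [
    {"label": "Real", "linf": 4 / 255},
    {"label": "Real", "linf": 8 / 255},
    {"label": "Fake", "linf": 16 / 255},
    {"label": "Real", "linf": 4 / 255},
]
print(f"SR = {success_rate(results):.2%}, mean L_inf = {mean_linf(results):.4f}")
```

SR-U and SR-C would be the same computation applied to the detector's labels before and after the frames are compressed.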

Settings: at most 100 iterations, learning rate $\alpha=\frac{1}{255}$, and maximum $L_{\infty}$ constraint $\epsilon = \frac{16}{255}$.

White box attack

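The attack success rate drops sharply on compressed images, so we switch to the compression-robust white-box attack. It uses 12 sampled transformation functions in total, 3 from each of the 4 transformation types, and raises the success rate considerably. In the comparison under the H.264 compression used for mp4, the attack reaches 80.39% and 90.50% at c = 40.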
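One way to assemble such a set of 12 transformations (4 types, 3 parameter choices each) is sketched below on grayscale images stored as nested lists. The specific parameter values, the 3x3 blur kernel and the align-corners bilinear resize are illustrative assumptions, not the settings used in the paper.

```python
import math
import random

def gaussian_blur(img, sigma):
    """3x3 Gaussian blur with zero padding (kernel size is an assumption)."""
    side = math.exp(-1.0 / (2.0 * sigma ** 2))
    k = [side, 1.0, side]
    k = [v / sum(k) for v in k]
    h, w = len(img), len(img[0])
    def px(i, j):
        return img[i][j] if 0 <= i < h and 0 <= j < w else 0.0
    return [[sum(k[a] * k[b] * px(i + a - 1, j + b - 1)
                 for a in range(3) for b in range(3))
             for j in range(w)] for i in range(h)]

def gaussian_noise(img, sigma, rng):
    """t(x) = x + theta with theta ~ N(0, sigma) per pixel."""
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in img]

def translate(img, tx, ty):
    """x'[i][j] = x[i + tx][j + ty], zero where the source index is outside."""
    h, w = len(img), len(img[0])
    return [[img[i + tx][j + ty] if 0 <= i + tx < h and 0 <= j + ty < w else 0.0
             for j in range(w)] for i in range(h)]

def resize_bilinear(img, new_h, new_w):
    """Bilinear resampling with corner pixels aligned."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(new_h):
        y = i * (h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(y)
        y1, fy = min(y0 + 1, h - 1), y - y0
        row = []
        for j in range(new_w):
            x = j * (w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(x)
            x1, fx = min(x0 + 1, w - 1), x - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bot * fy)
        out.append(row)
    return out

def down_up(img, r):
    """Downsize by factor r, then upsample back to the original size."""
    h, w = len(img), len(img[0])
    small = resize_bilinear(img, max(1, h // r), max(1, w // r))
    return resize_bilinear(small, h, w)

def build_transforms(rng):
    """Four transformation types with three hypothetical parameters each."""
    ts = [lambda x, s=s: gaussian_blur(x, s) for s in (0.5, 1.0, 2.0)]
    ts += [lambda x, s=s: gaussian_noise(x, s, rng) for s in (0.01, 0.02, 0.03)]
    ts += [lambda x, t=t: translate(x, *t) for t in ((1, 0), (0, 1), (-1, -1))]
    ts += [lambda x, r=r: down_up(x, r) for r in (2, 3, 4)]
    return ts
```

In a robust attack, each iteration would evaluate the loss on every `t(x)` returned by `build_transforms` and average the results before taking the gradient step.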

Black box attack

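In the black-box setting the attack success rate likewise drops sharply on compressed images. Applying the more robust variant with the transformations gives the results in the figure below: the success rate improves for XceptionNet, but not by much for MesoNet.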

Conclusion

For a detector, demonstrating strong robustness is difficult, so adversarial training is the suggested way to train one.

We recommend approaches similar to Adversarial Training (Goodfellow et al., 2015) to train robust Deepfake detectors.

That is, during training, an adaptive adversary continues to generate novel Deepfakes that can bypass the current state of the detector and the detector continues improving in order to detect the new Deepfakes.
