Published: 2022-02-16
DOI: 10.11834/jig.210546
2022 | Volume 27 | Number 2

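3D Shape Analysis

Point cloud replacement adversarial attack based on saliency map

Liu Fuchang¹, Nan Bo¹, Miao Yongwei¹,²

1. School of Information Science and Technology, Hangzhou Normal University, Hangzhou 311121, China;

2. College of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China

Received: 2021-07-05; Revised: 2021-09-14; Preprint: 2021-09-21

Supported by: National Natural Science Foundation of China (61972458); Natural Science Foundation of Zhejiang Province, China (LY20F020017)

About the authors: Liu Fuchang, born in 1982, male, associate professor and master's supervisor. His research interests include computer graphics, computer vision, and deep learning. E-mail: liufc@hznu.edu.cn
Nan Bo, male, master's student. His research interests include computer graphics and computer vision. E-mail: 2020111011008@stu.hznu.edu.cn
Miao Yongwei, corresponding author, male, professor and doctoral supervisor. His research interests include computer graphics, digital geometry processing, computer vision, and machine learning. E-mail: ywmiao2009@hotmail.com
*Corresponding author: Miao Yongwei ywmiao2009@hotmail.com

CLC number: TP37

Document code: A

Article ID: 1006-8961(2022)02-0500-11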

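Abstract

Objective Research on adversarial attacks has traditionally focused on 2D images. Modifying a 3D object directly changes its 3D characteristics, so generating an imperceptible perturbation is very difficult, and there has been little research on adversarial attacks against 3D point cloud data. Deep neural networks for tasks such as point cloud classification and point cloud segmentation are generally vulnerable to point cloud adversarial samples, which cause the network to make wrong decisions. This paper therefore proposes a point cloud replacement adversarial attack based on saliency maps. Method Existing point cloud classification networks usually rely on the critical points of a point cloud. The proposed method computes the saliency value of each point by moving the point to the point cloud center, builds the point cloud saliency map from these values, and selects the sampling points with the highest saliency values as the critical point set, so that the attack has a larger influence on the classification result. The Chamfer distance is used to measure the difference between point clouds, and the critical point set is replaced with that of the model in the point cloud library that has the smallest Chamfer distance, which minimizes the perturbation and makes it hard for the human eye to notice. Result Comparative experiments are conducted on the ModelNet40 dataset with the point cloud classification networks PointNet and PointNet++. On PointNet, the attack success rate of the proposed method is 38.6, 7.3, and 41 percentage points higher than that of FGSM (fast gradient sign method), I-FGSM (iterative fast gradient sign method), and JSMA (Jacobian-based saliency map attack), respectively; when 100 sampling points are perturbed, the proposed method lowers the network accuracy to 6.2%. On PointNet++, the attack success rate is 58.6 and 85.3 percentage points higher than that of FGSM and JSMA, respectively; when 100 sampling points are perturbed, the network accuracy drops to 12.8%. Conclusion The proposed point cloud adversarial attack considers both the efficiency of the attack and the imperceptibility of the adversarial samples, and it attacks mainstream point cloud deep neural networks efficiently.

Keywords

point cloud adversarial attack; saliency map; Chamfer distance; PointNet; PointNet++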

Point cloud replacement adversarial attack based on saliency map

Liu Fuchang¹, Nan Bo¹, Miao Yongwei¹,²

1. School of Information Science and Technology, Hangzhou Normal University, Hangzhou 311121, China;

2. College of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China

Supported by: National Natural Science Foundation of China(61972458); Natural Science Foundation of Zhejiang Province, China(LY20F020017)

Abstract

Objective Deep learning networks are vulnerable to well-crafted adversarial samples, which cause them to produce erroneous results. However, current research on adversarial attacks mostly focuses on 2D images and convolutional neural networks (CNNs), and research on 3D data such as point clouds is scarce. In recent years, deep learning has achieved great success on 3D data. Because 3D object classification is used in many safety-critical applications, such as autonomous driving, it is important to study how point cloud adversarial samples affect current 3D deep learning networks. Researchers have recently made great progress on tasks such as object classification and instance segmentation using deep neural networks on point clouds, with PointNet and PointNet++ as the classical representatives. Because security plays a vital role in deep learning systems, robustness against attacks is a key question for 3D deep learning. Many studies have shown that deep neural networks for processing 2D images are extremely weak against adversarial samples, and most defense methods have been defeated by adversarial attacks. For instance, the fast gradient sign method (FGSM) is a classical attack algorithm that makes a neural network recognize a panda as a gibbon, whereas humans cannot distinguish the pictures before and after the attack. The iterative fast gradient sign method (I-FGSM) was subsequently proposed to improve FGSM, making the attack more successful and more difficult to defend against and highlighting the challenge posed by adversarial attacks. An important concept is introduced in PointNet. Its authors indicate that PointNet can correctly classify a point cloud using only a subset of its points; the points in this subset determine the classification result and are called the critical points. Moreover, the authors point out that the strong robustness of PointNet depends on the existence of the critical points. However, the theory of critical points is still inadequate: the concept is vague because it does not provide an importance value for each point or subset. The point cloud saliency map addresses this problem because it estimates the importance of every single point. After the importance of each point is computed, the k most important points can be perturbed to generate adversarial samples and attack the network.
Method Based on the above analysis of critical points, a point cloud saliency map is first built to enhance the effectiveness of the attack. In constructing the saliency map, critical points are estimated iteratively to account for dependencies between different points. After the saliency score of each point is estimated, the proposed algorithm perturbs the k points with the highest saliency scores. Specifically, the k points with the highest saliency scores in the input point cloud are exchanged with the critical points of the point cloud that has the smallest Chamfer distance to it. The Chamfer distance is often used to measure the difference between two point clouds: the smaller the difference, the smaller the Chamfer distance, so point clouds with a smaller Chamfer distance appear more similar. The proposed method therefore not only limits the search space but also minimizes the perturbation of the point cloud, so the adversarial sample is imperceptible to human eyes.
Result Experiments are conducted on the ModelNet40 dataset, which contains 40 object categories. PointNet and PointNet++, the most popular point cloud classification models, are used as victim networks. Our method is compared with classical white-box attack algorithms and is also validated against several classic defense algorithms. On PointNet, the attack success rate is 38.6 percentage points higher than that of FGSM, 7.3 percentage points higher than that of I-FGSM, and 41 percentage points higher than that of the Jacobian-based saliency map attack (JSMA). When the perturbation is limited to 100 points, the network accuracy is reduced to 6.2%. Against the random point drop defense, a success rate of 97.9% is still achieved; against the outlier removal defense, a success rate of 98.6% is achieved. On PointNet++, the attack success rate is 58.6 percentage points higher than that of FGSM and 85.3 percentage points higher than that of JSMA. When the perturbation is limited to 100 points, the network accuracy is reduced to 12.8%. Against the random point drop defense, a success rate of 94.6% is still achieved; against the outlier removal defense, a success rate of 95.6% is achieved. Experiments on the influence of the number of perturbed points are also conducted. When 25, 50, 75, and 100 points are perturbed, the accuracy of PointNet decreases to 33.5%, 21.7%, 16.5%, and 13.5%, respectively, and that of PointNet++ decreases to 16.3%, 14.7%, 13.2%, and 12.8%, respectively.
Conclusion The proposed attack algorithm considers both the efficiency of the attack and the imperceptibility of the adversarial samples. It attacks mainstream point cloud deep neural networks efficiently, and it still succeeds easily against several simple defense algorithms.

Key words

point cloud adversarial attack; saliency map; Chamfer distance; PointNet; PointNet++

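0 Introduction

Generating adversarial samples and defending against adversarial attacks are important problems in information security, and their applications in computer graphics and computer vision have received wide attention (Akhtar and Mian, 2018). Attacks on convolutional neural networks for image classification were the first to attract researchers. Based on a spectral analysis of deep neural networks, Szegedy et al. (2014) pointed out that the nonlinearity of deep neural networks makes the input-output mapping discontinuous and that, together with overfitting caused by insufficient model averaging and insufficient regularization, this makes deep neural networks highly vulnerable to perturbation attacks by adversarial samples. To make such attacks more efficient, Goodfellow et al. (2015) proposed the fast gradient sign method (FGSM), which computes the gradient of the loss with respect to the input and adds a perturbation along the sign of that gradient, thereby increasing the loss and pushing the adversarial sample away from the original sample. Because FGSM performs only a single gradient update, which is sometimes insufficient to produce a perturbation that makes the model fail, Kurakin et al. (2017) proposed the iterative fast gradient sign method (I-FGSM), which applies FGSM repeatedly with a small step size and further strengthens the attack. Papernot et al. (2016) and Moosavi-Dezfooli et al. (2016) constructed image adversarial samples from the perspectives of the network's gradient with respect to the input image and of the classification hyperplanes, respectively, and proposed corresponding efficient attack methods. Arnab et al. (2018) evaluated the robustness of segmentation networks to image adversarial attacks. Inkawhich et al. (2019) minimized the Euclidean distance between the vectorized representations of the source and target images in feature space, making the two images as similar as possible in feature space and thereby producing transferable adversarial samples. Zhou et al. (2020) proposed a black-box attack that trains a substitute model without real training data, using a generative adversarial network to generate synthetic samples for training the substitute model.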
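The two gradient-based attacks described above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the experiments: `grad_fn` is a hypothetical caller-supplied function returning the gradient of the loss with respect to the input, and the toy quadratic loss, `epsilon`, `step`, and `num_steps` are assumptions chosen only to make the example runnable.

```python
import numpy as np

def fgsm(x, grad_fn, epsilon):
    """Single-step FGSM: move every coordinate by epsilon along the sign of dL/dx."""
    return x + epsilon * np.sign(grad_fn(x))

def i_fgsm(x, grad_fn, epsilon, step, num_steps):
    """Iterative FGSM: repeat small signed steps, clipping to an epsilon-box around x."""
    x_adv = x.copy()
    for _ in range(num_steps):
        x_adv = x_adv + step * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
    return x_adv

# Toy loss L(x) = 0.5 * ||x||^2 (an assumption); its gradient with respect to x is x.
toy_grad = lambda x: x
x0 = np.array([1.0, -2.0, 0.5])
x_fgsm = fgsm(x0, toy_grad, epsilon=0.1)
x_ifgsm = i_fgsm(x0, toy_grad, epsilon=0.1, step=0.05, num_steps=4)
```

In both functions the perturbation follows the sign of the gradient, so the loss grows; the iterative variant simply trades one large step for several small clipped ones.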

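Because security is critical in deep neural networks and intelligent systems, the robustness of neural networks to adversarial attacks is key to answering whether deep learning can be widely deployed in the physical world. Studies show that in deep neural networks processing 2D images, subtle perturbations in high-dimensional space accumulate layer by layer and cause a large deviation in the final output, which makes such networks extremely fragile against adversarial samples (Goodfellow et al., 2015). Deep neural networks are now widely used in 3D modeling and geometry processing. Discrete point cloud data in particular are widely used because they can represent complex 3D shapes, are obtained directly and conveniently by scanning, do not require maintaining large amounts of topological information during geometry processing, and benefit from mature point-based rendering techniques (Miao and Xiao, 2014). However, there has been little research on adversarial attacks against deep neural networks for 3D point cloud data.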

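With the application and development of deep learning in point cloud processing, researchers have proposed a number of neural networks for point cloud data, such as PointNet (Qi et al., 2017a), DGCNN (dynamic graph convolutional neural network) (Wang et al., 2019), PointCNN (Li et al., 2018), and VV-Net (voxel VAE net) (Meng et al., 2019), which make it possible to feed point clouds directly into a neural network for shape modeling and processing. Zhang et al. (2020) used an extended pointwise convolution network to further improve point cloud classification and segmentation; Du and Cai (2021) proposed a point cloud semantic segmentation method based on multi-feature fusion and residual optimization, which reduces the boundary artifacts that point cloud networks produce in large-scene 3D semantic segmentation. To handle the rotation invariance and unorderedness of point cloud data, PointNet introduces an input transform and a feature transform, effectively extracts point cloud features, and provides an end-to-end framework for multiple tasks, including point cloud classification and segmentation. PointNet (Qi et al., 2017a) has become the backbone of many subsequent point cloud networks. However, its architecture uses only multilayer perceptrons and a max pooling layer, so it cannot capture the local structure of the input point cloud, which limits both its handling of fine detail and its generalization. PointNet++ (Qi et al., 2017b) improves on PointNet in two respects to extract local features robustly. First, it uses a spatial distance metric and applies PointNet iteratively to local regions of the point cloud, so that it learns features at progressively larger local scales. Second, to account for the nonuniform distribution of points, PointNet++ uses an adaptive method to extract the features of the point cloud effectively.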

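Notably, PointNet (Qi et al., 2017a) uses the max pooling layer to produce the critical point set of a point cloud. Specifically, this point set produces a 1024-dimensional vector that preserves the features of the whole model, and both the classification and segmentation results of PointNet depend on this vector. Because this set is only a subset of the point cloud, the network remains robust to missing data and perturbation noise: with 50% of the data missing and 20% outlier noise, PointNet still achieves a classification accuracy above 80%. However, Qi et al. (2017a) only briefly tested the robustness of PointNet to missing data and perturbation noise and did not explore it further. This paper therefore studies and analyzes how point cloud deep neural networks behave under adversarial attack. Fig. 1 visualizes the critical point subset; the critical points lie mainly on the contour surface of the object. From this visualization and the properties of max pooling, sampling points in the interior of the point cloud are filtered out by the max pooling layer and thus do not affect the attack result. Based on these observations, this paper proposes a point cloud replacement adversarial attack based on saliency maps. The method first iteratively computes and builds the saliency map of the point cloud, then uses the Chamfer distance (Fan et al., 2017) to measure the difference between point clouds and replaces the critical point set with that of the model in the point cloud library that has the smallest Chamfer distance, thereby minimizing the perturbation. The resulting adversarial samples are hard for the human eye to notice but strongly affect the classification result. Comparative experiments on the ModelNet40 dataset (Wu et al., 2015) with the classification networks PointNet and PointNet++ show that the proposed attack considers both the efficiency of the attack and the imperceptibility of the adversarial samples. In addition, the method can attack not only PointNet (Qi et al., 2017a) and PointNet++ (Qi et al., 2017b) but also other mainstream point cloud classification networks, such as DGCNN (Wang et al., 2019) and PointCNN (Li et al., 2018).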

Fig. 1 Diagram of critical points of point cloud

((a) original point cloud; (b) critical points of point cloud)

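1 Problem description

1.1 Characteristics of attacks on point cloud models

A point cloud model is 3D data sampled directly from an object's surface by a scanning device or depth camera, and it usually carries a large amount of coordinate and attribute information (such as color). A point cloud with $N$ sampling points can be written as $\mathit{\boldsymbol{X}} = \left\{ {{x_1}, {x_2}, \cdots, {x_N}} \right\} \in {{\bf{R}}^{N \times 3}}$, where each row records the 3D coordinates $\left({x, y, z} \right)$ of one sampling point; color information can be included when needed, for example $\left({x, y, z, r, g, b} \right)$. Unlike 2D images, whose pixels are regularly distributed, point cloud data are irregular and unordered, so existing attack methods are difficult to apply directly to 3D point clouds. In addition, point cloud data are less constrained than 2D image data: an attack on an image can only change some pixel values within the image, whereas in a point cloud sampling points can be added at arbitrary positions or moved, which greatly enlarges the search space of adversarial samples. In principle, unless the search space is constrained, no suitable method can be chosen to generate adversarial samples. Attacks on point cloud deep neural networks mainly comprise targeted attacks and non-targeted attacks.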

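1.2 Targeted attacks

Targeted attacks are usually applied to multi-class classification problems and aim to mislead the deep neural network into a specific class. The attacker crafts adversarial samples to fool the network so that the attacked point cloud is classified as the designated class. For example, Xiang et al. (2019) added point clusters around a bottle represented as a point cloud, causing PointNet to misclassify the bottle as a table or another object. Formally, a point cloud classification network $F:\mathit{\boldsymbol{X}} \to \mathit{\boldsymbol{Y}}$ maps $x \in \mathit{\boldsymbol{X}}$ to its label $y \in \mathit{\boldsymbol{Y}}$. A targeted attack first fixes the target class $t \in \mathit{\boldsymbol{Y}}$ and then crafts an adversarial sample ${x^{{\rm{adv}}}}$ such that the network is misled into outputting $t$, satisfying

$ \min D\left({x, {x^{{\rm{adv}}}}} \right)\;\;{\rm{s}}.\;{\rm{t}}.\;\;\mathit{F}\left({{x^{{\rm{adv}}}}} \right) = t $

(1)

where $D\left({x, {x^{{\rm{adv}}}}} \right)$ is the distance between $x$ and ${{x^{{\rm{adv}}}}}$; this paper uses the Chamfer distance.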

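1.3 Non-targeted attacks

A non-targeted attack can be regarded as a special case of a targeted attack. It differs in that it does not specify a particular output class: any predicted class is acceptable as long as it is not the correct one. For example, Liu et al. (2019) applied FGSM to PointNet to carry out a non-targeted attack that made PointNet output a wrong class. Non-targeted attacks are usually easier to achieve than targeted attacks because they have more choices of parameters and more feature space available for redirecting the output. In a non-targeted attack, adversarial samples are usually generated in one of two ways: by running several targeted attacks simultaneously and choosing the sample with the smallest perturbation, or by seeking the sample that minimizes the probability of the correct class.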

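2 Proposed method

2.1 Construction of the point cloud saliency map

As noted in the introduction, sampling points at the center of a point cloud are filtered out by the max pooling layer of networks such as PointNet and have little influence on the final classification result. Deleting a sampling point from the point cloud and moving it to the model center therefore give similar classification results. Specifically, a point cloud $\mathit{\boldsymbol{X}}$ can be divided into two parts, $\mathit{\boldsymbol{X}}'$ and $\mathit{\boldsymbol{C}}$, where $\mathit{\boldsymbol{C}}$ is the set of sampling points at the centroid of the point cloud and $\mathit{\boldsymbol{X}}'$ is the set of the remaining discrete sampling points distributed on the surface of the 3D object; for an original input point cloud, the centroid set is usually empty. The max pooling layer in PointNet is then equivalent to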

$ MAX(h(\mathit{\boldsymbol{X}})) = {\rm{max}}(MAX(h(\mathit{\boldsymbol{X}}\prime)), MAX(h(\mathit{\boldsymbol{C}}))) $

(2)

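where ${\rm{max}}(a, b)$ returns the larger of $a$ and $b$, $h$ is the feature extraction layer in PointNet, and $MAX$ is the max pooling layer. Because the points in $\mathit{\boldsymbol{C}}$ are filtered out by max pooling, Eq. (2) reduces to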

$ MAX(h(\mathit{\boldsymbol{X}})) = MAX(h(\mathit{\boldsymbol{X}}\prime)) $

(3)
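Eqs. (2) and (3) can be checked numerically on synthetic per-point features. The sketch below is only an illustration: the feature arrays stand in for $h(\mathit{\boldsymbol{X}}')$ and $h(\mathit{\boldsymbol{C}})$, the function names are hypothetical, and the assumption that the features of the centroid points are dominated in every channel is built into how the random data are drawn.

```python
import numpy as np

def global_max_pool(features):
    """Channel-wise max over the point axis; features has shape (num_points, num_channels)."""
    return features.max(axis=0)

def critical_point_indices(features):
    """Indices of the points that attain the maximum in at least one channel."""
    return np.unique(features.argmax(axis=0))

rng = np.random.default_rng(0)
feat_surface = rng.uniform(0.0, 1.0, size=(8, 5))    # stands in for h(X')
feat_center = rng.uniform(-2.0, -1.0, size=(3, 5))   # stands in for h(C), assumed dominated
feat_all = np.vstack([feat_surface, feat_center])    # stands in for h(X)

pooled_all = global_max_pool(feat_all)
# Eq. (2): pooling the union equals the element-wise max of the two pooled parts.
pooled_split = np.maximum(global_max_pool(feat_surface), global_max_pool(feat_center))
```

Here `pooled_all` equals `pooled_split` for any data (Eq. (2)), and it equals the pooled surface features alone only because the centroid features never win a channel (Eq. (3)); the critical point indices then all fall among the first 8 rows.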

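To build the saliency map of a point cloud, the contribution of each input sampling point to the classification result of the deep neural network must be computed. Zheng et al. (2019) first compute the loss of a network such as PointNet with the sampling point present in the point cloud, then compute the loss with the sampling point absent, and take the difference as the contribution of that point to the classification result, that is, the saliency value of the point. To make the saliency measure view-invariant, the position of each sampling point is first expressed in a spherical coordinate system whose origin is the model center, as $\left({r, \mathit{\Psi }, \varphi } \right)$, where $r$ is the distance from the sampling point to the model center and $\mathit{\Psi }$ and $\varphi $ are its longitude and latitude in the spherical coordinate system. The method takes the median of each of the three coordinates over all sampling points as the coordinates of the point cloud center, denoted ${x_c}$: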

$ {x_{ck}} = md(\{ {x_{ik}}\mid {x_i} \in \mathit{\boldsymbol{X}}\}), k = 1, 2, 3 $

(4)

where $\left({{x_{i1}}, {x_{i2}}, {x_{i3}}} \right)$ are the coordinates of the sampling point ${x_i}$ in the orthogonal coordinate system and $md\left({\; \cdot \;} \right)$ is the median operation.

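In this spherical coordinate system, moving a single sampling point toward the point cloud center by a distance $\delta $ increases the loss by $ - \frac{{\partial L}}{{\partial {r_i}}}\delta $, where $\frac{{\partial L}}{{\partial {r_i}}}$ is computed as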

$ \frac{{\partial L}}{{\partial {r_i}}} = \sum\limits_{k = 1}^3 {\frac{{\partial L}}{{\partial {x_{ik}}}}} \frac{{{x_{ik}} - {x_{ck}}}}{{{r_i}}} $

(5)

where ${r_i} = \sqrt {\sum\limits_{k = 1}^3 {{{\left({{x_{ik}} - {x_{ck}}} \right)}^2}} } $. The saliency value of the sampling point $x_i$ is therefore computed as

$ {s_i} = - \frac{{\partial L}}{{\partial {r_i}}}{r_i} $

(6)
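Eqs. (4) to (6) translate into a few array operations once the gradient of the loss with respect to the point coordinates is available. The sketch below is a minimal illustration under stated assumptions: the function name, the `eps` guard against division by zero, and the toy gradient are hypothetical, and the sign follows step 5) of Algorithm 1, $s_i = -r_i \partial L / \partial r_i$, so that a point whose move toward the center increases the loss gets a positive score.

```python
import numpy as np

def saliency_scores(points, grad, eps=1e-12):
    """points, grad: arrays of shape (N, 3); grad holds dL/dx for every point.

    Returns one saliency value per point, s_i = -(dL/dr_i) * r_i.
    """
    center = np.median(points, axis=0)                 # Eq. (4): coordinate-wise median
    offset = points - center                           # x_i - x_c
    r = np.linalg.norm(offset, axis=1)                 # r_i
    dL_dr = np.sum(grad * offset, axis=1) / np.maximum(r, eps)   # Eq. (5)
    return -dL_dr * r                                  # saliency with the sign of step 5)

# Toy data (assumption): a loss whose gradient is -x, so it rises as points move inward.
pts = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0],
                [-1.0, 0.0, 0.0],
                [0.0, 2.0, 0.0],
                [0.0, -2.0, 0.0]])
scores = saliency_scores(pts, -pts)
```

For this toy gradient the scores are the squared distances to the center, so the two points at distance 2 come out as the most salient.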

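2.2 Point cloud adversarial attack based on Chamfer distance

This paper mainly performs non-targeted attacks on point clouds, with the goal of making networks such as PointNet produce a wrong output when classifying a point cloud. Note that, because the output of a point cloud classifier depends on many factors, replacing the critical points of a point cloud does not yield a point cloud whose classification result can be controlled. In general, an adversarial attack can be defined as a constrained optimization problem. Let $f$ be the target classification network that takes a point cloud as input, and let the point cloud be $\mathit{\boldsymbol{X}} = \left\{ {{x_i}\mid i = 1, 2, \cdots, N} \right\}$. Suppose the network correctly classifies the original sample $\mathit{\boldsymbol{X}}$ as class $t$, with output probability ${f_t}(\mathit{\boldsymbol{X}})$. Define the perturbation $\mathit{\boldsymbol{e}}(x) = \left({{e_1}, {e_2}, \cdots, {e_n}} \right)$ and a class $t' \ne t$. The problem is then defined as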

$ \begin{array}{l} \mathop {{\mathop{\rm argmax}\nolimits} }\limits_{e(x)} {f_{t'}}(\mathit{\boldsymbol{X}} + \mathit{\boldsymbol{e}}(x)){\rm{ }}\\ {\rm{s}}{\rm{. t}}{\rm{. }}\quad N(\mathit{\boldsymbol{e}}(x)) \le n \end{array} $

(7)

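where $N(\mathit{\boldsymbol{e}}(x))$ is the number of elements of $\mathit{\boldsymbol{e}}(x)$ and $n$ is the number of perturbed sampling points. The proposed method addresses two key questions: 1) which points to perturb; 2) how to perturb them.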

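The output of a point cloud deep neural network such as PointNet depends mainly on the feature information contributed by the critical points. Denoting the critical point set by $\mathit{\boldsymbol{U}} = \left\{ {{\mathit{\boldsymbol{u}}_i}\mid i = 1, 2, \cdots, k} \right\}$, the point cloud $\mathit{\boldsymbol{X}} = \left\{ {{x_i}\mid i = 1, 2, \cdots, N} \right\}$ can be written as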

$ \begin{array}{l} \;\;\;\;\mathit{\boldsymbol{X}} = \left\{ {{x_i}} \right\} = \{ \mathit{\boldsymbol{Z}}, \mathit{\boldsymbol{U}}\} \\ m + k = N, i = 1, 2, \cdots, N \end{array} $

(8)

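where $\{ \mathit{\boldsymbol{Z}}\} $ is the set difference of $\{ \mathit{\boldsymbol{X}}\} $ and $\{ \mathit{\boldsymbol{U}}\} $, and $m$ is the number of points in $\mathit{\boldsymbol{Z}}$. For question 1), with the number of perturbed sampling points limited to at most $n$, sampling points with high saliency values provide more feature information for the classification result of networks such as PointNet. It therefore suffices to select these high-saliency points, that is, the points ${u_i} \in \mathit{\boldsymbol{U}}$ in the critical set, to maximize the effect of the perturbation.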

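For point clouds, the Chamfer distance (Fan et al., 2017) reflects the difference between two point cloud models: the smaller the difference, the smaller the Chamfer distance. For two point clouds ${\mathit{\boldsymbol{X}}_1}$ and ${\mathit{\boldsymbol{X}}_2}$, the Chamfer distance is defined as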

$ \begin{array}{l} d\left({{\mathit{\boldsymbol{X}}_1}, {\mathit{\boldsymbol{X}}_2}} \right) = \sum\limits_{x \in {\mathit{\boldsymbol{X}}_1}} {\mathop {\min }\limits_{y \in {\mathit{\boldsymbol{X}}_2}} } \left\| {x - y} \right\|_2^2 + \\ \;\;\;\;\;\;\;\;\;\;\;\;\sum\limits_{y \in {\mathit{\boldsymbol{X}}_2}} {\mathop {\min }\limits_{x \in {\mathit{\boldsymbol{X}}_1}} } \left\| {y - x} \right\|_2^2 \end{array} $

(9)
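Eq. (9) can be evaluated directly from the matrix of pairwise squared distances. The following is a minimal brute-force sketch; the function name is an assumption, and the full $N_1 \times N_2$ distance matrix is only practical for clouds of moderate size.

```python
import numpy as np

def chamfer_distance(cloud_a, cloud_b):
    """Chamfer distance of Eq. (9) between clouds of shape (Na, 3) and (Nb, 3)."""
    diff = cloud_a[:, None, :] - cloud_b[None, :, :]   # (Na, Nb, 3) pairwise differences
    sq_dist = np.sum(diff ** 2, axis=2)                # (Na, Nb) squared Euclidean distances
    # First term: each point of A to its nearest point in B; second term: the reverse.
    return sq_dist.min(axis=1).sum() + sq_dist.min(axis=0).sum()

a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
d_ab = chamfer_distance(a, b)
```

In this example every point of `a` coincides with a point of `b`, so the first term is 0, while the extra point at x = 3 is at squared distance 4 from its nearest neighbor in `a`, giving a total of 4.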

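To generate the adversarial sample ${\mathit{\boldsymbol{X}}^{{\rm{adv}}}}$ for a point cloud $\mathit{\boldsymbol{X}} = \left\{ {{x_i}} \right\}$, the proposed method searches the whole sample space for the sample ${\mathit{\boldsymbol{X}}_k}$ with the smallest Chamfer distance to the currently selected sample ${\mathit{\boldsymbol{X}}_i}$, builds the saliency maps as described in Section 2.1, and replaces the critical points of ${\mathit{\boldsymbol{X}}_i}$ with those of ${\mathit{\boldsymbol{X}}_k}$, which completes the attack. Measuring point cloud differences with the Chamfer distance and replacing critical point sets amounts to a heuristic search of the perturbation space, which has the following advantages: 1) in theory a solution cannot be searched for directly in unconstrained 3D space; the proposed method changes the perturbation search space from the whole 3D space to a finite sample space, so Eq. (7) can be solved by enumeration at small computational cost; 2) because the method replaces critical points between point clouds, the perturbation is naturally bounded, that is, ${e_i} \le \delta $ for every ${e_i} \in \mathit{\boldsymbol{e}}\left(x \right)$, where $\delta $ is an unknown quantity that varies with the computation. Because the method selects the two point clouds with the smallest Chamfer distance, it minimizes the perturbation to some extent, so the perturbed point cloud has good imperceptibility.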
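The search over the finite sample space can be sketched as a linear scan of a point cloud library. This is an illustrative sketch, not the authors' code: `nearest_sample_index` and the list-of-arrays `library` are hypothetical, and the only exclusion applied is the query itself, since the text does not say whether other samples (for example those of the same class) are excluded.

```python
import numpy as np

def chamfer_distance(cloud_a, cloud_b):
    """Chamfer distance of Eq. (9), computed by brute force."""
    sq_dist = np.sum((cloud_a[:, None, :] - cloud_b[None, :, :]) ** 2, axis=2)
    return sq_dist.min(axis=1).sum() + sq_dist.min(axis=0).sum()

def nearest_sample_index(library, query_index):
    """Index of the library cloud closest to library[query_index], excluding itself."""
    query = library[query_index]
    best_index, best_dist = None, np.inf
    for j, cloud in enumerate(library):
        if j == query_index:
            continue  # a cloud is never its own replacement source
        dist = chamfer_distance(query, cloud)
        if dist < best_dist:
            best_index, best_dist = j, dist
    return best_index

rng = np.random.default_rng(0)
base = rng.uniform(0.0, 1.0, size=(32, 3))
library = [base, base + 0.01, base + 5.0]   # toy library: a near copy and a far copy
```

The scan costs one Chamfer evaluation per library sample, which is what makes the enumeration of Eq. (7) over a finite library affordable compared with a search in unconstrained 3D space.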

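2.3 Point cloud replacement adversarial attack

Combining the point cloud saliency map with the Chamfer distance, this paper proposes an effective point cloud replacement adversarial attack. When building the saliency map, the method removes sampling points iteratively and in small batches, so as to preserve the intrinsic dependencies among the sampling points as far as possible.

Specifically, in each iteration of building the saliency map, $n$/$T$ sampling points are moved to the point cloud center, where $n$ is the total number of perturbed sampling points and $T$ is the number of iterations; the gradient of the network loss is computed and, with Eq. (6), gives the saliency value $s_i$ of each sampling point. In the critical point replacement step, the method first finds the sample ${\mathit{\boldsymbol{X}}_k}$ with the smallest Chamfer distance to the sample ${\mathit{\boldsymbol{X}}_i}$ and then replaces the $n$ sampling points of ${\mathit{\boldsymbol{X}}_i}$ with the highest saliency values by the $n$ sampling points of ${\mathit{\boldsymbol{X}}_k}$ with the highest saliency values.

Because sampling points with high saliency values provide more feature information for the classification result of networks such as PointNet, replacing the sampling points with the highest saliency values effectively maximizes the attack. The method is summarized in the following algorithm: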

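Algorithm 1 Point cloud replacement adversarial attack

Require: loss function $L(\mathit{\boldsymbol{X}}, y;\theta)$; model weights $\theta $; number of perturbed points $n$; number of iterations $T$.

Input: point cloud $\mathit{\boldsymbol{X}}$, label $y$.

Output: adversarial sample ${\mathit{\boldsymbol{X}}^{{\rm{adv}}}}$.

1) Compute the Chamfer distance between the current point cloud $\mathit{\boldsymbol{X}}$ and every other point cloud, and find the point cloud $\mathit{\boldsymbol{X}}'$ with the smallest Chamfer distance;

2) Compute the loss gradient $g_i^t = {\nabla _{x_i^t}}L\left({{\mathit{\boldsymbol{X}}^t}, y;\theta } \right)$;

3) Compute the point cloud center

$ x_c^t = \left({x_{c1}^t, x_{c2}^t, x_{c3}^t} \right) = \mathit{md}\left({x_{i1}^t, x_{i2}^t, x_{i3}^t} \right); $

4) Compute ${r_i}\frac{{\partial L}}{{\partial {r_i}}} = \left({x_i^t - x_c^t} \right) \cdot g_i^t$;

5) Build the point cloud saliency map with ${s_i} = - {r_i}\frac{{\partial L}}{{\partial {r_i}}}$;

6) Move the $n/T$ sampling points with the largest $s_i$ to the model center;

7) Repeat steps 2) to 6) $T$ times in total;

8) Replace the $n$ sampling points of $\mathit{\boldsymbol{X}}$ with the largest $s_i$ by the $n$ sampling points of $\mathit{\boldsymbol{X}}'$ with the largest $s_i$ to obtain ${\mathit{\boldsymbol{X}}^{{\rm{adv}}}}$.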
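Steps 2) to 8) of Algorithm 1 can be sketched as follows. This is a hedged illustration rather than the authors' implementation: `grad_fn_x` and `grad_fn_near` are hypothetical caller-supplied functions returning the loss gradient with respect to the points (in practice they would come from automatic differentiation of the victim network), the nearest cloud `x_near` from step 1) is assumed to be given, $n$ is assumed to be divisible by $T$, and already selected points are masked so that exactly $n$ distinct points are chosen.

```python
import numpy as np

def saliency(points, grad):
    """Steps 3) to 5): s_i = -(x_i - x_c) . g_i with the coordinate-wise median as x_c."""
    center = np.median(points, axis=0)
    return -np.sum((points - center) * grad, axis=1), center

def top_salient_indices(points, grad_fn, n, T):
    """Steps 2) to 7): pick n points in T rounds, moving each picked batch to the center."""
    work = points.copy()
    available = np.ones(len(points), dtype=bool)
    chosen = []
    for _ in range(T):
        s, center = saliency(work, grad_fn(work))
        s[~available] = -np.inf                     # never pick the same point twice
        picked = np.argsort(s)[::-1][: n // T]      # the n/T most salient points this round
        work[picked] = center                       # step 6): move them to the center
        available[picked] = False
        chosen.extend(picked.tolist())
    return np.array(chosen)

def replacement_attack(x, x_near, grad_fn_x, grad_fn_near, n, T):
    """Step 8): overwrite the n most salient points of x with those of x_near."""
    idx_x = top_salient_indices(x, grad_fn_x, n, T)
    idx_near = top_salient_indices(x_near, grad_fn_near, n, T)
    x_adv = x.copy()
    x_adv[idx_x] = x_near[idx_near]
    return x_adv
```

A call such as `replacement_attack(x, x_near, g, g, n=100, T=20)` would change exactly 100 rows of `x` and leave the remaining points untouched; how the replaced points are paired between the two clouds is not specified in the text, so pairing them by saliency rank is a choice made for this sketch.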

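3 Experimental results and discussion

3.1 Dataset

The ModelNet40 dataset (Wu et al., 2015) is used for the experiments. It contains point cloud data of 40 categories (such as airplanes and cars) and 12311 point cloud samples, of which 9843 form the training set and 2468 form the test set.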

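3.2 Experimental setup

The proposed attack is implemented on a PC with a 3.60 GHz Intel® Core™ i9-9900KF CPU, 32 GB of memory, and a GeForce RTX 3080 graphics card with 10 GB of video memory. The operating system is Ubuntu 20.04, the programming language is Python 3.7, and the framework is Jittor 1.2.2.62. The mainstream point cloud classification networks PointNet and PointNet++ are used for testing. PointNet is trained with the SGD (stochastic gradient descent) optimizer, an initial learning rate of 0.02, a weight decay factor of 0.6, a batch size of 8, and 300 training iterations. PointNet++ is trained with the same settings, and point normals are used in addition to the point coordinates.

When attacking defense algorithms (that is, before an adversarial sample is fed to the classification network, random point removal or outlier removal is applied to it to obtain a "denoised" point cloud, which is then fed to the network for classification), the number of perturbed points is limited to 200. For random point removal, the defense is evaluated when 100, 200, 300, 400, and 500 sampling points are removed. The metric is the attack success rate: the higher the attack success rate, the weaker the defense. For outlier removal, the algorithm computes the distances ${d_{il}}\left({l = 1, 2, \cdots, k} \right)$ from the sampling point $x_i$ to its $k$ nearest sampling points and their mean ${\bar d_{i}}$. For a point cloud $\mathit{\boldsymbol{X}}$ with $N$ sampling points, this is done for every sampling point, giving ${h_i}\left({i = 1, 2, \cdots, N} \right)$; the mean of the $h_i$ is computed, and their standard deviation $H$ about this mean is used as the threshold. If the mean distance ${\bar d_{i}}$ from $x_i$ to its $k$ nearest sampling points is greater than $\alpha H$, $x_i$ is treated as an outlier and removed, where $\alpha $ is a constant parameter; experiments are run for $\alpha $ = 2.0, 1.5, 1.0, 0.5, and 0.001.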
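The two defenses can be sketched as simple preprocessing filters. This is an assumption-laden illustration, not the code used in the experiments: the function names are hypothetical, the neighbor search is brute force, and the outlier rule adds the mean of the ${\bar d_{i}}$ to $\alpha H$, as in the usual statistical outlier removal filter, whereas the text states the threshold simply as $\alpha H$.

```python
import numpy as np

def random_point_drop(points, num_drop, rng):
    """Remove num_drop points chosen uniformly at random, keeping the original order."""
    keep = np.sort(rng.permutation(len(points))[num_drop:])
    return points[keep]

def statistical_outlier_removal(points, k, alpha):
    """Drop points whose mean distance to their k nearest neighbors is unusually large.

    Assumed rule: remove x_i when mean_knn_i > mean(mean_knn) + alpha * std(mean_knn).
    """
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    knn = np.sort(dist, axis=1)[:, 1:k + 1]     # column 0 is the zero self-distance
    mean_knn = knn.mean(axis=1)                 # \bar d_i for every point
    threshold = mean_knn.mean() + alpha * mean_knn.std()
    return points[mean_knn <= threshold]

# Toy cloud (assumption): a 3 x 3 x 3 unit grid plus one far-away outlier.
grid = np.array([[i, j, k] for i in range(3) for j in range(3) for k in range(3)], dtype=float)
cloud = np.vstack([grid, [[50.0, 50.0, 50.0]]])
cleaned = statistical_outlier_removal(cloud, k=3, alpha=1.0)
dropped = random_point_drop(cloud, 8, np.random.default_rng(0))
```

On the toy cloud the filter removes only the isolated point, which is the behavior the replacement attack tries to avoid triggering by keeping its substituted points close to the original surface.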

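3.3 Analysis and discussion of experimental results

3.3.1 Effect of the number of perturbed sampling points on network accuracy

In theory, the more sampling points are perturbed in a point cloud adversarial attack, the lower the classification accuracy of networks such as PointNet. Fig. 2 shows the effect of the number of perturbed sampling points on classification accuracy, where noise denotes the algorithm that replaces critical points with noise and drop denotes the algorithm that deletes critical points directly. On ModelNet40, when 25, 50, 75, and 100 sampling points are perturbed, the classification accuracy of PointNet drops from 89.2% to 15.3%, 11.1%, 8.1%, and 6.2%, respectively, and that of PointNet++ drops from 91.9% to 16.9%, 15.0%, 13.8%, and 12.8%, respectively. In contrast, for the two baseline algorithms, the method of Zheng et al. (2019) attacking 100 sampling points lowers the classification accuracy of PointNet only to 66% to 67% and that of PointNet++ to 20% to 21%.

Fig. 2 Influence of number of perturbation points on network accuracy

((a) accuracy of PointNet; (b) accuracy of PointNet++)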

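3.3.2 Comparison of attack success rates of different methods

The attack success rate is generally defined as the percentage of point cloud samples that were originally classified correctly and are misclassified into another class after the attack. The proposed method is compared with other typical methods in Table 1 and Table 2, with the number of perturbed sampling points limited to 200. On PointNet, the attack success rate of the proposed method is 7.3 percentage points higher than that of I-FGSM (Kurakin et al., 2017), which ranks second, 38.6 percentage points higher than that of FGSM (Goodfellow et al., 2015), and 41 percentage points higher than that of JSMA (Jacobian-based saliency map attack) (Papernot et al., 2016). On PointNet++, it is 58.6 percentage points higher than that of FGSM and 85.3 percentage points higher than that of JSMA, and only 1.3 percentage points lower than that of I-FGSM. However, the proposed method is closer to a gray-box attack: it uses the network gradient only to compute the saliency of sampling points and does not rely on the gradient to adjust the direction and magnitude of the perturbation during the attack. It therefore needs fewer iterations, usually one fifth of the number of perturbed sampling points; for example, perturbing 100 sampling points takes only 20 iterations, whereas I-FGSM generally needs 100.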

Table 1 Comparison of different attack algorithms on PointNet

Method	Attack success rate/%
FGSM (Goodfellow et al., 2015)	58.8
I-FGSM (Kurakin et al., 2017)	90.1
JSMA (Papernot et al., 2016)	56.4
Ours	97.4
Note: the best result is that of the proposed method (97.4%).

Table 2 Comparison of different attack algorithms on PointNet++

Method	Attack success rate/%
FGSM (Goodfellow et al., 2015)	36.5
I-FGSM (Kurakin et al., 2017)	96.4
JSMA (Papernot et al., 2016)	9.8
Ours	95.1
Note: the best result is that of I-FGSM (96.4%).
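The attack success rate defined at the start of this subsection can be computed from three label arrays. The sketch below is illustrative; the function and argument names are assumptions, and the toy labels are made up purely to exercise the definition.

```python
import numpy as np

def attack_success_rate(labels, pred_clean, pred_adv):
    """Percentage of originally correct samples that are misclassified after the attack."""
    labels, pred_clean, pred_adv = map(np.asarray, (labels, pred_clean, pred_adv))
    originally_correct = pred_clean == labels
    if not originally_correct.any():
        raise ValueError("no sample was classified correctly before the attack")
    fooled = originally_correct & (pred_adv != labels)
    return 100.0 * fooled.sum() / originally_correct.sum()

# Toy labels (assumption): four of five samples are correct before the attack,
# and three of those four are misclassified afterwards.
rate = attack_success_rate(labels=[0, 1, 2, 3, 4],
                           pred_clean=[0, 1, 2, 3, 0],
                           pred_adv=[1, 1, 0, 2, 0])
```

Samples that the network already misclassified before the attack are excluded from the denominator, which is why the rate here is 75% rather than 60%.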

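3.3.3 Attack performance against defense methods

The attack success rate is used to measure the attack: the higher it is, the more effective the attack. Fig. 3 and Fig. 4 show the performance of the proposed method against defenses. The experiments show that random point removal provides no defense against the proposed attack: its success rate stays above 97% on PointNet and above 93% on PointNet++. For the two baseline algorithms used for comparison (directly deleting high-saliency sampling points and replacing high-saliency sampling points with noise), random point removal changes the attack success rate by less than 1% and is likewise considered ineffective. For outlier removal, as the standard deviation coefficient $\alpha $ decreases, the defense becomes clearly effective against directly deleting high-saliency sampling points. On PointNet, it lowers the success rate of the other point perturbation attacks from 86.9% to 55.7%; similarly, on PointNet++, it lowers their success rate from 93.6% to 81.9%. For the proposed attack, which combines the saliency map with the Chamfer distance, the success rate remains above 97% on PointNet and above 94% on PointNet++, so this defense is essentially ineffective against the proposed method.

Fig. 3 Defense effect of outlier remove algorithm

((a) success rate of PointNet; (b) success rate of PointNet++)

Fig. 4 Defense effect of random drop algorithm

((a) success rate of PointNet; (b) success rate of PointNet++)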

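3.3.4 Effect of the number of iterations on the attack

When building the saliency map, high-saliency sampling points are removed iteratively so as to preserve the intrinsic dependencies among sampling points as far as possible. This section analyzes the effect of the number of iterations on the attack. The accuracy of the point cloud classification model is computed for 1, 2, 4, 5, 10, and 20 iterations, with the number of perturbed sampling points set to 100. As shown in Fig. 5, the classification accuracy is 6.2% with 20 iterations and 27.5% with 1 iteration, that is, the attack is weaker with fewer iterations.

Fig. 5 Impact of the number of iterations on the attack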

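3.3.5 Perceptibility of the perturbation

The perceptibility of the perturbation is reflected in the difference between the original point cloud and the adversarial point cloud: the larger the difference between the two, the less imperceptible the attack. The original and adversarial point clouds are visualized in Fig. 6. FGSM (Goodfellow et al., 2015) and I-FGSM (Kurakin et al., 2017) completely destroy the 3D structure of the object and are therefore not imperceptible. JSMA (Papernot et al., 2016) preserves the 3D structure of the point cloud well but generates many outliers, and according to Table 1 and Table 2 it is the least effective attack (56.4% and 9.8% on PointNet and PointNet++, respectively, whereas the proposed method reaches 97.4% and 95.1%). The proposed method produces fewer outliers than JSMA and has a higher attack success rate, so it attacks efficiently while preserving the intrinsic structure of the point cloud.

Fig. 6 Demonstration of original and adversarial samples

((a) original point cloud; (b) adversarial examples generated by FGSM; (c) adversarial examples generated by I-FGSM; (d) adversarial examples generated by JSMA; (e) adversarial examples generated by ours)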

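4 Conclusion

Because the critical point set of a discrete point cloud provides more feature information for the classification result of deep neural networks such as PointNet, this paper proposes a point cloud replacement adversarial attack based on saliency maps. The method first builds the point cloud saliency map to find the sampling points with the highest saliency values and thus maximize the attack. It then uses the Chamfer distance to measure the difference between point clouds, finds the point cloud most similar to the selected one, and replaces critical points between the two, which further strengthens the attack while keeping the perturbation hard to perceive. Experiments on representative point cloud networks (PointNet and PointNet++) show that the proposed method outperforms traditional adversarial attack algorithms on PointNet and also performs well on PointNet++. The proposed method can also efficiently attack two defense algorithms, outlier removal and random point removal.

However, when generating an adversarial sample the proposed method depends on other samples in the dataset in addition to the original sample. To reduce this dependence, future work may consider using evolutionary algorithms to generate point cloud adversarial samples.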

References

Akhtar N, Mian A. 2018. Threat of adversarial attacks on deep learning in computer vision: a survey. IEEE Access, 6: 14410-14430 [DOI:10.1109/ACCESS.2018.2807385]

Arnab A, Miksik O and Torr P H S. 2018. On the robustness of semantic segmentation models to adversarial attacks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 888-897[DOI: 10.1109/CVPR.2018.00099]

Du J, Cai G R. 2021. Point cloud semantic segmentation method based on multi-feature fusion and residual optimization. Journal of Image and Graphics, 26(5): 1105-1116 (杜静, 蔡国榕. 2021. 多特征融合与残差优化的点云语义分割方法. 中国图象图形学报, 26(5): 1105-1116) [DOI:10.11834/jig.200374]

Fan H, Su H and Guibas L J. 2017. A point set generation network for 3d object reconstruction from a single image//Proceedings of 2017 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 605-613[DOI: 10.1109/CVPR.2017.264]

Goodfellow I J, Shlens J and Szegedy C. 2015. Explaining and harnessing adversarial examples[EB/OL]. [2021-06-15]. https://arxiv.org/pdf/1412.6572v3.pdf

Inkawhich N, Wen W, Li H H and Chen Y R. 2019. Feature space perturbations yield more transferable adversarial examples//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 7059-7067[DOI: 10.1109/cvpr.2019.00723]

Kurakin A, Goodfellow I J and Bengio S. 2017. Adversarial examples in the physical world[EB/OL]. [2021-06-15]. https://arxiv.org/pdf/1607.02533.pdf

Li Y Y, Bu R, Sun M C, Wu W, Di X H and Chen B Q. 2018. PointCNN: convolution on X-transformed points//Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montréal, Canada: Curran Associates Inc. : 828-838

Liu D, Yu R and Su H. 2019. Extending adversarial attacks and defenses to deep 3D point cloud classifiers//Proceedings of 2019 IEEE International Conference on Image Processing. Taipei, China: IEEE: 2279-2283[DOI: 10.1109/ICIP.2019.8803770]

Meng H Y, Gao L, Lai Y K and Manocha D. 2019. VV-Net: voxel VAE net with group convolutions for point cloud segmentation//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 8499-8507[DOI: 10.1109/ICCV.2019.00859]

Miao Y W, Xiao C X. 2014. Geometric Processing and Shape Modeling of 3D Point-Sampled Models. Beijing: Science Press (缪永伟, 肖春霞. 2014. 三维点采样模型的几何处理和形状造型. 北京: 科学出版社)

Moosavi-Dezfooli S M, Fawzi A and Frossard P. 2016. DeepFool: a simple and accurate method to fool deep neural networks//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 2574-2582[DOI: 10.1109/CVPR.2016.282]

Papernot N, McDaniel P, Jha S, Fredrikson M, Celik Z B and Swami A. 2016. The limitations of deep learning in adversarial settings//Proceedings of 2016 IEEE European Symposium on Security and Privacy. Saarbrucken, Germany: IEEE: 372-387[DOI: 10.1109/EuroSP.2016.36]

Qi C R, Su H, Mo K and Guibas L J. 2017a. PointNet: deep learning on point sets for 3D classification and segmentation//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 652-660[DOI: 10.1109/CVPR.2017.16]

Qi C R, Li Y, Su H and Guibas L J. 2017b. PointNet++: deep hierarchical feature learning on point sets in a metric space[EB/OL]. [2021-06-15]. https://arxiv.org/pdf/1706.02413.pdf

Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I and Fergus R. 2014. Intriguing properties of neural networks[EB/OL]. [2021-06-15]. https://arxiv.org/pdf/1312.6199.pdf

Wang Y, Sun Y B, Liu Z W, Sarma S E, Bronstein M M, Solomon J M. 2019. Dynamic graph CNN for learning on point clouds. ACM Transactions on Graphics, 38(5): #146 [DOI:10.1145/3326362]

Wu Z R, Song S R, Khosla A, Yu F, Zhang L G, Tang X O and Xiao J X. 2015. 3D ShapeNets: a deep representation for volumetric shapes//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 1912-1920[DOI: 10.1109/CVPR.2015.7298801]

Xiang C, Qi C R and Li B. 2019. Generating 3D adversarial point clouds//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 9128-9136[DOI: 10.1109/CVPR.2019.00935]

Zhang X L, Fu C L, Zhao Y J. 2020. Extended pointwise convolution network model for point cloud classification and segmentation. Journal of Image and Graphics, 25(8): 1551-1557 (张新良, 付陈琳, 赵运基. 2020. 扩展点态卷积网络的点云分类分割模型. 中国图象图形学报, 25(8): 1551-1557) [DOI:10.11834/jig.190508]

Zheng T H, Chen C Y, Yuan J S, Li B and Ren K. 2019. Point cloud saliency maps//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 1598-1606[DOI: 10.1109/ICCV.2019.00168]

Zhou M Y, Wu J, Liu Y P, Liu S C and Zhu C. 2020. DaST: data-free substitute training for adversarial attacks//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 231-240[DOI: 10.1109/CVPR42600.2020.00031]