发布时间: 2020-04-16
DOI: 10.11834/jig.190308
2020 | Volume 25 | Number 4

图像处理和编码
结合矩阵分解与差分隐私的人脸图像发布
张啸剑1, 付聪聪1, 孟小峰2
1. 河南财经政法大学计算机与信息工程学院, 郑州 450046;
2. 中国人民大学信息学院, 北京 100872

摘要

目的 人脸图像蕴含着丰富的个人敏感信息,直接发布可能会造成个人隐私泄露。为了保护人脸图像中的隐私信息,提出3种基于矩阵分解与差分隐私技术相结合的人脸图像发布算法,即LRA(low rank-based private facial image release algorithm)、SRA(SVD-based private facial image release algorithm)和ESRA(enhanced SVD-based private facial image release algorithm)。方法 为了减少拉普拉斯机制带来的噪音误差,3种算法均将人脸图像作为实数域2维矩阵,充分利用矩阵低秩分解与奇异值分解技术压缩图像。在SRA和ESRA算法中,如何选择矩阵压缩参数r会直接制约由拉普拉斯机制引起的噪音误差以及由矩阵压缩导致的重构误差。SRA算法利用启发式方法设置参数r,然而r值增大导致过大的噪音误差,r值减小导致过大的重构误差。为了有效均衡这两种误差,ESRA算法引入一种基于指数机制的挑选参数r的方法,能够在不同的分解矩阵中挑选合理的矩阵尺寸来压缩人脸图像,然后利用拉普拉斯机制对挑选的矩阵添加相应的噪音,进而使整个处理过程满足ε-差分隐私。结果 基于6种真实人脸图像数据集,采用支持向量机(support vector machine,SVM)分类技术与信息熵验证6种算法的正确性。准确率、召回率、F1-Score以及信息熵的度量结果显示,提出的LRA、SRA与ESRA算法均优于LAP(Laplace-based facial image protection)、LRM(low-rank mechanism)以及MM(matrix mechanism)算法,其中ESRA算法在Faces95数据集上的准确率和F1-Score分别是LRA、LRM和MM算法的40倍、20倍和1倍多。相对于其他5种算法,ESRA算法对数据集大小的变化相对稳定,可用性最好。结论 本文算法能够实现满足ε-差分隐私的敏感人脸图像发布,具有较好的可用性与鲁棒性,并且为灰度人脸图像的隐私保护提供了新的指导方法与思路,能有效用于社交平台和医疗系统等领域。

关键词

人脸图像; 隐私保护; 差分隐私; 矩阵分解; 低秩分解; 奇异值分解

Private facial image publication through matrix decomposition
Zhang Xiaojian1, Fu Congcong1, Meng Xiaofeng2
1. School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou 450046, China;
2. School of Information, Renmin University of China, Beijing 100872, China
Supported by: National Natural Science Foundation of China (61502146, 91646203, 91746115, 61772131); Natural Science Foundation of Henan Province, China(162300410006)

Abstract

Objective Facial images are widely used in many applications such as social media, medical systems, and smart transportation systems. Such data, however, are inherently sensitive and private. Individuals' private information may be leaked if their facial images are released directly in the application systems. In social network platforms, attackers can use the facial images of individuals to attack their sensitive information. Many classical privacy-preserving methods, such as k-anonymity and data encryption, have been proposed to handle the privacy problem in facial images. However, the classical methods always rely on strong background assumptions, which cannot be supported in real-world applications. Differential privacy is the state-of-the-art method used to address the privacy concerns in data publication; it provides rigorous guarantees for the privacy of each user by adding randomized noise and has been deployed in Google Chrome, Apple iOS, and macOS. Therefore, to protect the private information in facial images, this paper proposes three efficient algorithms, namely, low rank-based private facial image release algorithm (LRA), singular value decomposition (SVD)-based private facial image release algorithm (SRA), and enhanced SVD-based private facial image release algorithm (ESRA), which are based on matrix decomposition combined with differential privacy. Method The three algorithms employ a real-valued matrix to model facial images, in which each cell corresponds to a pixel of the facial image. Based on the real-valued matrix, the neighborhood of facial images can be defined easily, which is a crucial basis for using the Laplace mechanism to generate Laplace noise. Then, LRA, SRA, and ESRA rely on low-rank decomposition and SVD to compress facial images. This step aims to reduce the Laplace noise and boost the accuracy of the publication of facial images. The three algorithms use the Laplace mechanism to inject noise into each value of the compressed facial image to ensure differential privacy. Finally, the three algorithms use matrix algebraic operations to reconstruct the noisy facial image. However, in the SRA and ESRA algorithms, two sources of error are encountered: 1) Laplace error (LE) due to the injected Laplace noise and 2) reconstruction error (RE) caused by lossy compression. The two errors are controlled by the parameter r, the compression factor in the SRA and ESRA algorithms; setting r constrains both the LE and the RE. The SRA algorithm sets the parameter in a heuristic manner, in which one fixes the value based on experience. However, the choice of r in the SRA algorithm is a problem because a large r leads to excessive LE, while a small r makes the RE extremely large. Furthermore, r cannot be directly set based on the real-valued matrix; otherwise, the choice of r itself violates differential privacy. Based on the preceding observation, the ESRA algorithm is proposed to handle the problem caused by the selection of the parameter r. The main idea of the ESRA algorithm involves two steps: employing the exponential mechanism to sample r elements in the decomposition matrix and injecting the Laplace noise into the elements. According to the sequential composition of differential privacy, the two steps in the ESRA algorithm satisfy ε-differential privacy. 
Result On the basis of SVM classification and the information entropy technique, two groups of experiments were conducted over six real facial image datasets (Yale, ORL, CMU, Yale B, Faces95, and Faces94) to evaluate the quality of the facial images generated by the LRA, SRA, ESRA, LAP (Laplace-based facial image protection), LRM (low-rank mechanism), and MM (matrix mechanism) algorithms using a variety of metrics, including precision, recall, F1-score, and entropy. Our experiments show that the proposed LRA, SRA, and ESRA algorithms outperform LAP, LRM, and MM in terms of the abovementioned metrics. For example, based on the Faces95 dataset, ε=0.1 and matrix=200×180 were set to compare the precision of ESRA, LRM, LRA, and LAP. Results show that the precision of ESRA is 40 and 20 times that of LRA and LRM, respectively. Based on the six datasets, ESRA achieves better accuracy than LRA and SRA. For example, on the Faces94 dataset, matrix=200×180 was set and the privacy budget ε (i.e., 0.1, 0.5, 0.9, and 1.3) was varied to study the utility of each algorithm. Results show that the utility measures of all algorithms increase when ε increases. When ε varies from 0.1 to 1.3, ESRA still achieves better precision, recall, F1-score, and entropy than the other algorithms. Conclusion Theoretical analysis and extensive experiments were conducted to compare our algorithms with the LAP, LRM, and MM algorithms. Results show that the proposed algorithms achieve better utility and outperform the existing solutions. In addition, the proposed algorithms provide new ideas and technical support for further research on facial image release with differential privacy in several application systems.

Key words

facial image; privacy protection; differential privacy; matrix decomposition; low-rank decomposition; singular value decomposition(SVD)

0 引言

信息技术的快速发展使得人脸图像的获取与识别变得尤为容易。社交网络服务平台分析用户上传的手机中的人脸图像时,通过对比多张人脸的相似度,判断是否同一个人;医疗系统通过分析收集的用户人脸图像,能够分析出性别、年龄、表情等多种人脸属性。然而人脸图像中蕴含着个人敏感信息,直接发布或者上传,会造成隐私泄露。如何在保护个人隐私的前提下发布人脸图像数据,是目前亟待解决的问题。近年来研究者提出了多种隐私保护方法来保护人脸图像,通常使用的方法包括k-匿名、访问控制、隐私加密等。Newton等人(2005)和Gross等人(2008)利用匿名化技术提出了k-same方法,对发布的灰度人脸图像进行匿名化操作,使攻击者通过发布的人脸图像重新甄别出用户身份的概率不超过1/k。然而,匿名化技术通常依赖于很强的背景知识假设,而相应的假设在现实中并不完全成立,进而使匿名化技术失效。例如Ilia等人(2015)指出,将匿名化的人脸图像放到Facebook社交平台,攻击者通过该平台额外的身份特征可以推理出匿名人脸图像中个人的社会安全号码,凭借社会安全号码,攻击者可以成功甄别出个人的疾病、住址等敏感信息。Sadeghi等人(2009)采用访问控制技术对社交平台上人脸图像的访问权限进行限制,达到了隐私保护的目的,但是该方法并不是从根本上对人脸图像进行隐私保护,而是在访问设置方面对图像进行保护。如果攻击者具有一定的背景知识和访问权限,则可以获取用户的原始人脸图像。Gross和Acquisti(2005)利用同态加密技术对灰度人脸图像进行加密处理,防止通信过程被甄别。然而数据加密技术同样是先对攻击做出相应的假设,再基于这些假设设计相应的加密算法,显然这类加密方法仅能针对某些特定的攻击。

当攻击者具有特定的背景知识时,匿名化与数据加密技术就会在一定程度上泄露用户的敏感人脸图像信息。目前,差分隐私已经成为一种新型隐私保护技术,该技术不关心攻击者具有多少背景知识,通常利用随机噪音实现隐私保护(Dwork等,2006;Dwork和Lei,2009),研究对象已遍布关系数据、空间数据、流数据等多种数据类型,但对人脸图像数据鲜有涉及,主要挑战来自人脸图像的高维度以及如何保持发布精度。本文基于差分隐私提出高效的灰度人脸图像发布方法。主要贡献如下:

1) 为了减少人脸图像中噪音的影响,分别基于低秩分解与奇异值分解技术提出3种图像压缩方法,即LRA(low rank-based private facial image release algorithm)、SRA(singular value decomposition(SVD)-based private facial image release algorithm)和ESRA(enhanced SVD-based private facial image release algorithm),针对压缩的图像添加相应的拉普拉斯噪音。

2) 为了减少SRA和ESRA方法带来的重构误差,提出一种基于指数机制的分解矩阵尺寸选择方法,利用指数机制在不同的分解矩阵中挑选合适的矩阵尺寸进行图像压缩,并且使挑选过程满足差分隐私。

3) 理论证明,本文人脸图像发布方法能够满足$\varepsilon $-差分隐私,实验表明在6种真实的人脸库上具有有效性与可用性。

1 相关工作

随着网络的普及,用户数据的隐私保护受到研究者的广泛关注。目前,差分隐私保护模型依靠系统的统计学基础,打破了传统保护模型的思路,成为研究热点。直方图发布(Xu等,2013)、空间数据发布(Qardaji等,2013)、图数据发布(Zhang等,2015)、流数据发布(Chen等,2017)、时序数据发布(Rastogi和Nath,2010)和高维数据发布(Su等,2016)等已开展相应的研究。灰度人脸图像是一种较为复杂的个人数据,通常蕴含敏感信息,具有复杂性与易共享性。目前还没有研究者采用差分隐私技术对其进行保护,若共享平台直接发布,可能对个人隐私造成威胁。利用差分隐私保护技术发布人脸图像是亟待解决的问题。

傅里叶变换与小波变换是人脸图像处理常用的压缩技术,已有研究者采用这两种技术处理1维隐私数据。Rastogi和Nath(2010)借助傅里叶变换与差分隐私技术发布1维时序数据,主要思想是对1维时序数据进行傅里叶变换,然后对傅里叶系数添加拉普拉斯噪音;Xiao等人(2011)利用小波变换与差分隐私技术发布1维序列数据,主要思想与傅里叶变换类似,对小波系数进行噪音处理。然而这两种方法无法直接应用于人脸图像,原因是无法直接采用Rastogi和Nath(2010)以及Xiao等人(2011)的方法度量人脸图像发布的全局敏感性,进而无法计算需要的噪音数量。

实数域矩阵是人脸图像常用的表示形式,已有工作利用矩阵技术处理批量查询(Li等,2015;Yuan等,2015,2016)与推荐系统(Friedman等,2016)。Li等人(2015)利用矩阵机制响应批处理查询,结合批处理矩阵搜索最优的策略矩阵,根据策略矩阵的敏感性添加拉普拉斯噪音,最后生成满足差分隐私的响应结果。然而Yuan等人(2015)指出矩阵机制具有较高的计算复杂性,响应精度通常低于直接在原始数据添加噪音的响应结果。主要原因是要求策略矩阵满足摩尔—彭若斯广义逆,而这一条件使得搜索到的策略矩阵未必是最优解。Yuan等人(2015)针对Li等人(2015)的不足,提出利用矩阵低秩分解机制响应最终的批处理查询,该机制基于1阶增广拉格朗日算子求最终的策略矩阵,进而避免策略矩阵必须满足摩尔—彭若斯广义逆约束。随后Yuan等人(2016)指出由于问题求解空间的非线性与非凸性,可能导致低秩策略求解过程的非收敛性,进而导致找不到全局最优的策略矩阵。此外,Friedman等人(2016)在评分矩阵中利用矩阵分解技术,根据损失函数与梯度下降法求解评分矩阵的分解结果,最后对分解结果添加噪音。然而评分矩阵的高度稀疏性与人脸图像矩阵有着本质区别。本文基于矩阵低秩与奇异值分解提出3种满足差分隐私的人脸图像发布方法,基本思想是利用矩阵分解将人脸图像中的特征信息提取出来,对其表示的特征信息矩阵添加相应的拉普拉斯噪音,发布的人脸图像具有较高的可用性。

2 定义与问题

2.1 基于近邻矩阵的差分隐私定义

给定一幅人脸图像,将每一个像素点作为一个单元,则人脸图像可以表示成一个2维数据矩阵${\mathit{\boldsymbol{A}}_{m \times n}}$,其中$m$表示矩阵$\mathit{\boldsymbol{A}}$的行数,$n$表示$\mathit{\boldsymbol{A}}$的列数。在给出差分隐私形式化定义之前,结合$\mathit{\boldsymbol{A}}$给出近邻关系的定义。

定义1  近邻矩阵。给定一幅尺寸为$m \times n$的人脸图像

$ {\mathit{\boldsymbol{A}}_{m \times n}} = \left[ {\begin{array}{*{20}{c}} {{a_{11}}}&{{a_{12}}}& \cdots &{{a_{1n}}}\\ {{a_{21}}}&{{a_{22}}}& \cdots &{{a_{2n}}}\\ \vdots & \vdots & \ddots & \vdots \\ {{a_{m1}}}&{{a_{m2}}}& \cdots &{{a_{mn}}} \end{array}} \right] $

$ {\mathit{\boldsymbol{A}}_{m \times n}} \in {\mathit{\boldsymbol{C}}^{m \times n}} $

式中,${\mathit{\boldsymbol{C}}^{m \times n}}$表示$m \times n$的实内积空间,${a_{ij}}$表示人脸图像中的某个像素点。用向量将$\mathit{\boldsymbol{A}}$表示成$\mathit{\boldsymbol{A}} = \left({{\mathit{\boldsymbol{a}}_1}, {\mathit{\boldsymbol{a}}_2}, \cdots, {\mathit{\boldsymbol{a}}_n}} \right)$,其中${\mathit{\boldsymbol{a}}_i} = {\left({{a_{1i}}, {a_{2i}}, \cdots, {a_{mi}}} \right)^{\rm{T}}}$。如果$\mathit{\boldsymbol{A}}$与$\mathit{\boldsymbol{A}}'$相差一个向量,则$\mathit{\boldsymbol{A}}$与$\mathit{\boldsymbol{A}}'$互为近邻矩阵(Dwork等,2006)。

根据近邻矩阵$\mathit{\boldsymbol{A}}$与$\mathit{\boldsymbol{A}}'$,重新定义$\varepsilon $-差分隐私模型。

定义2  $\varepsilon $-差分隐私。给定近邻矩阵$\mathit{\boldsymbol{A}}$与$\mathit{\boldsymbol{A}}'$,设$M$为一个随机人脸图像发布方法,若$M$在近邻矩阵$\mathit{\boldsymbol{A}}$与$\mathit{\boldsymbol{A}}'$上的任意一个输出结果$I$满足

$ \mathit{Pr}\left( {M\left( \mathit{\boldsymbol{A}} \right) = I} \right) \le \exp \left( \varepsilon \right) \times \mathit{Pr}\left( {M\left( {\mathit{\boldsymbol{A'}}} \right) = I} \right) $ (1)

$M$满足$\varepsilon $-差分隐私。式(1)中,$\varepsilon $表示发布人脸图像所需要的隐私代价, $Pr$为概率。$\varepsilon $值的大小直接制约方法$M$的隐私保护程度。

实现差分隐私保护最基本的要求是查询全局敏感性与噪音机制(Dwork等,2006)。查询敏感性度量某一查询在近邻矩阵上的最大改变量。利用查询敏感性与隐私代价的比率作为噪音尺度参数,借助典型分布(拉普拉斯分布、指数分布等)产生差分隐私所需的噪音,利用噪音即可对数值型与非数值型数据进行$\varepsilon $-差分隐私保护。

2.2 矩阵分解

矩阵分解是一种常用的数字图像压缩方法。本文利用矩阵低秩分解与矩阵奇异值分解来压缩人脸图像。

定义3  矩阵低秩分解。给定一幅人脸图像,其表示矩阵为${\mathit{\boldsymbol{A}}_{m \times n}}$。设$k\left({k \le \min \left\{ {m, n} \right\}} \right)$为其秩,若存在秩为$k$的两个矩阵${\mathit{\boldsymbol{B}}_{m \times k}}$${\mathit{\boldsymbol{L}}_{k \times n}}$,使得

$ {\mathit{\boldsymbol{A}}_{m \times n}} = {\mathit{\boldsymbol{B}}_{m \times k}}{\mathit{\boldsymbol{L}}_{k \times n}} $ (2)

则称其为${\mathit{\boldsymbol{A}}_{m \times n}}$的低秩分解。式(2)中,${\mathit{\boldsymbol{B}}_{m \times k}}$表示$m \times k$的列满秩矩阵,${\mathit{\boldsymbol{L}}_{k \times n}}$表示$k \times n$的行满秩矩阵。由式(2)可知,对${\mathit{\boldsymbol{A}}_{m \times n}}$分解之后的主要特征信息保留在矩阵${\mathit{\boldsymbol{L}}_{k \times n}}$中。因此利用噪音机制对矩阵${\mathit{\boldsymbol{L}}_{k \times n}}$进行扰动则可以保护${\mathit{\boldsymbol{A}}_{m \times n}}$中的敏感信息。

定义4  矩阵奇异值分解。设$\mathit{\boldsymbol{A}}$为一幅人脸图像的表示矩阵,该矩阵对应的奇异值分解为

$ \mathit{\boldsymbol{A}} = \mathit{\boldsymbol{Q}}_A^{m \times m}\mathit{\boldsymbol{D}}_A^{m \times n}{\left( {\mathit{\boldsymbol{P}}_A^{n \times n}} \right)^{\rm{T}}} $ (3)

式中,${\mathit{\boldsymbol{Q}}_A}$$m \times m$的正交矩阵,${\mathit{\boldsymbol{D}}_A}$$m \times n$的对角阵,$\mathit{\boldsymbol{P}}_A^{\rm{T}}$$n \times n$的正交矩阵。当$m > n$时,${\mathit{\boldsymbol{D}}_A}$是由一个$n \times n$的对角子矩阵与${\mathit{\boldsymbol{\rm{0}}}^{\left({m - n} \right) \times n}}$组合而成的$m \times n$的对角阵。
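为直观说明定义4,下面给出一个基于NumPy的示意性验证代码(仅为草图,矩阵尺寸与取值均为假设的示例):

```python
import numpy as np

# 示意性代码:用NumPy验证式(3)的奇异值分解(m、n取值仅为示例)
m, n = 120, 100
A = np.random.rand(m, n)

# full_matrices=True时,Q为m×m正交阵,PT为n×n正交阵,与式(3)一致
Q, s, PT = np.linalg.svd(A, full_matrices=True)

# 由奇异值向量s构造m×n对角阵D_A:m>n时,下方为(m-n)×n的零子块
D = np.zeros((m, n))
D[:n, :n] = np.diag(s)

# 重构结果应与原矩阵一致(误差在机器精度内)
assert np.allclose(A, Q @ D @ PT)
```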

2.3 问题描述

给定一幅人脸图像及其相应的表示矩阵$\mathit{\boldsymbol{A}}$,利用矩阵分解技术将$\mathit{\boldsymbol{A}}$拆解为多个矩阵,而多个矩阵的意义并不相同,在添加噪音时,只需对包含主要信息的矩阵采用拉普拉斯机制扰动,然后利用矩阵乘积运算重新恢复$\mathit{\boldsymbol{A}}$,进而得到$\mathit{\boldsymbol{\widetilde A}}$。在获得$\mathit{\boldsymbol{\widetilde A}}$的过程中存在两种误差,一种是拉普拉斯机制导致的噪音误差$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$,另一种是矩阵重新构建过程中的重构误差$RE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$。因此,发布人脸图像$\mathit{\boldsymbol{\widetilde A}}$的总体误差可以表示为

$ Er\left( {\mathit{\boldsymbol{\tilde A}}} \right) = RE\left( {\mathit{\boldsymbol{\tilde A}}} \right) + LE\left( {\mathit{\boldsymbol{\tilde A}}} \right) $ (4)

在设计人脸图像发布方法时,应尽可能使总体误差$Er\left({\mathit{\boldsymbol{\widetilde A}}} \right)$最小。

3 基于矩阵分解的人脸图像处理方法

3.1 LAP发布方法

LAP(Laplace-based facial image protection)方法是直接利用拉普拉斯噪音(Dwork等,2006)对$\mathit{\boldsymbol{A}}$进行噪音扰动,发布扰动后的数据。LAP是最为直接的基于拉普拉斯机制的方法。

添加噪音的形式化公式可以表示为

$ \mathit{\boldsymbol{\tilde A}} = \mathit{\boldsymbol{A}} + Lap\left( {\frac{{{\Delta _1}\mathit{\boldsymbol{A}}}}{\varepsilon }} \right) $ (5)

式中,$Lap\left({} \right)$表示对矩阵加噪,${\Delta _1}\mathit{\boldsymbol{A = }}\mathop {\max }\limits_j \sum\limits_{i = 1}^m {\left| {{a_{ij}}} \right|} $,即为矩阵$\mathit{\boldsymbol{A}}$最大的列范数。由式(5)可知,LAP方法是对$\mathit{\boldsymbol{A}}$中的每个数值添加拉普拉斯噪音,即${\mathit{\boldsymbol{\widetilde a}}_i} = {\mathit{\boldsymbol{a}}_i} + lap\left({{\Delta _1}\mathit{\boldsymbol{A/}}\varepsilon } \right)$,其中$lap\left({} \right)$表示对列向量${\mathit{\boldsymbol{a}}_i}$加噪。
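按式(5),LAP方法可用如下NumPy草图实现(示意性代码,图像尺寸与$\varepsilon $取值仅为示例):

```python
import numpy as np

# 示意性代码:按式(5)实现LAP方法
def lap_release(A: np.ndarray, eps: float) -> np.ndarray:
    # Δ1A:矩阵A最大的列L1范数(近邻矩阵相差一个列向量)
    delta = np.abs(A).sum(axis=0).max()
    # 对A中每个数值独立注入尺度为Δ1A/ε的拉普拉斯噪音
    return A + np.random.laplace(scale=delta / eps, size=A.shape)

A = np.random.randint(0, 256, size=(200, 180)).astype(float)
A_tilde = lap_release(A, eps=0.5)
```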

LAP发布方法满足$\varepsilon $-差分隐私(Zhang等,2015)。然而,$\mathit{\boldsymbol{A}}$为实数域矩阵,通常会造成比较大的${\Delta _1}\mathit{\boldsymbol{A}}$,进而导致发布结果的可用性很低。若采用误差平方和的期望值来度量$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$,则LAP方法产生的误差为

$ \begin{array}{*{20}{c}} {LE\left( {\mathit{\boldsymbol{\tilde A}}} \right) = E\left( {\sum\limits_{i = 1}^n {error\left( {{{\tilde a}_i}} \right)} } \right) = E\left( {\sum\limits_{i = 1}^n {{{\left( {{{\tilde a}_i} - {a_i}} \right)}^2}} } \right) = }\\ {E\left( {\sum\limits_{i = 1}^n {{{\left( {{a_i} - {a_i} + \sum\limits_{j = 1}^m {lap} \left( {\frac{{{\Delta _1}\mathit{\boldsymbol{A}}}}{\varepsilon }} \right)} \right)}^2}} } \right) = }\\ {\sum\limits_{i = 1}^n {\frac{{2m{{\left( {{\Delta _1}\mathit{\boldsymbol{A}}} \right)}^2}}}{{{\varepsilon ^2}}}} = 2mn{{\left( {\frac{{{\Delta _1}\mathit{\boldsymbol{A}}}}{\varepsilon }} \right)}^2} = }\\ {2mn{{\left( {\frac{{\mathop {\max }\limits_j \sum\limits_{i = 1}^m {\left| {{a_{ij}}} \right|} }}{\varepsilon }} \right)}^2}} \end{array} $ (6)

式中,$E$为期望。由式(6)可知,当图像矩阵的尺寸与${\Delta _1}\mathit{\boldsymbol{A}}$的数值较大时,产生的总误差就较大。因此,直接利用LAP发布尺寸较大的人脸图像,会导致发布结果的可用性较低。为此,本文基于矩阵分解提出了LRA、SRA以及ESRA等3种发布方法。

3.2 基于矩阵分解的发布方法

3.2.1 基于矩阵低秩分解的发布方法

首先利用矩阵低秩分解对原始人脸图像矩阵进行压缩,具体分为两步:1)采用矩阵分解的方法,将${\mathit{\boldsymbol{A}}_{m \times n}}$拆解为两个低秩矩阵${\mathit{\boldsymbol{B}}_{m \times k}}$与${\mathit{\boldsymbol{L}}_{k \times n}}$;2)对${\mathit{\boldsymbol{L}}_{k \times n}}$添加拉普拉斯噪音,即${\mathit{\boldsymbol{L}}_{k \times n}} + lap{\left({{\Delta _1}\mathit{\boldsymbol{L}}/\varepsilon } \right)^k}$,进而获得满足差分隐私的人脸图像矩阵${\mathit{\boldsymbol{\widetilde A}}_{m \times n}} = {\mathit{\boldsymbol{B}}_{m \times k}}\left({{\mathit{\boldsymbol{L}}_{k \times n}} + lap{{\left({{\Delta _1}\mathit{\boldsymbol{L}}/\varepsilon } \right)}^k}} \right)$。结合式(6)度量基于低秩矩阵分解产生的拉普拉斯误差$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$,即

$ \begin{array}{*{20}{c}} {LE\left( {\mathit{\boldsymbol{\tilde A}}} \right) = \sum\limits_{i = 1}^n {error} \left( {{{\mathit{\boldsymbol{\tilde a}}}_i}} \right) = E\left( {\sum\limits_{i = 1}^n {{{\left( {{{\mathit{\boldsymbol{\tilde a}}}_i} - {\mathit{\boldsymbol{a}}_i}} \right)}^2}} } \right) = }\\ {E\left( {\sum\limits_{i = 1}^m {\sum\limits_{j = 1}^n {{{\left( {{\mathit{\boldsymbol{b}}_i}\left( {{\mathit{\boldsymbol{l}}_j} + lap\left( {\frac{{{\Delta _1}\mathit{\boldsymbol{L}}}}{\varepsilon }} \right)} \right) - {\mathit{\boldsymbol{b}}_i}{\mathit{\boldsymbol{l}}_j}} \right)}^2}} } } \right) = }\\ {E\left( {\sum\limits_{i = 1}^m {\sum\limits_{j = 1}^n {{{\left( {{\mathit{\boldsymbol{b}}_i} \cdot lap\left( {\frac{{{\Delta _1}\mathit{\boldsymbol{L}}}}{\varepsilon }} \right)} \right)}^2}} } } \right) = }\\ {\left( {\sum\limits_{ij} {\mathit{\boldsymbol{B}}_{ij}^2} } \right)\frac{{2{{\left( {{\Delta _1}\mathit{\boldsymbol{L}}} \right)}^2}}}{{{\varepsilon ^2}}}} \end{array} $ (7)

式中,${\Delta _1}\mathit{\boldsymbol{L}} = \mathop {\max }\limits_j \sum\limits_{i = 1}^k {\left| {{l_{ij}}} \right|} $。重写式(7)可得

$ LE\left( {\mathit{\boldsymbol{\tilde A}}} \right) = 2\left( {\sum\limits_{ij} {\mathit{\boldsymbol{B}}_{ij}^2} \times {{\left( {{\Delta _1}\mathit{\boldsymbol{L}}} \right)}^2}} \right)\frac{1}{{{\varepsilon ^2}}} $

由于${\mathit{\boldsymbol{A}}_{m \times n}}\mathit{\boldsymbol{ = }}{\mathit{\boldsymbol{B}}_{m \times k}}{\mathit{\boldsymbol{L}}_{k \times n}}$分解的不唯一性,导致不同的$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$。若要获得可用的人脸图像,只要获得较小的$\sum\limits_{ij} {\mathit{\boldsymbol{B}}_{ij}^2} \times {\left({{\Delta _1}\mathit{\boldsymbol{L}}} \right)^2}$即可,而较小的$\sum\limits_{ij} {\mathit{\boldsymbol{B}}_{ij}^2} \times {\left({{\Delta _1}\mathit{\boldsymbol{L}}} \right)^2}$与$\mathit{\boldsymbol{A}}$的最优分解${\mathit{\boldsymbol{B}}_{m \times k}}$与${\mathit{\boldsymbol{L}}_{k \times n}}$密切相关。本文结合随机梯度下降方法,将该问题转化为无约束最优化问题,利用正则化矩阵分解迭代地学习出${\mathit{\boldsymbol{B}}_{m \times k}}$与${\mathit{\boldsymbol{L}}_{k \times n}}$,求解该问题所需的目标函数为

$ \mathop {\min }\limits_{B,L} \sum\limits_{{a_{ij}} \in \mathit{\boldsymbol{A}}} {\left[ {{{\left( {{a_{ij}} - {\mathit{\boldsymbol{b}}_i}{{\left( {{{\mathit{\boldsymbol{\widetilde l}}}_j}} \right)}^{\rm{T}}}} \right)}^2} + \tau \left( {{{\left\| {{\mathit{\boldsymbol{b}}_i}} \right\|}^2} + {{\left\| {{{\mathit{\boldsymbol{\widetilde l}}}_j}} \right\|}^2}} \right)} \right]} $ (8)

式中,${\left({{{\mathit{\boldsymbol{\widetilde l}}}_j}} \right)^{\rm{T}}} = {\left({{\mathit{\boldsymbol{l}}_j}} \right)^{\rm{T}}} + lap{\left({{\Delta _1}\mathit{\boldsymbol{L/}}\varepsilon } \right)^k}$,$\tau $表示正则因子。结合式(8),在正则化损失函数梯度相反的方向上同时更新${\mathit{\boldsymbol{B}}_{m \times k}}$与${\mathit{\boldsymbol{L}}_{k \times n}}$,得

$ {\mathit{\boldsymbol{b}}_i} \leftarrow {\mathit{\boldsymbol{b}}_i} + \gamma \left( {{e_{ij}} \cdot \left( {{\mathit{\boldsymbol{l}}_j} + lap\left( {\frac{{{\Delta _1}\mathit{\boldsymbol{L}}}}{\varepsilon }} \right)} \right) - \tau \cdot {\mathit{\boldsymbol{b}}_i}} \right) $

$ {\mathit{\boldsymbol{l}}_j} \leftarrow {\mathit{\boldsymbol{l}}_j} + \gamma \left( {{e_{ij}} \cdot {\mathit{\boldsymbol{b}}_i} - \tau \cdot {\mathit{\boldsymbol{l}}_j}} \right) $

式中,${e_{ij}}$表示每次迭代产生的误差函数,${e_{ij}} = {a_{ij}} - {\mathit{\boldsymbol{b}}_i} \cdot {\left({{{\mathit{\boldsymbol{\widetilde l}}}_j}} \right)^{\rm{T}}}$, 常数$\gamma $用来确定最小化误差的速率,通常称为学习速率。

基于上述分析,本文提出LRA算法,具体步骤如下:

输入:人脸图像$\mathit{\boldsymbol{A}}$,隐私参数$\varepsilon $$k = rank\left(\mathit{\boldsymbol{A}} \right)$,随机梯度下降迭代次数$r$,学习速率参数$\gamma $,正则化因子参数$\tau $

输出:满足$\varepsilon $-差分隐私的人脸图像$\mathit{\boldsymbol{\widetilde A}}$

1) ${\mathit{\boldsymbol{A}}_{m \times n}}\mathit{\boldsymbol{ = }}{\mathit{\boldsymbol{B}}_{m \times k}}{\mathit{\boldsymbol{L}}_{k \times n}}$

2) for $t = 1:r$  //共进行$r$次随机梯度下降迭代

3) ${\rm{for}}\;{a_{ij}} \in \mathit{\boldsymbol{A}}$

4)     ${e_{ij}} = {a_{ij}} - {\mathit{\boldsymbol{b}}_i}\left({{\mathit{\boldsymbol{l}}_j} + lap{{\left({{\Delta _1}\mathit{\boldsymbol{L}}/\varepsilon } \right)}^k}} \right)$

5)     ${\mathit{\boldsymbol{b}}_i} \leftarrow {\mathit{\boldsymbol{b}}_i} + \gamma \left({{e_{ij}} \cdot \left({{\mathit{\boldsymbol{l}}_j} + lap{{\left({{\Delta _1}\mathit{\boldsymbol{L}}/\varepsilon } \right)}^k}} \right) - \tau \cdot {\mathit{\boldsymbol{b}}_i}} \right)$

6)     ${\mathit{\boldsymbol{l}}_j} \leftarrow {\mathit{\boldsymbol{l}}_j} + \gamma \left({{e_{ij}} \cdot {\mathit{\boldsymbol{b}}_i} - \tau \cdot {\mathit{\boldsymbol{l}}_j}} \right)$

7)     ${\mathit{\boldsymbol{\widetilde l}}_j} \leftarrow {\mathit{\boldsymbol{l}}_j} + lap{\left({{\Delta _1}\mathit{\boldsymbol{L}}/\varepsilon } \right)^k}$

8) ${\rm{return}}\;\;\;\;{\mathit{\boldsymbol{\widetilde A}}_{m \times n}}\mathit{\boldsymbol{ = }}{\mathit{\boldsymbol{B}}_{m \times k}}{\mathit{\boldsymbol{\widetilde L}}_{k \times n}}$
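上述步骤的一个简化NumPy实现如下(仅为草图:双重循环为逐像素随机梯度下降的直观写法,未做效率优化;迭代次数、学习速率、正则因子的取值均为示例):

```python
import numpy as np

# 示意性代码:LRA算法步骤1)—8)的简化实现
def lra_release(A: np.ndarray, eps: float, k: int,
                iters: int = 10, gamma: float = 0.01, tau: float = 0.1):
    m, n = A.shape
    rng = np.random.default_rng(0)
    B = rng.normal(scale=0.1, size=(m, k))   # 列满秩因子B,对应步骤1)
    L = rng.normal(scale=0.1, size=(k, n))   # 行满秩因子L,对应步骤1)
    for _ in range(iters):                   # 对应步骤2)的r次迭代
        delta_L = np.abs(L).sum(axis=0).max()     # Δ1L:最大列L1范数
        for i in range(m):                   # 步骤3)—6):逐像素更新
            for j in range(n):
                noisy_lj = L[:, j] + rng.laplace(scale=delta_L / eps, size=k)
                e_ij = A[i, j] - B[i, :] @ noisy_lj                   # 步骤4)
                B[i, :] += gamma * (e_ij * noisy_lj - tau * B[i, :])  # 步骤5)
                L[:, j] += gamma * (e_ij * B[i, :] - tau * L[:, j])   # 步骤6)
    delta_L = np.abs(L).sum(axis=0).max()
    L_tilde = L + rng.laplace(scale=delta_L / eps, size=(k, n))       # 步骤7)
    return B @ L_tilde                                                # 步骤8)
```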

LRA算法采用随机梯度下降方法通过迭代得到$\mathit{\boldsymbol{\widetilde A}}$的最优分解${\mathit{\boldsymbol{B}}_{m \times k}}$与${\mathit{\boldsymbol{\widetilde L}}_{k \times n}}$,进而减小$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$。结合Yuan等人(2015)提出的奇异值边界理论与式(8),得到定理1。

定理1  假设$\left\{ {{\lambda _1}, {\lambda _2}, \cdots, {\lambda _k}} \right\}$为矩阵$\mathit{\boldsymbol{A}}$对应的奇异值。若LRA采用随机梯度下降方法获得最优分解${\mathit{\boldsymbol{B}}_{m \times k}}$与${\mathit{\boldsymbol{\widetilde L}}_{k \times n}}$,则$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$的上界为$2\sum\limits_{i = 1}^k {\lambda _i^2} /{\varepsilon ^2}$。

证明:由于${\mathit{\boldsymbol{A}}_{m \times n}}\mathit{\boldsymbol{ = }}{\mathit{\boldsymbol{B}}_{m \times k}}{\mathit{\boldsymbol{L}}_{k \times n}}$分解的不唯一性,假设$k = n$,此时${\mathit{\boldsymbol{A}}_{m \times n}}\mathit{\boldsymbol{ = }}{\mathit{\boldsymbol{B}}_{m \times n}}$${\mathit{\boldsymbol{L}}_{n \times n}} = \mathit{\boldsymbol{I}}$。因此,${\Delta _1}\mathit{\boldsymbol{L}} = 1$。根据矩阵$\mathit{\boldsymbol{A}}$的Frobenius范数定义可知,$\sum\limits_{ij} {\mathit{\boldsymbol{B}}_{ij}^2} \times {\left({{\Delta _1}\mathit{\boldsymbol{L}}} \right)^2} = \left\| \mathit{\boldsymbol{A}} \right\|_{\rm{F}}^2$。由于$\left\| \mathit{\boldsymbol{A}} \right\|_{\rm{F}}^2 = \sum\limits_{i = 1}^k {\lambda _i^2} $成立,由低秩分解的$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right) = 2\left({\sum\limits_{ij} {\mathit{\boldsymbol{B}}_{ij}^2} \times {{\left({{\Delta _1}\mathit{\boldsymbol{L}}} \right)}^2}} \right)\frac{1}{{{\varepsilon ^2}}}$可知,LRA的误差上界为$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right) = 2\sum\limits_{i = 1}^k {\lambda _i^2} /{\varepsilon ^2}$

将LRA算法的误差上界与LAP的误差进行比较,得到定理2。

定理2  $L{E_{{\rm{LRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right) \le L{E_{{\rm{LAP}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right)$

证明:设$L{E_{{\rm{LRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right)$与$L{E_{{\rm{LAP}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right)$分别为LRA和LAP算法产生的拉普拉斯误差。

由定理1可知,$L{E_{{\rm{LRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right)$的上界为$2\left\| \mathit{\boldsymbol{A}} \right\|_{\rm{F}}^2/{\varepsilon ^2}$,而$\left\| \mathit{\boldsymbol{A}} \right\|_{\rm{F}}^2 = \sum\limits_{i = 1}^m {\sum\limits_{j = 1}^n {a_{ij}^2} } $,进而$L{E_{{\rm{LRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right)$的上界可以重写为$2\sum\limits_{i = 1}^m {\sum\limits_{j = 1}^n {a_{ij}^2} } /{\varepsilon ^2}$。

由式(6)可知,$L{E_{{\rm{LAP}}}}\left({\mathit{\boldsymbol{\tilde A}}} \right) = 2mn{\left({\frac{{{\Delta _1}\mathit{\boldsymbol{A}}}}{\varepsilon }} \right)^2}$,而${\Delta _1}\mathit{\boldsymbol{A = }}\mathop {\max }\limits_j \sum\limits_{i = 1}^m {\left| {{a_{ij}}} \right|} $,则

$ L{E_{{\rm{LAP}}}}\left( {\mathit{\boldsymbol{\tilde A}}} \right) = 2mn{\left( {\mathop {\max }\limits_j \sum\limits_{i = 1}^m {\left| {{a_{ij}}} \right|} } \right)^2}/{\varepsilon ^2} $

根据

$ \sum\limits_{j = 1}^n {{{\left( {\mathop {\max }\limits_j \sum\limits_{i = 1}^m {\left| {{a_{ij}}} \right|} } \right)}^2}} \ge \sum\limits_{i = 1}^m {\sum\limits_{j = 1}^n {a_{ij}^2} } $

可知

$ 2mn{\left( {\mathop {\max }\limits_j \sum\limits_{i = 1}^m {\left| {{a_{ij}}} \right|} } \right)^2}/{\varepsilon ^2} \ge 2\sum\limits_{i = 1}^m {\sum\limits_{j = 1}^n {a_{ij}^2} } /{\varepsilon ^2} $

进而可以证明$L{E_{{\rm{LRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right) \le L{E_{{\rm{LAP}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right)$。

3.2.2 基于矩阵奇异值分解的发布方法

利用低秩分解可以获得较好的$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$。然而,LRA算法需要使用迭代操作求取最终的最优分解,迭代操作使得人脸图像的信息损失严重,进而导致较低的可用性。为了获得更低的$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$,本文采用奇异值分解来压缩人脸图像。$\mathit{\boldsymbol{A}}$的奇异值分解为$\mathit{\boldsymbol{A = Q}}_A^{m \times m}\mathit{\boldsymbol{D}}_A^{m \times n}{\left({\mathit{\boldsymbol{P}}_A^{n \times n}} \right)^{\rm{T}}}$。

根据奇异值分解的性质可知,矩阵$\mathit{\boldsymbol{D}}_A^{m \times n}$中的奇异值按照从大到小排序,并且衰减特别快,前面较少的奇异值之和便占据了全部奇异值之和的大部分比例。基于此,可以选择$r$个奇异值以及对应的左右奇异向量来近似描述矩阵$\mathit{\boldsymbol{A}}$,即

$ \mathit{\boldsymbol{A}} = \mathit{\boldsymbol{Q}}_A^{m \times m}\mathit{\boldsymbol{D}}_A^{m \times n}{\left( {\mathit{\boldsymbol{P}}_A^{n \times n}} \right)^{\rm{T}}} \approx \mathit{\boldsymbol{Q}}_A^{m \times r}\mathit{\boldsymbol{D}}_A^{r \times r}{\left( {\mathit{\boldsymbol{P}}_A^{r \times n}} \right)^{\rm{T}}} $

基于此思想,提出SRA方法,具体步骤如下:

输入:人脸图像$\mathit{\boldsymbol{A}}$,隐私参数$\varepsilon $,启发式参数$r$

输出:满足$\varepsilon $-差分隐私的人脸图像。

1) $\mathit{\boldsymbol{A = Q}}_A^{m \times m}\mathit{\boldsymbol{D}}_A^{m \times n}{\left({\mathit{\boldsymbol{P}}_A^{n \times n}} \right)^{\rm{T}}}$//对图像进行奇异值分解

2) $\mathit{\boldsymbol{Q}}_A^{m \times r}\mathit{\boldsymbol{D}}_A^{r \times r}{\left({\mathit{\boldsymbol{P}}_A^{r \times n}} \right)^{\rm{T}}} \leftarrow \mathit{\boldsymbol{A}}$//取${\mathit{\boldsymbol{D}}_A}$的前$r \times r$个数值

3) for $i$=1 :$r$

4)     for $j$=1 :$r$

5)       if $i$ = $j$ then

6)         $\mathit{\boldsymbol{\widetilde D}}_A^{r \times r}\left({i, j} \right) \leftarrow \mathit{\boldsymbol{D}}_A^{r \times r}\left({i, j} \right) + lap\left({{\Delta _1}\mathit{\boldsymbol{D}}_A^{r \times r}/\varepsilon } \right)$

7) return$\mathit{\boldsymbol{\widetilde A}}\mathit{\boldsymbol{ = Q}}_A^{m \times r}\mathit{\boldsymbol{\widetilde D}}_A^{r \times r}{\left({\mathit{\boldsymbol{P}}_A^{r \times n}} \right)^{\rm{T}}}$
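上述SRA步骤的一个NumPy草图如下(示意性代码,$r$为启发式压缩参数,取值仅作示例;对角阵的$\Delta_1$即最大奇异值):

```python
import numpy as np

# 示意性代码:SRA算法的直接实现
def sra_release(A: np.ndarray, eps: float, r: int) -> np.ndarray:
    Q, s, PT = np.linalg.svd(A, full_matrices=False)
    s_r = s[:r]                        # 步骤2):取前r个奇异值
    delta = s_r.max()                  # Δ1D:对角阵最大列范数即最大奇异值
    # 步骤3)—6):对每个对角数值注入拉普拉斯噪音
    s_noisy = s_r + np.random.laplace(scale=delta / eps, size=r)
    # 步骤7):用加噪对角阵与截断的左右奇异向量重构
    return (Q[:, :r] * s_noisy) @ PT[:r, :]

A = np.random.rand(200, 180) * 255
A_tilde = sra_release(A, eps=0.5, r=100)
```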

根据步骤2)可知,对奇异值分解得到的对角阵$\mathit{\boldsymbol{D}}_A^{m \times n}$,启发式地取其前$r \times r$个矩阵数值得到矩阵$\mathit{\boldsymbol{D}}_A^{r \times r}$,添加拉普拉斯噪音得到对角阵的噪音矩阵$\mathit{\boldsymbol{\widetilde D}}_A^{r \times r}$,再将其与$\mathit{\boldsymbol{Q}}_A^{m \times r}$和${\left({\mathit{\boldsymbol{P}}_A^{r \times n}} \right)^{\rm{T}}}$进行矩阵乘积运算,重构得到满足$\varepsilon $-差分隐私的人脸图像$\mathit{\boldsymbol{\widetilde A}}$。该过程忽视了对角阵${\mathit{\boldsymbol{D}}_A}$中的$\left({m - r} \right) \times \left({n - r} \right)$个数值,产生了重构误差$RE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$。本文采用平方和误差度量$RE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$,具体为

$ \begin{array}{*{20}{c}} {RE\left( {\mathit{\boldsymbol{\tilde A}}} \right) = \sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{Q}}_A}} \right|}^2}} + }\\ {\sum\limits_{i = r + 1}^m {\sum\limits_{j = r + 1}^n {{{\left| {{\mathit{\boldsymbol{D}}_A}} \right|}^2}} } + \sum\limits_{i = r + 1}^n {{{\left| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right|}^2}} } \end{array} $ (9)

式中

$ {\mathit{\boldsymbol{Q}}_A} = \left( {{\mathit{\boldsymbol{q}}_1}, \cdots ,{\mathit{\boldsymbol{q}}_m}} \right),{\mathit{\boldsymbol{q}}_i} = {\left( {{q_{1i}}, \cdots ,{q_{mi}}} \right)^{\rm{T}}} $

$ \mathit{\boldsymbol{P}}_A^{\rm{T}} = {\left( {{\mathit{\boldsymbol{p}}_1},{\mathit{\boldsymbol{p}}_2}, \cdots ,{\mathit{\boldsymbol{p}}_n}} \right)^{\rm{T}}},{\mathit{\boldsymbol{p}}_i} = \left( {{p_{i1}},{p_{i2}}, \cdots ,{p_{in}}} \right) $

根据式(9)可知,由于对角阵${\mathit{\boldsymbol{D}}_A}$忽视了$\left({m - r} \right) \times \left({n - r} \right)$个矩阵数值,进而产生重构误差,其大小为$\sum\limits_{i = r + 1}^m {\sum\limits_{j = r + 1}^n {{{\left| {{\mathit{\boldsymbol{D}}_A}\left({i, j} \right)} \right|}^2}} } $;${\mathit{\boldsymbol{Q}}_A}$与${\mathit{\boldsymbol{P}}_A}$分别忽视了$\left({m - r} \right)$列和$\left({n - r} \right)$行数值,二者产生的重构误差为

$ \sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{Q}}_A}} \right|}^2}} + \sum\limits_{i = r + 1}^n {{{\left| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right|}^2}} = \sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{q}}_i}} \right|}^2}} + \sum\limits_{i = r + 1}^n {{{\left| {{\mathit{\boldsymbol{p}}_i}} \right|}^2}} $

根据SRA算法的步骤可知,对角阵前$r \times r$个矩阵数值$\mathit{\boldsymbol{D}}_A^{r \times r}$添加拉普拉斯噪音(步骤6)),因此SRA算法产生的噪音误差上界为

$ \begin{array}{l} LE\left( {\mathit{\boldsymbol{\tilde A}}} \right) = E{\left( {\mathit{\boldsymbol{\tilde A}} - \mathit{\boldsymbol{A}}} \right)^2} = \\ E\left( {\mathit{\boldsymbol{Q}}_A^{m \times r}\left( {\mathit{\boldsymbol{D}}_A^{r \times r} + lap\left( {\frac{{\Delta \mathit{\boldsymbol{D}}_A^{r \times r}}}{\varepsilon }} \right)} \right) \times } \right.\\ {\left. {{{\left( {\mathit{\boldsymbol{P}}_A^{r \times n}} \right)}^{\rm{T}}} - \mathit{\boldsymbol{Q}}_A^{m \times r}\mathit{\boldsymbol{D}}_A^{r \times r}{{\left( {\mathit{\boldsymbol{P}}_A^{r \times n}} \right)}^{\rm{T}}}} \right)^2} = \\ E{\left( {\mathit{\boldsymbol{Q}}_A^{m \times r}lap\left( {\frac{{\Delta \mathit{\boldsymbol{D}}_A^{r \times r}}}{\varepsilon }} \right){{\left( {\mathit{\boldsymbol{P}}_A^{r \times n}} \right)}^{\rm{T}}}} \right)^2} = \\ 2\sum\limits_{ij} {\left( {\mathit{\boldsymbol{Q}}_A^{m \times r}} \right)_{ij}^2} {\left( {\Delta \mathit{\boldsymbol{D}}_A^{r \times r}} \right)^2}\sum\limits_{ij} {\left( {{{\left( {\mathit{\boldsymbol{P}}_A^{r \times n}} \right)}^{\rm{T}}}} \right)_{ij}^2} /{\varepsilon ^2} = \\ 2\left\| \mathit{\boldsymbol{A}} \right\|_F^2/{\varepsilon ^2} = 2\sum\limits_{i = 1}^r {\lambda _i^2/{\varepsilon ^2}} \end{array} $ (10)

式中,${\Delta _1}\mathit{\boldsymbol{D}}_A^{r \times r} = \mathop {\max }\limits_j \sum\limits_{i = 1}^r {\left| {\mathit{\boldsymbol{D}}_A^{r \times r}\left({i, j} \right)} \right|} $

根据定理2,可得定理3。

定理3  $L{E_{{\rm{SRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right) \le L{E_{{\rm{LRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right)$

证明:根据定理2可知,$L{E_{{\rm{LRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right)$的误差上界为$2\sum\limits_{i = 1}^k {\lambda _i^2} /{\varepsilon ^2}$,其中,$k\left({k \le \min \left\{ {m, n} \right\}} \right)$$\mathit{\boldsymbol{A}}$的秩。$L{E_{{\rm{SRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right)$的误差上界为$2\sum\limits_{i = 1}^r {\lambda _i^2} /{\varepsilon ^2}$。然而启发式选取的$r$远小于$k$,可知$2\sum\limits_{i = 1}^k {\lambda _i^2} /{\varepsilon ^2} > 2\sum\limits_{i = 1}^r {\lambda _i^2} /{\varepsilon ^2}$

进而可知,$L{E_{{\rm{SRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right) \le L{E_{{\rm{LRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right)$成立。

根据式(9)和式(10),可得定理4。

定理4

$ E{r_{{\rm{SRA}}}}\left( {\mathit{\boldsymbol{\tilde A}}} \right) \le \sqrt {\sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{Q}}_A}} \right|}^2}} } + \sqrt {\sum\limits_{i = r + 1}^m {\sum\limits_{j = r + 1}^n {{{\left| {{\mathit{\boldsymbol{D}}_A}} \right|}^2}} } } + \sqrt {\sum\limits_{i = r + 1}^n {{{\left| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right|}^2}} } + \sqrt 2 \sqrt {\sum\limits_{i = 1}^r {\lambda _i^2} } /\varepsilon $

证明:用均方根误差度量$RE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$,则

$ \begin{array}{*{20}{c}} {RE\left( {\mathit{\boldsymbol{\tilde A}}} \right) = }\\ {\sqrt {\sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{Q}}_A}} \right|}^2}} + \sum\limits_{i = r + 1}^m {\sum\limits_{j = r + 1}^n {{{\left| {{\mathit{\boldsymbol{D}}_A}} \right|}^2}} } + \sum\limits_{i = r + 1}^n {{{\left| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right|}^2}} } \le }\\ {\sqrt {\sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{Q}}_A}} \right|}^2}} } + \sqrt {\sum\limits_{i = r + 1}^m {\sum\limits_{j = r + 1}^n {{{\left| {{\mathit{\boldsymbol{D}}_A}} \right|}^2}} } } + \sqrt {\sum\limits_{i = r + 1}^n {{{\left| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right|}^2}} } } \end{array} $

用均方根误差度量$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$的误差上界,则

$ LE\left( {\mathit{\boldsymbol{\tilde A}}} \right) = \sqrt {2\sum\limits_{i = 1}^r {\lambda _i^2/{\varepsilon ^2}} } = \sqrt 2 \sqrt {\sum\limits_{i = 1}^r {\lambda _i^2} } /\varepsilon $

用均方根误差度量SRA算法的总体误差$E{r_{{\rm{SRA}}}}\left({\mathit{\boldsymbol{\widetilde A}}} \right)$,可得

$ \begin{array}{*{20}{c}} {E{r_{{\rm{SRA}}}}\left( {\mathit{\boldsymbol{\tilde A}}} \right) = E{{\left\| {\mathit{\boldsymbol{\tilde A}} - \mathit{\boldsymbol{A}}} \right\|}_2} = E\left( {\sqrt {\left\| {\mathit{\boldsymbol{\tilde A}} - \mathit{\boldsymbol{A}}} \right\|_2^2} } \right) \le }\\ {\sqrt {E\left( {\left\| {\mathit{\boldsymbol{\tilde A}} - \mathit{\boldsymbol{A}}} \right\|_2^2} \right)} = \sqrt {{{\left( {RE\left( {\mathit{\boldsymbol{\tilde A}}} \right)} \right)}^2} + {{\left( {LE\left( {\mathit{\boldsymbol{\tilde A}}} \right)} \right)}^2}} \le }\\ {RE\left( {\mathit{\boldsymbol{\tilde A}}} \right) + LE\left( {\mathit{\boldsymbol{\tilde A}}} \right) \le }\\ {\sqrt {\sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{Q}}_A}} \right|}^2}} } + \sqrt {\sum\limits_{i = r + 1}^m {\sum\limits_{j = r + 1}^n {{{\left| {{\mathit{\boldsymbol{D}}_A}} \right|}^2}} } } + \sqrt {\sum\limits_{i = r + 1}^n {{{\left| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right|}^2}} } + }\\ {\frac{{\sqrt 2 \sqrt {\sum\limits_{i = 1}^r {\lambda _i^2} } }}{\varepsilon }} \end{array} $

由定理4可知,当${\Delta _1}\mathit{\boldsymbol{D}}_A^{r \times r}$$r$值确定后,即可计算出${\mathit{\boldsymbol{\widetilde A}}}$的发布误差。其中,$r$值的选择至关重要。$r$值越大,$RE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$越小,而$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$越大;相反,$r$值越小,$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$越小,而$RE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$越大。这两种情况均使发布的人脸图像的可用性较差,因此需要选择合适的$r$值来均衡这两种误差。SRA方法启发式地选择$r$值,通常无法均衡$RE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$。此外,如果直接基于原始矩阵$\mathit{\boldsymbol{A}}$选择$r$值,会破坏$\varepsilon $-差分隐私,原因是$r$值的选择是数据相关的。若要基于$\mathit{\boldsymbol{A}}$选择$r$值,需要满足$\varepsilon $-差分隐私。为此,本文基于指数机制(McSherry和Talwar,2007)提出一种$r$值选择的ESRA算法,具体步骤如下:

输入:人脸图像$\mathit{\boldsymbol{A}}$,隐私参数$\varepsilon $$\mathit{\boldsymbol{A}}$的秩$k$

输出:满足$\varepsilon $-差分隐私的人脸图像${\mathit{\boldsymbol{\widetilde A}}}$

1) $\mathit{\boldsymbol{A}} = \mathit{\boldsymbol{Q}}_A^{m \times m}\mathit{\boldsymbol{D}}_A^{m \times n}{\left({\mathit{\boldsymbol{P}}_A^{n \times n}} \right)^{\rm{T}}}$, 对图像进行奇异值分解

2) $\forall 1 \le r \le k$,计算每个$r$值的打分函数$U\left({\mathit{\boldsymbol{A}}, r} \right)$,其中,$U\left({\mathit{\boldsymbol{A}}, r} \right)$满足

$ \begin{array}{*{20}{c}} {U\left( {\mathit{\boldsymbol{A}},r} \right) = \sqrt {\sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{Q}}_A}} \right|}^2}} } + \sqrt {\sum\limits_{i = r + 1}^m {\sum\limits_{j = r + 1}^n {{{\left| {{\mathit{\boldsymbol{D}}_A}} \right|}^2}} } } + }\\ {\sqrt {\sum\limits_{i = r + 1}^n {{{\left| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right|}^2}} } + \frac{{\sqrt 2 \sqrt {\sum\limits_{i = 1}^r {\lambda _i^2} } }}{\varepsilon }} \end{array} $

3) 按概率$\Pr \left(r \right) \propto \exp \left({ - \frac{{\varepsilon U\left({\mathit{\boldsymbol{A}}, r} \right)}}{{2{\Delta _1}U}}} \right)$选择一个合适的$r$值;

4) $\mathit{\boldsymbol{Q}}_A^{m \times r}\mathit{\boldsymbol{D}}_A^{r \times r}{\left({\mathit{\boldsymbol{P}}_A^{r \times n}} \right)^{\rm{T}}} \leftarrow \mathit{\boldsymbol{A}}$, 取${\mathit{\boldsymbol{D}}_A}$的前$r \times r$个数值;

5) for $i$=1 :$r$

6)     for $j$=1 :$r$

7)       if $i$ = $j$ then

8)         $\mathit{\boldsymbol{\widetilde D}}_A^{r \times r}\left({i, j} \right) \leftarrow \mathit{\boldsymbol{D}}_A^{r \times r}\left({i, j} \right) + lap\left({{\Delta _1}\mathit{\boldsymbol{D}}_A^{r \times r}/\varepsilon } \right)$

9) return  $\mathit{\boldsymbol{\widetilde A}}\mathit{\boldsymbol{ = Q}}_A^{m \times r}\mathit{\boldsymbol{\widetilde D}}_A^{r \times r}{\left({\mathit{\boldsymbol{P}}_A^{r \times n}} \right)^{\rm{T}}}$

ESRA算法首先在$\mathit{\boldsymbol{D}}_A^{m \times n}$矩阵中计算每个$r$值可能的打分函数值(步骤2)),然后采用指数机制按概率$\Pr \left(r \right)$挑选合适的$r$值(步骤3)),并针对挑选的$r$值,采用拉普拉斯机制对$\mathit{\boldsymbol{D}}_A^{r \times r}$矩阵中每个对角数值进行噪音扰动(步骤5)—9)),最后运用矩阵乘积代数运算,获得$\mathit{\boldsymbol{\widetilde A}}$

ESRA算法利用指数机制实现$r$值的挑选,然而指数机制的关键是如何设计高效的打分函数,打分函数的优劣直接决定着挑选的$r$值的合理性,而合理的$r$值能够均衡$RE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$与$LE\left({\mathit{\boldsymbol{\widetilde A}}} \right)$。本文利用定理4中的总体误差来设计打分函数$U\left({\mathit{\boldsymbol{A}}, r} \right)$,$U\left({\mathit{\boldsymbol{A}}, r} \right)$越小说明总体误差越小,对应$r$值被抽中的概率越大,具体为

$ \begin{array}{*{20}{c}} {U\left( {\mathit{\boldsymbol{A}},r} \right) = \sqrt {\sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{Q}}_A}} \right|}^2}} } + \sqrt {\sum\limits_{i = r + 1}^m {\sum\limits_{j = r + 1}^n {{{\left| {{\mathit{\boldsymbol{D}}_A}} \right|}^2}} } } + }\\ {\sqrt {\sum\limits_{i = r + 1}^n {{{\left| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right|}^2}} } + \frac{{\sqrt 2 \sqrt {\sum\limits_{i = 1}^r {\lambda _i^2} } }}{\varepsilon }} \end{array} $ (11)

由式(11)与ESRA算法的步骤3)可知,每个$r$值的抽样概率为

$ \mathit{Pr}\left( r \right) \propto \exp \left( { - \varepsilon \left( \begin{array}{l} \sqrt {\sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{Q}}_A}} \right|}^2}} } + \sqrt {\sum\limits_{i = r + 1}^m {\sum\limits_{j = r + 1}^n {{{\left| {{\mathit{\boldsymbol{D}}_A}} \right|}^2}} } } + \\ \sqrt {\sum\limits_{i = r + 1}^n {{{\left| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right|}^2}} } + \frac{{\sqrt 2 \sqrt {\sum\limits_{i = 1}^r {\lambda _i^2} } }}{\varepsilon } \end{array} \right)/2{\Delta _1}U} \right) $ (12)

式中,${\Delta _1}U$为打分函数$U\left({\mathit{\boldsymbol{A}}, r} \right)$的全局敏感度。为了度量${\Delta _1}U$,本文结合式(11)给出定理5。

定理5  ${\Delta _1}U \le {\Delta _1}{\mathit{\boldsymbol{Q}}_A} + 2{\Delta _1}{\mathit{\boldsymbol{D}}_A} + {\Delta _1}\mathit{\boldsymbol{P}}_A^{\rm{T}}$。其中,${\Delta _1}{\mathit{\boldsymbol{Q}}_A}$${\Delta _1}{\mathit{\boldsymbol{D}}_A}$${\Delta _1}\mathit{\boldsymbol{P}}_A^{\rm{T}}$分别为矩阵${\mathit{\boldsymbol{Q}}_A}$${\mathit{\boldsymbol{D}}_A}$$\mathit{\boldsymbol{P}}_A^{\rm{T}}$的最大列范数。

证明:由式(11)可知,${\Delta _1}U$由两部分组成:$\sqrt {\sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{Q}}_A}} \right|}^2}} } + \sqrt {\sum\limits_{i = r + 1}^m {\sum\limits_{j = r + 1}^n {{{\left| {{\mathit{\boldsymbol{D}}_A}} \right|}^2}} } } + \sqrt {\sum\limits_{i = r + 1}^n {{{\left| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right|}^2}} } $的敏感度${\Delta _1}{U_1}$与$\sqrt 2 \sqrt {\sum\limits_{i = 1}^r {\lambda _i^2} } /\varepsilon $的敏感度${\Delta _1}{U_2}$,即${\Delta _1}U = {\Delta _1}{U_1} + {\Delta _1}{U_2}$。

首先,满足

$ \begin{array}{*{20}{c}} {{\Delta _1}{U_1} = {\Delta _1}\left( {\sqrt {\sum\limits_{i = r + 1}^m {{{\left| {{\mathit{\boldsymbol{Q}}_A}} \right|}^2}} } + \sqrt {\sum\limits_{i = r + 1}^m {\sum\limits_{j = r + 1}^n {{{\left| {{\mathit{\boldsymbol{D}}_A}} \right|}^2}} } } + } \right.}\\ {\left. {\sqrt {\sum\limits_{i = r + 1}^n {{{\left| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right|}^2}} } } \right) \le }\\ {{\Delta _1}\left( {{{\left\| {{\mathit{\boldsymbol{Q}}_A}} \right\|}_2}} \right) + {\Delta _1}\left( {{{\left\| {{\mathit{\boldsymbol{D}}_A}} \right\|}_2}} \right) + {\Delta _1}\left( {{{\left\| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right\|}_2}} \right)} \end{array} $

放缩上式可得

$ \begin{array}{*{20}{c}} {{\Delta _1}\left( {{{\left\| {{\mathit{\boldsymbol{D}}_A}} \right\|}_2}} \right) = \max \left( {\left| {{{\left\| {{\mathit{\boldsymbol{D}}_A}} \right\|}_2} - {{\left\| {{{\mathit{\boldsymbol{D'}}}_A}} \right\|}_2}} \right|} \right) \le }\\ {\max \left( {{{\left\| {{\mathit{\boldsymbol{D}}_A} - {{\mathit{\boldsymbol{D'}}}_A}} \right\|}_2}} \right) \le }\\ {\max \left( {{{\left\| {{\mathit{\boldsymbol{D}}_A} - {{\mathit{\boldsymbol{D'}}}_A}} \right\|}_1}} \right) \le {\Delta _1}{\mathit{\boldsymbol{D}}_A}} \end{array} $

同理

$ {\Delta _1}\left( {{{\left\| {{\mathit{\boldsymbol{Q}}_A}} \right\|}_2}} \right) \le {\Delta _1}{\mathit{\boldsymbol{Q}}_A} $

$ {\Delta _1}\left( {{{\left\| {\mathit{\boldsymbol{P}}_A^{\rm{T}}} \right\|}_2}} \right) \le {\Delta _1}\mathit{\boldsymbol{P}}_A^{\rm{T}} $

进而推理可得

$ {\Delta _1}{U_1} \le {\Delta _1}{\mathit{\boldsymbol{Q}}_A} + {\Delta _1}{\mathit{\boldsymbol{D}}_A} + {\Delta _1}\mathit{\boldsymbol{P}}_A^{\rm{T}} $ (13)

由L1范数不等式推理可得

$ {\Delta _1}{U_2} = {\Delta _1}\mathit{\boldsymbol{D}}_A^{r \times r} \le {\Delta _1}{\mathit{\boldsymbol{D}}_A} $ (14)

由式(13)和式(14)推理可得

$ {\Delta _1}U \le {\Delta _1}{\mathit{\boldsymbol{Q}}_A} + 2{\Delta _1}{\mathit{\boldsymbol{D}}_A} + {\Delta _1}\mathit{\boldsymbol{P}}_A^{\rm{T}} $

成立。
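结合式(11)、式(12)与定理5,ESRA算法中挑选$r$值的步骤2)—3)可按如下草图实现(示意性代码,${\Delta _1}U$按定理5的上界另行计算后作为参数传入,函数与变量命名均为示意):

```python
import numpy as np

# 示意性代码:按式(11)计算打分函数U(A,r),再按式(12)的指数机制抽样r
def choose_r(A: np.ndarray, eps: float, delta_U: float) -> int:
    Q, s, PT = np.linalg.svd(A, full_matrices=False)
    k = len(s)
    scores = np.empty(k)
    for r in range(1, k + 1):
        re_Q = np.sqrt((Q[:, r:] ** 2).sum())            # 被忽视的Q_A列
        re_D = np.sqrt((s[r:] ** 2).sum())               # 被忽视的奇异值
        re_P = np.sqrt((PT[r:, :] ** 2).sum())           # 被忽视的P_A^T行
        le = np.sqrt(2) * np.sqrt((s[:r] ** 2).sum()) / eps  # 拉普拉斯误差项
        scores[r - 1] = re_Q + re_D + re_P + le          # U(A, r)
    logits = -eps * scores / (2 * delta_U)   # 误差越小,被抽中概率越大
    probs = np.exp(logits - logits.max())    # 减去最大值以避免数值下溢
    probs /= probs.sum()
    return int(np.random.choice(np.arange(1, k + 1), p=probs))
```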

3.2.3 基于矩阵$\mathit{\boldsymbol{\widetilde D}}_A^{r \times r}$的求精处理方法

根据SRA算法的步骤6)和ESRA算法的步骤8)可知,通过添加拉普拉斯噪音$lap\left({{\Delta _1}\mathit{\boldsymbol{D}}_A^{r \times r}/\varepsilon } \right)$扰动$\mathit{\boldsymbol{D}}_A^{r \times r}$中的奇异值,进而得到$\mathit{\boldsymbol{\widetilde D}}_A^{r \times r}$。设$\mathit{\boldsymbol{D}}_A^{r \times r}$中的奇异值为$\left\{ {{\mathit{\boldsymbol{\lambda }}_1}, {\mathit{\boldsymbol{\lambda }}_2}, \cdots, {\mathit{\boldsymbol{\lambda }}_r}} \right\}$,且从大到小排序。扰动后噪音矩阵$\mathit{\boldsymbol{\widetilde D}}_A^{r \times r}$中的奇异值为$\left\{ {{{\mathit{\boldsymbol{\widetilde \lambda }}}_1}, {{\mathit{\boldsymbol{\widetilde \lambda }}}_2}, \cdots, {{\mathit{\boldsymbol{\widetilde \lambda }}}_r}} \right\}$。然而,由于添加拉普拉斯噪音的原因,$\left\{ {{{\mathit{\boldsymbol{\widetilde \lambda }}}_1}, {{\mathit{\boldsymbol{\widetilde \lambda }}}_2}, \cdots, {{\mathit{\boldsymbol{\widetilde \lambda }}}_r}} \right\}$中有些值可能出现乱序,例如,$\left\{ {{{\mathit{\boldsymbol{\widetilde \lambda }}}_1}, {{\mathit{\boldsymbol{\widetilde \lambda }}}_2}, {{\mathit{\boldsymbol{\widetilde \lambda }}}_3}} \right\} = \left\{ {10, 14, 9} \right\}$。此时$\mathit{\boldsymbol{\widetilde D}}_A^{r \times r}$中的奇异值不是按照从大到小排序,为了满足奇异值的特性,需要对其进行后置求精处理,本文对于此乱序现象采取凸优化方法进行处理。

$\left\{ {{{\mathit{\boldsymbol{\overline \lambda }} }_1}, {{\mathit{\boldsymbol{\overline \lambda }} }_2}, \cdots, {{\mathit{\boldsymbol{\overline \lambda }} }_r}} \right\}$为求精处理后的奇异值序列,则相应的凸优化问题可转换化为

$ \begin{array}{*{20}{c}} {\min \sum\limits_{i = 1}^r {{{\left( {{{\tilde \lambda }_i} - {{\bar \lambda }_i}} \right)}^2}} }\\ {{\rm{s}}.{\rm{t}}.\;\;{{\bar \lambda }_i} \ge {{\bar \lambda }_{i + 1}},1 \le i \le r - 1} \end{array} $

根据定理6可求出上述目标函数的可行解。

定理6  设

$ {I_m} = \mathop {\min }\limits_{j \in \left[ {m,r} \right]} \mathop {\max }\limits_{i \in \left[ {1,j} \right]} \;{\rm{mean}}\left[ {i,j} \right] $

$ {N_m} = \mathop {\max }\limits_{i \in \left[ {1,m} \right]} \mathop {\min }\limits_{j \in \left[ {i,r} \right]} \;{\rm{mean}}\left[ {i,j} \right] $

则在条件${\mathit{\boldsymbol{\overline \lambda }} _i} \ge {\mathit{\boldsymbol{\overline \lambda }} _{i + 1}}$约束下的可行解为

$ \left\{ {{{\bar \lambda }_1},{{\bar \lambda }_2}, \cdots ,{{\bar \lambda }_r}} \right\} = \left\{ {{I_1}, \cdots ,{I_r}} \right\} = \left\{ {{N_1}, \cdots ,{N_r}} \right\} $

其中,${\rm{mean}}\left[ {i, j} \right] = \sum\limits_{h = i}^j {{{\mathit{\boldsymbol{\widetilde \lambda }}}_h}} /\left( {j - i + 1} \right)$(Hay等,2010)。

例如,$\left\{ {{{\mathit{\boldsymbol{\widetilde \lambda }}}_1}, {{\mathit{\boldsymbol{\widetilde \lambda }}}_2}, {{\mathit{\boldsymbol{\widetilde \lambda }}}_3}} \right\} = \left\{ {10, 14, 9} \right\}$,后置求精后为$\left\{ {{{\mathit{\boldsymbol{\overline \lambda }} }_1}, {{\mathit{\boldsymbol{\overline \lambda }} }_2}, {{\mathit{\boldsymbol{\overline \lambda }} }_3}} \right\} = \left\{ {12, 12, 9} \right\}$。
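该后置求精可按非增约束下保序回归的min-max公式直接实现,下面给出一个示意性草图(此写法为假设的等价形式,能复现上述示例;复杂度较高,仅用于说明,规模较大时可改用pool-adjacent-violators算法):

```python
import numpy as np

# 示意性代码:对加噪奇异值做非增约束下的保序回归(后置求精)
def refine(lam_noisy: np.ndarray) -> np.ndarray:
    r = len(lam_noisy)
    prefix = np.concatenate(([0.0], np.cumsum(lam_noisy)))
    mean = lambda i, j: (prefix[j + 1] - prefix[i]) / (j - i + 1)  # mean[i,j]
    out = np.empty(r)
    for m in range(r):   # out[m] = min_{i<=m} max_{j>=m} mean[i,j]
        out[m] = min(max(mean(i, j) for j in range(m, r)) for i in range(m + 1))
    return out

print(refine(np.array([10.0, 14.0, 9.0])))  # 输出 [12. 12. 9.]
```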

综合定理1—6及LAP、LRA、SRA和ESRA算法,可得定理7。

定理7  LAP、LRA、SRA和ESRA算法均满足$\varepsilon $-差分隐私。

证明:根据拉普拉斯机制(Dwork等,2006),LAP、LRA和SRA算法满足$\varepsilon $-差分隐私。下面根据指数机制(McSherry和Talwar,2007)和差分隐私序列组合性质(McSherry,2010)证明ESRA算法满足$\varepsilon $-差分隐私。

假设ESRA算法步骤3)需要的隐私预算为${{\varepsilon _1}}$,则根据指数机制(McSherry和Talwar,2007),选择每个$r$值的概率为

$ \mathit{Pr}\left( r \right) = \frac{{\exp \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A}},r} \right)}}{{2{\Delta _1}U}}} \right)}}{{\sum\limits_{r' \in \left[ {1,k} \right]} {\exp } \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A}},r'} \right)}}{{2{\Delta _1}U}}} \right)}} $ (15)

给定$\mathit{\boldsymbol{A}}$及其近邻$\mathit{\boldsymbol{A}}'$,对任意$r$$\left({r \in \left[ {1, k} \right]} \right)$,根据式(15),可得

$ \begin{array}{*{20}{c}} {\frac{{\mathit{Pr}\left( {ESRA\left( {\mathit{\boldsymbol{A}},r} \right)} \right)}}{{\mathit{Pr}\left( {ESRA\left( {\mathit{\boldsymbol{A'}},r} \right)} \right)}} = \frac{{\frac{{\exp \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A}},r} \right)}}{{2{\Delta _1}U}}} \right)}}{{\sum\limits_{r' \in \left[ {1,k} \right]} {\exp } \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A}},r'} \right)}}{{2{\Delta _1}U}}} \right)}}}}{{\frac{{\exp \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A'}},r} \right)}}{{2{\Delta _1}U}}} \right)}}{{\sum\limits_{r' \in \left[ {1,k} \right]} {\exp } \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A'}},r'} \right)}}{{2{\Delta _1}U}}} \right)}}}} = }\\ {\left( {\frac{{\exp \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A}},r} \right)}}{{2{\Delta _1}U}}} \right)}}{{\exp \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A'}},r} \right)}}{{2{\Delta _1}U}}} \right)}}} \right) \times \left( {\frac{{\sum\limits_{{r^\prime } \in [1,k]} {\exp } \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A'}},r'} \right)}}{{2{\Delta _1}U}}} \right)}}{{\sum\limits_{{r^\prime } \in [1,k]} {\exp } \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A}},r'} \right)}}{{2{\Delta _1}U}}} \right)}}} \right) \le }\\ {\exp \left( {\frac{{{\varepsilon _1}}}{2}} \right) \times \left( {\frac{{\sum\limits_{{r^\prime } \in [1,k]} {\exp } \left( {\frac{{{\varepsilon _1}}}{2}} \right) \times \exp \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A}},{r^\prime }} \right)}}{{2{\Delta _1}U}}} \right)}}{{\sum\limits_{{r^\prime } \in [1,k]} {\exp } \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A}},{r^\prime }} \right)}}{{2{\Delta _1}U}}} \right)}}} \right) \le }\\ {\exp \left( {\frac{{{\varepsilon _1}}}{2}} \right) \times \exp \left( {\frac{{{\varepsilon _1}}}{2}} \right) \times \left( {\frac{{\sum\limits_{{r^\prime } \in [1,k]} {\exp } \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A}},{r^\prime }} \right)}}{{2{\Delta _1}U}}} \right)}}{{\sum\limits_{{r^\prime } \in [1,k]} {\exp } \left( { - \frac{{{\varepsilon _1}U\left( {\mathit{\boldsymbol{A}},{r^\prime }} \right)}}{{2{\Delta _1}U}}} \right)}}} \right) = }\\ {\exp \left( {{\varepsilon _1}} \right)} \end{array} $

假设ESRA算法步骤5)—9)需要的隐私预算为${\varepsilon _2}$,由拉普拉斯机制(Dwork等,2006)可知,步骤5)—9)满足${\varepsilon _2}$-差分隐私。

因此,根据差分隐私序列组合性质(McSherry,2010)可知,ESRA算法满足$\varepsilon $-差分隐私,其中,$\varepsilon = {\varepsilon _1} + {\varepsilon _2}$

4 实验与结果分析

为了研究算法的可用性,首先在Lena的灰度人脸图像上进行实验(如图 1所示)。

图 1 Lena原始图像
Fig. 1 Original image of Lena

图 2显示了6种算法的发布结果。从图 2可以看出,ESRA算法的效果最好。为了验证算法的有效性与可用性,基于真实人脸库,采用支持向量机(support vector machine,SVM)+主成分分析(principal component analysis,PCA)方法对上述6种方法进行检验,同时为了度量单幅图像的发布精度,采用信息熵度量上述6种方法处理后的图像信息损失。

图 2 Lena加噪图像
Fig. 2 Noisy images of Lena ((a) LAP; (b) LRM; (c) MM; (d) LRA; (e) SRA; (f) ESRA)

4.1 人脸数据库

人脸数据库主要包括CMU、ORL、Yale、Yale B、Faces95和Faces94人脸库等6种数据集,具体细节如表 1所示。为了采用SVM + PCA对上述6种数据集进行有效分类,本文对数据集中每个人拥有的人脸图像数目进行均分,其中前半部分作为训练集,后半部分作为测试集。例如,CMU中每人有32幅图像,前16幅图像作为训练集,后16幅作为测试集。采用准确率、召回率以及F1-Score等3种可用性度量标准来衡量上述6种算法的可用性。

表 1 6种数据集信息描述
Table 1 Description of six dataset information

数据集 数据集大小 包含人数 每人拥有图像数 图像大小/像素 像素点数
Yale 150 15 10 231×195 45 045
ORL 400 40 10 112×92 10 304
CMU 640 20 32 120×128 15 360
Yale B 900 30 30 192×168 32 256
Faces95 1 440 72 20 200×180 36 000
Faces94 3 040 152 20 200×180 36 000

4.2 实验结果分析

SVM + PCA方法首先采用主成分分析法降维并去除数据之间的相关性,其次采用SVM训练器对数据集进行训练,最后采用SVM产生的分类函数对测试数据集进行分类,并计算分类的正确率(包括准确率、召回率、F1-Score)。$\varepsilon $分别取0.1、0.5、0.9、1.3,$r$= 100,$\gamma $ = 0.01,$\tau $ = 0.1。利用non-privacy算法作为参照方法,该方法的实验结果是在原始人脸图像上的识别结果。以此为基准,某算法的实验结果越接近non-privacy的结果,说明该算法的效果越好。
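该评估流程的一个示意性实现如下(草图;假设训练与测试图像已展平为(样本数,像素数)的矩阵,PCA维数等参数仅为示例取值):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.metrics import precision_score, recall_score, f1_score

# 示意性代码:SVM+PCA分类评估(训练/测试集按4.1节方式前后各半划分)
def evaluate(train_X, train_y, test_X, test_y):
    # 先用PCA降维并去除相关性,再训练SVM分类器
    clf = make_pipeline(PCA(n_components=50), SVC(kernel="linear"))
    clf.fit(train_X, train_y)
    pred = clf.predict(test_X)
    return (precision_score(test_y, pred, average="macro"),
            recall_score(test_y, pred, average="macro"),
            f1_score(test_y, pred, average="macro"))
```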

4.2.1 基于$\varepsilon $变化的可用性对比分析

首先分析隐私预算$\varepsilon $变化对上述6种方法可用性的影响。实验结果如图 3图 8所示。可以看出:1)LRM算法在6种数据集上的可用性最差,因为LRM算法采用优化问题寻找矩阵最佳分解时,过多的迭代操作造成矩阵中的真实值被严重扭曲。2)LAP算法虽然在原始人脸矩阵中直接添加拉普拉斯噪音,但可用性优于LRM算法。3)LRA算法尽管同样采用迭代操作寻找矩阵的最佳分解,然而在6种数据集上,其可用性优于LRM和LAP算法,因为LRA算法基于无约束最优化问题,以较少的迭代次数学习出矩阵的最优分解。例如,$\varepsilon $= 0.1时,LRA算法在CMU数据集的准确率是LAP算法和LRM算法的1倍多。4)MM算法的可用性优于LRA算法,因为MM算法是直接在原始图像上进行噪声处理,不需要经过迭代操作,避免了图像信息的严重损失。5)SRA算法比LRA算法在可用性上有进一步的提高,因为在对图像进行噪音处理之前,SRA算法采用奇异值分解对图像数据先进行压缩,降低了全局敏感性对噪音大小的影响。例如,$\varepsilon $= 0.1时,SRA算法在Yale数据集上的准确率、召回率和F1-Score分别是LRA算法的3倍多、1倍多和2倍多。6)ESRA算法在6种数据集上的可用性表现最好,因为ESRA算法基于指数机制有效抽取合适的奇异值对原始矩阵进行合理压缩,有效均衡了拉普拉斯误差与矩阵的重构误差,提高了人脸图像的发布精度。例如,$\varepsilon $= 0.1时,ESRA算法在数据集Faces95上的准确率分别是LAP、LRM和MM算法的45倍、22倍多和1倍多。与其他5种算法相比,ESRA算法对于数据集大小的变化表现相对稳定,正确性最好。

图 3 在CMU数据集上$\varepsilon $变化时可用性变化情况
Fig. 3 Usability changes on the CMU dataset when varying $\varepsilon $ ((a) precision; (b) recall; (c) F1-Score)
图 4 在ORL数据集上$\varepsilon $变化时可用性变化情况
Fig. 4 Usability changes on the ORL dataset when varying $\varepsilon $ ((a) precision; (b) recall; (c) F1-Score)
图 5 在Yale数据集上$\varepsilon $变化时可用性变化情况
Fig. 5 Usability changes on the Yale dataset when varying $\varepsilon $ ((a) precision; (b) recall; (c) F1-Score)
图 6 在Yale B数据集上$\varepsilon $变化时可用性变化情况
Fig. 6 Usability changes on the Yale B dataset when varying $\varepsilon $ ((a) precision; (b) recall; (c) F1-Score)
图 7 在Faces95数据集上$\varepsilon $变化时可用性变化情况
Fig. 7 Usability changes on the Faces95 dataset when varying $\varepsilon $ ((a) precision; (b) recall; (c) F1-Score)
图 8 在Faces94数据集上$\varepsilon $变化时可用性变化情况
Fig. 8 Usability changes on the Faces94 dataset when varying $\varepsilon $ ((a) precision; (b) recall; (c) F1-Score)

4.2.2 基于$\varepsilon $变化的单幅图像发布精度对比分析

为了有效度量单幅人脸图像蕴含的信息量,采用信息熵$H$作为度量标准,$H = - \sum\limits_{i = 0}^{255} {{p_i}\log {p_i}} $,其中${p_i}$表示图像中灰度值为$i$的像素所占的比率。Np-entropy表示原始图像的信息熵,算法的信息熵越接近Np-entropy,表示该算法发布图像的可用性越高。在6种人脸图像库中各随机抽取1幅人脸图像,分析$\varepsilon $变化对单幅图像信息熵的影响,结果如图 9所示。可以看出,经过LRM算法处理过的图像信息熵最小,原因是LRM算法过多的迭代次数导致严重的信息损失;SRA与ESRA算法比较接近Np-entropy值,原因是奇异值分解能够充分压缩原始图像,并且能够保留原始图像的主要信息;MM算法介于SRA算法与LRA算法之间,并在$\varepsilon $大于0.5时逐渐接近SRA算法的表现,因为MM算法利用矩阵机制处理图像时无需迭代处理。从6种数据集单幅图像的表现来看,SRA与ESRA表现比较好,优于其他4种算法。

图 9 6种数据集下$\varepsilon $变化时单幅图像可用性情况
Fig. 9 Single image usability changes on six datasets when varying $\varepsilon $
((a) CMU; (b) ORL; (c) Yale B; (d) Yale; (e) Faces95; (f) Faces94)
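文中信息熵$H$的计算可按如下草图实现(示意性代码,假设输入为8位灰度图像;对数底数的选取不影响各算法间的相对比较):

```python
import numpy as np

# 示意性代码:计算单幅灰度图像的信息熵 H = -Σ p_i·log(p_i)
def image_entropy(img: np.ndarray) -> float:
    hist = np.bincount(img.astype(np.uint8).ravel(), minlength=256)
    p = hist / hist.sum()
    p = p[p > 0]          # 约定0·log0 = 0,剔除零概率项
    return float(-(p * np.log2(p)).sum())
```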

5 结论

针对灰度人脸图像发布中的隐私保护问题,首先提出了一种基于矩阵分解的人脸图像发布算法LRA。LRA算法将矩阵分解问题当作一种无约束的最优化问题,利用正则化矩阵分解,以较少的迭代次数学习出原始矩阵的最优分解。随后分析了LRA算法的不足,提出了两种基于矩阵奇异值分解的人脸图像发布算法SRA与ESRA。SRA算法启发式地设定奇异值的个数,根据设定的奇异值个数压缩原始矩阵。ESRA算法针对启发式设定奇异值个数的不足,利用指数机制抽取合适的奇异值个数,对原始矩阵进行有效压缩。实验结果表明,LRA、SRA和ESRA算法在达到人脸图像保护效果的同时具有较好的可用性。

今后的研究方向应考虑如下两个方面:1)如何利用树结构(四分树、KD-树等)对人脸图像进行分割,根据分割过程与结果合理地使用噪音机制;2)由于人脸图像与其图像特征密切关联,如何在满足差分隐私的情况下获取人脸特征,并根据这些隐私人脸特征重构人脸图像,同时应具有较好的分类特性。

参考文献

  • Chen Y, Machanavajjhala A, Hay M and Miklau G. 2017. PeGaSus: data-adaptive differentially private stream processing//Proceedings of 2017 ACM SIGSAC Conference on Computer and Communications Security. Dallas, TX, USA: ACM: 1375-1388[DOI: 10.1145/3133956.3134102]
  • Dwork C. 2006. Differential privacy//Proceedings of the 33rd International Colloquium on Automata, Languages and Programming (ICALP). Venice, Italy: Springer: 1-12[DOI: 10.1007/11787006]
  • Dwork C and Lei J. 2009. Differential privacy and robust statistics//The 41st Annual ACM Symposium on Theory of Computing (STOC). Bethesda, MD, USA: ACM: 371-380[DOI: 10.1145/1536414.1536466]
  • Dwork C, McSherry F, Nissim K and Smith A. 2006. Calibrating noise to sensitivity in private data analysis//Proceedings of the 3rd Theory of Cryptography Conference (TCC). New York, USA: Springer: 363-385[DOI: 10.1007/11681878_14]
  • Friedman A, Berkovsky S, Kâafar M A. 2016. A differential privacy framework for matrix factorization recommender systems. User Modeling and User-Adapted Interaction, 26(5): 425-458 [DOI:10.1007/s11257-016-9177-7]
  • Gross R and Acquisti A. 2005. Information revelation and privacy in online social networks//Proceedings of 2005 ACM Workshop on Privacy in the Electronic Society. Alexandria, VA, USA: ACM: 71-80[DOI: 10.1145/1102199.1102214]
  • Gross R, Sweeney L, Cohn J, de la Torre F and Baker S. 2008. Model-based de-identification of facial images//2008 American Medical Informatics Association Annual Symposium. Washington: [s.n.]: 262-273
  • Hay M, Rastogi V, Miklau G, Suciu D. 2010. Boosting the accuracy of differentially private histograms through consistency. Proceedings of the VLDB Endowment, 3(1/2): 1021-1032 [DOI:10.14778/1920841.1920970]
  • Ilia P, Polakis I, Athanasopoulos E, Maggi F and Ioannidis S. 2015. Face/off: preventing privacy leakage from photos in social networks//Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. Denver, Colorado, USA: ACM: 781-792[DOI: 10.1145/2810103.2813603]
  • Li C, Miklau G, Hay M, McGregor A, Rastogi V. 2015. The matrix mechanism:optimizing linear counting queries under differential privacy. The VLDB Journal, 24(6): 757-781 [DOI:10.1007/s00778-015-0398-x]
  • McSherry F. 2010. Privacy integrated queries:an extensible platform for privacy-preserving data analysis. Communications of the ACM, 53(9): 89-97 [DOI:10.1145/1810891.1810916]
  • McSherry F and Talwar K. 2007. Mechanism design via differential privacy//Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science. Providence, RI, USA: IEEE: 94-103[DOI: 10.1109/FOCS.2007.66]
  • Newton E M, Sweeney L, Malin B. 2005. Preserving privacy by de-identifying face images. IEEE Transactions on Knowledge and Data Engineering, 17(2): 232-243 [DOI:10.1109/TKDE.2005.32]
  • Qardaji W, Yang W N and Li N H. 2013. Differentially private grids for geospatial data//Proceedings of the 29th IEEE International Conference on Data Engineering (ICDE). Brisbane, QLD, Australia: IEEE: 32-33[DOI: 10.1109/ICDE.2013.6544872]
  • Rastogi V and Nath S. 2010. Differentially private aggregation of distributed time-series with transformation and encryption//Proceedings of 2010 ACM SIGMOD International Conference on Management of data. Indianapolis, Indiana, USA: ACM: 735-746[DOI: 10.1145/1807167.1807247]
  • Sadeghi A R, Schneider T and Wehrenberg I. 2009. Efficient privacy-preserving face recognition//Proceedings of the 12th International Conference on Information Security and Cryptology. Seoul, Korea: Springer: 229-244[DOI: 10.1007/978-3-642-14423-3_16]
  • Su S, Tang P, Cheng X, Chen R and Wu Z Q. 2016. Differentially private multi-party high-dimensional data publishing//Proceedings of the 32nd IEEE International Conference on Data Engineering. Helsinki, Finland: IEEE: 205-216[DOI: 10.1109/ICDE.2016.7498241]
  • Xiao X K, Wang G Z, Gehrke J. 2011. Differential privacy via wavelet transforms. IEEE Transactions on Knowledge and Data Engineering, 23(8): 1200-1214 [DOI:10.1109/TKDE.2010.247]
  • Xu J, Zhang Z J, Xiao X K, Yang Y, Yu G, Winslett M. 2013. Differentially private histogram publication. The VLDB Journal, 22(6): 797-822 [DOI:10.1007/s00778-013-0309-y]
  • Yuan G Z, Yang Y, Zhang Z J and Hao Z F. 2016. Convex optimization for linear query processing under approximate differential privacy//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). San Francisco, CA, USA: ACM: 2005-2014[DOI: 10.1145/2939672.2939818]
  • Yuan G Z, Zhang Z J, Winslett M, Xiao X K, Yang Y, Hao Z F. 2015. Optimizing batch linear queries under exact and approximate differential privacy. ACM Transactions on Database Systems, 40(2): 1-47 [DOI:10.1145/2699501]
  • Zhang J, Cormode G, Procopiuc C M, Srivastava D and Xiao X K. 2015. Private release of graph statistics using ladder functions//Proceedings of 2015 ACM SIGMOD International Conference on Management of Data. Melbourne, Victoria, Australia: ACM: 731-745[DOI: 10.1145/2723372.2737785]