面向人脸图像发布的差分隐私保护

张啸剑; 付聪聪; 孟小峰

doi:10.11834/jig.170647

图像处理和编码 | 浏览量 : 0 下载量: 110 CSCD: 2

PDF
导出
分享
收藏
专辑

面向人脸图像发布的差分隐私保护
Facial image publication with differential privacy
2018年23卷第9期页码：1305-1315
收稿：2017-12-22，

修回：2018-3-22，

纸质出版：2018-09-16
DOI： 10.11834/jig.170647
稿件说明：

移动端阅览

张啸剑, 付聪聪, 孟小峰. 面向人脸图像发布的差分隐私保护[J]. 中国图象图形学报, 2018,23(9):1305-1315. DOI： 10.11834/jig.170647.

Xiaojian Zhang, Congcong Fu, Xiaofeng Meng. Facial image publication with differential privacy[J]. Journal of Image and Graphics, 2018, 23(9): 1305-1315. DOI： 10.11834/jig.170647.

摘要

目的

由于人脸图像蕴含着丰富的个人敏感信息，直接发布出来可能会造成个人的隐私泄露。为了保护人脸图像中的隐私信息，本文提出了一种基于傅里叶变换与差分隐私技术相结合的人脸图像发布方法FIP（facial image publication）。

方法

将人脸图像作为实数域2维矩阵，充分利用离散傅里叶变换技术压缩图像。为了有效均衡由拉普拉斯机制引起的噪音误差以及由傅里叶变换导致的重构误差，引入一种基于指数机制的傅里叶系数选择方法EMK（exponential mechanism-based

$$k$$

coefficients sampling），它能够在不同的系数空间中挑选出合理的傅里叶系数来压缩人脸图像，然后利用拉普拉斯机制对所挑选出的系数添加噪音，进而使整个处理过程满足

$$\varepsilon $$

-差分隐私。此外，为了避免较大的傅里叶系数空间导致指数机制挑选系数不准确问题，基于离散实数傅里叶变换的共轭对称特性，提出了一种增强的指数机制挑选傅里叶系数方法BEMK（boosted exponential mechanism-based

$$k$$

coefficients sampling），该方法不仅进一步压缩离散傅里叶系数空间，而且还能够提高人脸图像发布的精度。

结果

基于4种真实人脸图像数据集采用支持向量机分类与采用主成分分析技术验证方法的正确性。从算法的准确率、召回率，以及F1-Score度量结果显示，提出的基于离散傅里叶变换技术的人脸图像发布方法均优于直接采用拉普拉斯机制的发布方法LAP（Laplace mechanism-based publication）。

结论

实验结果表明，本文方法能够实现满足

$$\varepsilon $$

-差分隐私的敏感人脸图像发布，图像分类验证其具有较高的可用性。特别是BEMK方法具有较好的鲁棒性，是一种有效的隐私人脸图像发布方法。

Abstract

Objective

Facial image publication (FIP) in a direct way may lead to privacy leakage of individuals because facial images are inherently sensitive. To protect the private information in facial images

this paper proposes an efficient publishing algorithm called FIP that is based on Fourier transform combined with differential privacy

which is the state-of-the-art model to address privacy concerns.

Method

First

this algorithm uses the real-valued matrix to model the facial image

in which each cell corresponds to each pixel point of the image. Then

on the basis of the matrix

this algorithm relies on the Fourier transform technique to extract the Fourier coefficients (e.g.

a pre-defined limit

$$k$$

on the coefficients sampled) and then uses the Laplace mechanism to inject noise into each coefficient to ensure differential privacy. Finally

this algorithm uses Fourier inverse transform to reconstruct the noisy facial image. However

in the FIP algorithm

we encounter two sources of errors:1) the Laplace error (LE) due to the Laplace noise injected and 2) the reconstruction error (RE) caused by the lossy compression of Fourier transform. The selection of

$$k$$

is a serious dilemma:for the FIP algorithm to produce the low LE

$$k$$

cannot be large

whereas a small

$$ $$

causes the RE to be extremely large. However

increasing

$$k$$

would cause RE to be small but LE to be extremely large. Furthermore

$$ $$

cannot be directly tuned on the basis of facial images; otherwise

the selection of

$$k$$

itself reveals private information in facial images and violates differential privacy. Therefore

a differentially private

$$k$$

value is vital in balancing the LE and RE in sanitized facial images. To remedy the deficiency of FIP

we present exponential mechanism-based

$$k$$

coefficient sampling (EMK)

$$k$$

coefficient sampling algorithm that adopts exponential mechanism to select the suitable coefficients but eliminates the dependency on a pre-defined

$$k$$

. The core of EMK is to sample

$$k$$

coefficients first by using a portion of the privacy budget in different candidate coefficients set via exponential mechanism and uses the Laplace noise to perturb the

$$k$$

samples by utilizing the remaining budget. On the basis of the sequential composition of differential privacy

the two phases meet

$$\varepsilon $$

-differential privacy. The EMK algorithm

however

does not exploit the correlation over all coefficients

which may generate a large candidate sampling set and an inaccurate

$$k$$

selection. We notice that the discrete real-valued Fourier coefficients are correlated as they are half-redundant. On the basis of this observation

a boosted EMK (BEMK) is proposed to address the problem in EMK. The main idea of BEMK is to use the conjugate symmetry of discrete real Fourier transform to compress the candidate set and adopt the exponential mechanism to select the

$$k$$

coefficients in the compressed candidate set.

Result

On the basis of the SVM classification and principal component analysis technique

comprehensive experiments were conducted over four real facial image datasets ((CMU)

(ORL)

Yale

and YaleB) to evaluate the quality of the facial images generated from the BEMK

EMK

FIP

and LAP algorithms using a variety of metrics

including precision

recall

and F1 score. Our experiments show that the proposed BEMK

EMK

and FIP algorithms outperform LAP in terms of the abovementioned four metrics. BEMK applies to the four datasets and achieves better accuracy than EMK and FIP. For example

on the CMU dataset

we fix the matrix=128×128 and vary the privacy budget

$$\varepsilon $$

(i.e.

0.1

0.5

0.9

and 1.4) to study the accuracy of each algorithm. Tables 1-12 show the results. As expected

the accuracy measures of all algorithms increase when

$$\varepsilon $$

increases. When

$$\varepsilon $$

varies from 0.1 to 1.4

BEMK still achieves a better precision

recall

and F1 score than the other algorithms. Their values are 88%

90%

and 89%

respectively.

Conclusion

We provide both in-depth theoretical analysis and extensive experiments to compare BEMK with EMK

FIP

and LAP. Results show that the proposed algorithms can overcome the privacy leakage of FIP. BEMK significantly improves compared with the other three algorithms. Moreover

BEMK also maintains good robustness and generates high-quality synthetic facial images while still satisfying differential privacy.

关键词

Keywords

references

Newton E M, Sweeney L, Malin B. Preserving privacy by de-identifying face images[J]. IEEE Transactions on Knowledge and Data Engineering, 2005, 17(2):232-243. [DOI:10.1109/TKDE.2005.32]

Gross R, Sweeney L, Cohn J F, et al. Model-based de-identification of facial images[C]//2008 American Medical Informatics Association Annual Symposium. Washington, DC, USA: AMIA, 2008: 38-47.

Gross R, Acquisti A. Information revelation and privacy in online social networks[C]//Proceedings of 2005 ACM Workshop on Privacy in the Electronic Society. Alexandria, VA, USA: ACM, 2005: 71-80. [ DOI:10.1145/1102199.1102214 http://dx.doi.org/10.1145/1102199.1102214 ]

Sadeghi A R, Schneider T, Wehrenberg I. Efficient privacy-preserving face recognition[C]//Proceedings of the 12th International Conference on Information, Security and Cryptology. Seoul, Korea: Springer, 2009: 507. [ DOI:10.1007/978-3-642-14423-3_16 http://dx.doi.org/10.1007/978-3-642-14423-3_16 ]

Xu J, Zhang Z J, Xiao X K, et al. Differentially private histogram publication[J]. The VLDB Journal, 2013, 22(6):797-822. [DOI:10.1007/s00778-013-0309-y]

Qardaji W, Yang W N, Li N H. Differentially private grids for geospatial data[C]//Proceedings of the 29th International Conference on Data Engineering. Brisbane, Australia: IEEE, 2013: 32-33. [ DOI:10.1109/ICDE.2013.6544872 http://dx.doi.org/10.1109/ICDE.2013.6544872 ]

Zhang J, Cormode G, Procopiuc C M, et al. Private release of graph statistics using ladder functions[C]//Proceedings of 2015 ACM SIGMOD International Conference on Management of Data. Melbourne, Victoria, Australia: ACM, 2015: 731-745. [ DOI:10.1145/2723372.2737785 http://dx.doi.org/10.1145/2723372.2737785 ]

Chen Y, Machanavajjhala A, Hay M, et al. PeGaSus: data-adaptive differentially private stream processing[C]//Proceedings of 2017 ACM SIGSAC Conference on Computer and Communications Security. Dallas, TX, USA: ACM, 2017: 1375-1388. [ DOI:10.1145/3133956.3134102 http://dx.doi.org/10.1145/3133956.3134102 ]

Rastogi V, Nath S. Differentially private aggregation of distributed time-series with transformation and encryption[C]//Proceedings of 2010 ACM SIGMOD International Conference on Management of Data. Indianapolis, Indiana, USA: ACM, 2010: 735-746. [ DOI:10.1145/1807167.1807247 http://dx.doi.org/10.1145/1807167.1807247 ]

Su S, Tang P, Cheng X, et al. Differentially private multi-party high-dimensional data publishing[C]//Proceedings of the 32nd International Conference on Data Engineering. Helsinki, Finland: IEEE, 2016: 205-216. [ DOI:10.1109/ICDE.2016.7498241 http://dx.doi.org/10.1109/ICDE.2016.7498241 ]

Xiao X K, Wang G Z, Gehrke J. Differential privacy via wavelet transforms[J]. IEEE Transactions on Knowledge and Data Engineering, 2011, 23(8):1200-1214. [DOI:10.1109/TKDE.2010.247]

Li C, Miklau G, Hay M, et al. The matrix mechanism:optimizing linear counting queries under differential privacy[J]. The VLDB Journal, 2015, 24(6):757-781. [DOI:10.1007/s00778-015-0398-x]

Yuan G Z, Zhang Z J, Winslett M, et al. Optimizing batch linear queries under exact and approximate differential privacy[J]. ACM Transactions on Database Systems, 2015, 40(2):#11. [DOI:10.1145/2699501]

Dwork C, McSherry F, Nissim K, et al. Calibrating noise to sensitivity in private data analysis[C]//Proceedings of the 3rd Theory of Cryptography Conference. New York, USA: Springer, 2006: 363-385. [ DOI:10.1007/11681878_14 http://dx.doi.org/10.1007/11681878_14 ]

Dwork C. Differential privacy[C]//Proceedings of the 33rd International Colloquium on Automata, Languages and Programming. Venice, Italy: Springer, 2006: 1-12. [ DOI:10.1007/11787006_1 http://dx.doi.org/10.1007/11787006_1 ]

Zhang X J, Shao C, Meng X F. Accurate histogram release under differential privacy[J]. Journal of Computer Research and Development, 2016, 53(5):1106-1117.

张啸剑, 邵超, 孟小峰.差分隐私下一种精确直方图发布方法[J].计算机研究与发展, 2016, 53(5):1106-1117. [DOI:10.7544/issn1000-1239.2016.20150304]

Dwork C, McSherry F, Nissim K, et al. Calibrating noise to sensitivity in private data analysis[C]//Proceedings of the 3rd Theory of Cryptography Conference. New York, NY, USA: Springer, 2006: 268-284. [ DOI:10.1007/11681878_14 http://dx.doi.org/10.1007/11681878_14 ]

McSherry F, Talwar K. Mechanism design via differential privacy[C]//The 48th Annual IEEE Symposium on Foundations of Computer Science. Providence, RI, USA: IEEE, 2007: 94-103. [ DOI:10.1109/FOCS.2007.41 http://dx.doi.org/10.1109/FOCS.2007.41 ]