Overview of artificial intelligence model watermarking

Wu Hanzhou1, Zhang Jie2, Li Yue3, Yin Zhaoxia4, Zhang Xinpeng1,5, Tian Hui3, Li Bin6, Zhang Weiming2, Yu Nenghai2 (1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China; 2. School of Cyber Science and Technology, University of Science and Technology of China, Hefei 230027, China; 3. School of Computer Science and Technology, Huaqiao University, Xiamen 361021, China; 4. School of Communication and Electronic Engineering, East China Normal University, Shanghai 200240, China; 5. School of Computer Science, Fudan University, Shanghai 200438, China; 6. School of Electronic and Information Engineering, Shenzhen University, Shenzhen 518060, China)

Abstract
Deep neural network (DNN)-based artificial intelligence (AI) techniques have achieved great success in domains such as computer vision, pattern analysis, natural language processing, bioinformatics, and games. In particular, technology companies have widely deployed AI models in the cloud to provide smart and personalized services. However, creating a state-of-the-art AI model requires large amounts of high-quality data, powerful computing resources, and expert knowledge of architecture design, while the resulting models are threatened by unauthorized copying, tampering, and redistribution. Protecting AI models against intellectual property infringement is therefore necessary and has attracted increasing research attention. This paper reviews digital watermarking techniques for the intellectual property protection of AI models, referred to as AI model watermarking. The core of AI model watermarking is to imperceptibly embed into the protected model a secret watermark that reveals its ownership. Unlike many multimedia watermarking methods, which treat media data as a static signal, AI model watermarking must embed information into a model trained for a specific task. Conventional multimedia watermarking methods cannot be applied directly, because naively modifying a given AI model may significantly impair its performance on the original task; this motivates watermarking methods designed specifically for AI models. After embedding, the performance of the watermarked model on its original task should not degrade significantly, and the embedded watermark should remain extractable so that ownership can be identified when disputes arise. Depending on whether the watermark extractor needs to know the internal details of the target model, existing methods fall into two categories: white-box and black-box AI model watermarking. In white-box AI model watermarking, the extractor knows the internal details of the target watermarked model and can extract the embedded watermark from the model parameters or structure. In black-box AI model watermarking, the extractor does not know the internal details of the target model but can query it with a set of carefully crafted trigger samples; by checking whether the predictions are consistent with the pre-specified labels of the trigger samples, the extractor can determine the ownership of the target model. A special case of black-box watermarking is box-free AI model watermarking, in which the extractor has no access to the target model at all: it can neither inspect the model's internals nor interact with it, but it can extract the watermark from any sample generated by the model, so ownership is verified from the model's output.
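To make the white-box setting concrete, here is a minimal, self-contained sketch in the spirit of regularizer-based embedding (an illustrative assumption, not any specific published method; PyTorch is assumed, and names such as embed_reg and extract_bits are hypothetical). A binary ownership message is tied to a convolution layer's weights through a secret random projection and later recovered with full access to the parameters:

```python
# A minimal sketch of white-box watermark embedding (illustrative only).
# A binary message is tied to a conv layer's weights through a secret
# random projection matrix X and later recovered by thresholding.
import torch

torch.manual_seed(0)

n_bits = 64                                   # length of the ownership message
message = torch.randint(0, 2, (n_bits,)).float()

conv = torch.nn.Conv2d(64, 64, 3)             # layer hosting the watermark
w_dim = conv.weight[0].numel()                # in_channels * k * k = 576

# Secret key: a fixed random projection shared by embedder and verifier.
X = torch.randn(n_bits, w_dim)

def embed_reg(weight: torch.Tensor) -> torch.Tensor:
    """Regularizer pushing sigmoid(X @ w) toward the message bits."""
    w = weight.mean(dim=0).flatten()          # average over output filters
    return torch.nn.functional.binary_cross_entropy_with_logits(X @ w, message)

def extract_bits(weight: torch.Tensor) -> torch.Tensor:
    """White-box extraction: threshold the projection of the weights."""
    w = weight.mean(dim=0).flatten()
    return (X @ w > 0).float()

# In real training the regularizer would be added to the task loss, e.g.
#   loss = task_loss + lam * embed_reg(conv.weight)
# Here we optimize the regularizer alone to show that embedding converges.
opt = torch.optim.SGD(conv.parameters(), lr=0.1)
for _ in range(200):
    opt.zero_grad()
    embed_reg(conv.weight).backward()
    opt.step()

ber = (extract_bits(conv.weight) != message).float().mean().item()
print(f"bit error rate after embedding: {ber:.3f}")   # should approach 0.0
```

Black-box verification answers the same ownership question without touching the weights: the verifier queries the deployed model with its secret trigger samples and checks whether the returned predictions match their pre-specified labels.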
In addition, fragile AI model watermarking has been investigated recently. Unlike methods that focus on robust ownership verification, fragile model watermarking detects whether the target model has been modified, thereby achieving integrity verification of the target model (a minimal sketch follows below). To review the latest developments and trends, advanced methodologies in AI model watermarking are analyzed as follows: 1) the aims and objectives, basic concepts, evaluation metrics, and technical classification of AI model watermarking are introduced; 2) the current development status of AI model watermarking is summarized and analyzed; 3) the pros and cons of different methods are compared and analyzed; and 4) prospects for the development of AI model watermarking and its potential relevance to AI security are discussed.
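To make the integrity-verification idea concrete, the following minimal sketch (an illustrative construction under stated assumptions, not a specific published scheme; NumPy is assumed) hides a keyed checksum in the least significant mantissa bit of each float32 parameter, so that virtually any later change to the weights breaks the check:

```python
# A minimal sketch of fragile model watermarking for integrity verification:
# a keyed checksum of the "significant" parameter bits is hidden in the
# least significant mantissa bit of each float32 weight. Almost any later
# change to the parameters (fine-tuning, pruning, tampering) breaks it.
import hashlib
import numpy as np

def _lsb_plane(w: np.ndarray):
    bits = w.astype(np.float32).view(np.uint32)
    return bits & ~np.uint32(1), bits & np.uint32(1)   # (upper bits, LSBs)

def keyed_checksum(upper_bits: np.ndarray, key: bytes, n: int) -> np.ndarray:
    """One checksum bit per parameter, derived from a keyed hash."""
    digest = hashlib.sha256(key + upper_bits.tobytes()).digest()
    stream = np.frombuffer(digest * (n // len(digest) + 1), dtype=np.uint8)[:n]
    return (stream & 1).astype(np.uint32)

def embed_fragile(w: np.ndarray, key: bytes) -> np.ndarray:
    upper, _ = _lsb_plane(w)
    mark = keyed_checksum(upper, key, w.size)
    return (upper | mark).view(np.float32)

def verify_integrity(w: np.ndarray, key: bytes) -> bool:
    upper, lsb = _lsb_plane(w)
    return bool(np.array_equal(lsb, keyed_checksum(upper, key, w.size)))

weights = np.random.randn(1000).astype(np.float32)     # stand-in parameters
marked = embed_fragile(weights, key=b"owner-secret")
print(verify_integrity(marked, b"owner-secret"))       # True: untouched
marked[123] += 1e-3                                    # simulate tampering
print(verify_integrity(marked, b"owner-secret"))       # False: modified
```

Since only the lowest mantissa bit of each parameter is touched, embedding perturbs every weight by roughly one part in ten million, which is negligible for the original task, whereas fine-tuning, pruning, or tampering almost surely rewrites those bits and is therefore detected.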
Keywords
