大小模型端云协同进化技术进展
Advances in edge-cloud collaboration and evolution for large-small models
2024年第29卷第6期，页码：1510-1534
纸质出版日期: 2024-06-16
DOI: 10.11834/jig.240011
王永威, 沈弢, 张圣宇, 吴帆, 赵洲, 蔡海滨, 吕承飞, 马利庄, 杨承磊, 吴飞. 2024. 大小模型端云协同进化技术进展. 中国图象图形学报, 29(06):1510-1534
Wang Yongwei, Shen Tao, Zhang Shengyu, Wu Fan, Zhao Zhou, Cai Haibin, Lyu Chengfei, Ma Lizhuang, Yang Chenglei, Wu Fei. 2024. Advances in edge-cloud collaboration and evolution for large-small models. Journal of Image and Graphics, 29(06):1510-1534
生成式基座大模型正在引发人工智能领域的重大变革，在自然语言处理、多模态理解与内容合成等任务中展现出通用能力。大模型部署于云侧提供通用智能服务，但面临时延大、个性化不足等关键挑战；小模型部署于端侧捕捉个性化场景数据，但存在泛化性不足的难题。大小模型端云协同技术旨在结合大模型通用能力和小模型专用能力，以协同交互方式学习演化，进而赋能下游垂直行业场景。本文以大语言模型和多模态大模型为代表，梳理生成式基座大模型的主流架构、典型预训练技术和适配微调等方法，介绍在大模型背景下模型剪枝、模型量化和知识蒸馏等大模型小型化关键技术的发展历史和研究近况，依据模型间协作目的及协同原理异同，提出大小模型协同训练、协同推理和协同规划的协同进化分类方法，概述端云模型双向蒸馏、模块化设计和生成式智能体等系列代表性新技术、新思路。总体而言，本文从生成式基座大模型、大模型小型化技术和大小模型端云协同方式3个方面探讨大小模型协同进化的国际和国内发展现状，对比优势和差距，并从应用前景、模型架构设计、垂直领域模型融合、个性化和安全可信挑战等层面分析基座赋能发展趋势。
Generative foundation models are driving significant transformations in the field of artificial intelligence, demonstrating general-purpose capabilities across diverse research fields, including natural language processing, multimodal content understanding, and multimodal content synthesis. Generative foundation models often consist of billions or even hundreds of billions of parameters. Thus, they are typically deployed on the cloud side to provide powerful and general intelligent services. However, this type of service faces crucial challenges in practice, such as high latency induced by communication between the cloud and local devices, and insufficient personalization because servers often cannot access local data owing to privacy concerns. By contrast, low-complexity lightweight models are deployed on the edge side to capture personalized and dynamic scenario data, but they may suffer from poor generalization. Large and lightweight (or large-small) model collaboration aims to integrate the general intelligence of large foundation models with the personalized intelligence of small lightweight models. This integration empowers downstream vertical domain-specific applications through the interaction and collaboration of both types of models. Large-small model collaboration has recently attracted increasing attention and has become a focus of research and development in academia and industry; it has also been predicted to be an important technological trend. We therefore investigate this area thoroughly, highlighting recent progress and offering potential inspirations for related research. In this study, we first overview representative large language models (LLMs) and large multimodal models. We focus on their mainstream Transformer-based model architectures, including encoder-only, decoder-only, and encoder-decoder models. Corresponding pre-training technologies, such as next sentence prediction, sequence-to-sequence modeling, and contrastive learning, as well as parameter-efficient fine-tuning methods, with representatives including low-rank adaptation and prompt tuning, are also explored. We then review the development history and the latest advances in model compression techniques, including model pruning, model quantization, and knowledge distillation, in the era of foundation models. Based on differences in model collaboration purposes and mechanisms, we propose a new taxonomy for large-small model collaboration, namely, collaborative training, collaborative inference, and collaborative planning. Specifically, we summarize recent representative methods, including bidirectional knowledge distillation between large models on the cloud side and small models deployed on the edge side, modular designs that split functional modules between the cloud and the edge, and generative agents that collaborate to complete complex tasks in an autonomous and intelligent manner. In collaborative training, a main challenge is handling the heterogeneity of data distributions and model architectures between the cloud and client sides; data privacy may also be a concern, particularly in privacy-sensitive cases. Despite much progress in collaborative inference, automatically partitioning and completing a complicated task in a collective way remains challenging. Furthermore, the communication cost between computing facilities can be another concern. Collaborative planning is a new paradigm that has gained attention with the rapid progress of LLM-centric agents (LLM agents). This paradigm often involves multiple LLM agents that compete or cooperate to complete a challenging task. It typically leverages emergent capabilities of LLMs, such as in-context learning and chain-of-thought reasoning, to automatically divide a complicated task into several subtasks; by completing and assembling the subtasks, the global task can be accomplished collaboratively. This scheme finds diverse applications, such as game development and social simulation. However, it may inherit drawbacks of LLMs, including hallucination and adversarial vulnerability, so more robust and reliable collaborative planning schemes remain to be investigated. In summary, this work surveys large-small model collaboration techniques from the perspectives of generative foundation models, model compression, and heterogeneous large-small model collaboration (including LLM agents). It also compares the advantages and disadvantages of international and domestic technology developments in this research realm. We conclude that, although the gaps between domestic and advanced international studies in this area are narrowing, particularly for newly emerging LLM agents, original and major breakthroughs are still lacking. Certain notable advantages of domestic progress are closely related to industrial applications, owing to rich data resources from industry; the development of domain-specific LLMs is therefore relatively advanced. In addition, this study envisions the applications of large-small model collaboration and discusses key challenges and promising directions. 1) Efficient model architecture design: developing new architectures that achieve low-complexity inference while retaining the efficient long-sequence modeling ability of Transformers, and further improving the scalability of mixture-of-experts architectures. 2) Current model compression methods are mainly designed for vision models; developing techniques tailored to LLMs and large multimodal models is important for preserving their emergent abilities during compression. 3) Existing personalization methods mainly focus on discriminative models, and due attention should be paid to efficient personalization of generative foundation models. 4) Generative intelligence often suffers from fraudulent content (e.g., generated fake imagery, deepfake videos, and fake news) and various types of attacks (e.g., adversarial attacks, jailbreaking attacks, and Byzantine attacks), raising security and trustworthiness issues in practical applications. This study therefore advocates deeper investigation of these emerging security threats and the development of effective defenses, so that large-small model collaboration can empower vertical domains more safely.
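To make the cloud-edge knowledge distillation idea mentioned in the abstract more concrete, below is a minimal sketch, assuming a cloud-side teacher and an edge-side student with matched output spaces, of the classic response-based distillation loss (soft-label KL term plus hard-label cross-entropy, following Hinton et al., 2015). It is an illustrative example rather than the authors' implementation; the function name and hyperparameters (temperature T, mixing weight alpha) are assumptions.

```python
# Illustrative sketch of response-based knowledge distillation for an
# edge-side student trained against a cloud-side teacher's softened outputs.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soft-label KD loss plus hard-label cross-entropy (hypothetical helper)."""
    # Softened teacher distribution and student log-distribution at temperature T
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    # KL divergence between softened distributions, rescaled by T^2
    kd_term = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)
    # Standard supervised loss on ground-truth labels
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```

In a bidirectional (cloud-to-edge and edge-to-cloud) setting, the same form of loss can in principle be applied in both directions, with each side treating the other's outputs as the soft targets.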
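Similarly, the collaborative-planning paradigm described in the abstract, where a complicated task is divided into subtasks via in-context prompting and the subtask results are assembled, can be illustrated by the following hypothetical sketch. The `call_llm` callable is a placeholder for any chat-model interface, and the prompts and function names are assumptions for illustration, not the specific agent frameworks surveyed in the paper.

```python
# Hypothetical sketch of LLM-agent collaborative planning:
# a planner divides the task, workers solve subtasks, an aggregator assembles them.
from typing import Callable, List

def collaborative_planning(task: str, call_llm: Callable[[str], str],
                           num_workers: int = 3) -> str:
    # 1) Planner agent: divide the global task into subtasks (one per line).
    plan = call_llm(f"Divide the following task into {num_workers} independent "
                    f"subtasks, one per line:\n{task}")
    subtasks: List[str] = [s.strip() for s in plan.splitlines() if s.strip()]
    # 2) Worker agents: each completes one subtask.
    partial_results = [call_llm(f"Solve this subtask:\n{s}") for s in subtasks]
    # 3) Aggregator agent: assemble partial results into the final answer.
    joined = "\n".join(partial_results)
    return call_llm(f"Combine these partial results into one coherent answer "
                    f"for the task '{task}':\n{joined}")
```

As noted above, such pipelines inherit the hallucination and adversarial-vulnerability issues of the underlying LLMs, so verification or debate among agents is often added in practice.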
生成式大模型；大模型小型化；大小模型协同进化；端云协同进化；生成式智能体；生成式人工智能
generative foundation models; model compression; large-small model collaboration; edge-cloud collaboration; generative agents; generative AI