大小模型端云协同进化技术进展
Advances in edge-cloud collaboration and evolution for large-small models
2024, Vol. 29, No. 6, pp. 1510-1534
Received: 2024-01-09
Revised: 2024-02-23
Published in print: 2024-06-16
DOI: 10.11834/jig.240011
生成式基座大模型正在引发人工智能领域的重大变革，在自然语言处理、多模态理解与内容合成等任务中展现通用能力。大模型部署于云侧提供通用智能服务，但面临时延大、个性化不足等关键挑战；小模型部署于端侧捕捉个性化场景数据，但存在泛化性不足的难题。大小模型端云协同技术旨在结合大模型通用能力和小模型专用能力，以协同交互方式学习演化，进而赋能下游垂直行业场景。本文以大语言模型和多模态大模型为代表，梳理生成式基座大模型的主流架构、典型预训练技术和适配微调等方法，介绍在大模型背景下模型剪枝、模型量化和知识蒸馏等大模型小型化关键技术的发展历史和研究近况，依据模型间协作目的及协同原理异同，提出大小模型协同训练、协同推理和协同规划的协同进化分类方法，概述端云模型双向蒸馏、模块化设计和生成式智能体等系列代表性新技术、新思路。总体而言，本文从生成式基座大模型、大模型小型化技术和大小模型端云协同方式3个方面探讨大小模型协同进化的国际和国内发展现状，对比优势和差距，并从应用前景、模型架构设计、垂直领域模型融合、个性化和安全可信挑战等层面分析基座赋能发展趋势。
Generative foundation models are driving significant transformations in the field of artificial intelligence, demonstrating general capabilities across diverse research fields, including natural language processing, multimodal content understanding, and multimodal content synthesis. Because these models often consist of billions or even hundreds of billions of parameters, they are typically deployed on the cloud side to provide powerful and general intelligent services. In practice, however, such services face crucial challenges, including high latency induced by communication between the cloud and local devices, and insufficient personalization because servers often cannot access local data under privacy constraints. By contrast, low-complexity lightweight models are deployed on the edge side to capture personalized and dynamic scenario data, but they may suffer from poor generalization. Large and lightweight (or large-small) model collaboration aims to integrate the general intelligence of large foundation models with the personalized intelligence of small lightweight models, empowering downstream vertical domain-specific applications through the interaction and collaboration of both types of models. Large-small model collaboration has recently attracted increasing attention, has become a focus of research and development in academia and industry, and is widely predicted to be an important technological trend. We therefore investigate this area thoroughly, highlighting recent progress and offering potential inspiration for related research. In this study, we first overview representative large language models (LLMs) and large multimodal models, focusing on their mainstream Transformer-based architectures: encoder-only, decoder-only, and encoder-decoder models. We also examine the corresponding pre-training technologies, such as next sentence prediction, sequence-to-sequence modeling, and contrastive learning, together with parameter-efficient fine-tuning methods whose representatives include low-rank adaptation and prompt tuning. We then review the development history and latest advances of model compression techniques in the era of foundation models, including model pruning, model quantization, and knowledge distillation. Based on differences in collaboration purposes and mechanisms, we propose a new classification for the study of large-small model collaboration: collaborative training, collaborative inference, and collaborative planning. Specifically, we summarize recent representative methods, including dual-directional knowledge distillation between large models on the cloud side and small models deployed on the edge side, modular designs that split functional models between the cloud and the edge, and generative agents that collaborate to complete complex tasks in an autonomous and intelligent manner. In collaborative training, a main challenge is handling the heterogeneity of data distributions and model architectures between the cloud and client sides; data privacy may also be a concern, particularly in privacy-sensitive cases. Despite much progress in collaborative inference, automatically partitioning and completing a complicated task in a collective way remains challenging, and the communication costs between computing facilities can be a further concern.
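To make the parameter-efficient fine-tuning idea concrete, below is a minimal sketch of low-rank adaptation in PyTorch. It is an illustrative example rather than the implementation of any surveyed method: a frozen pre-trained linear layer is augmented with a trainable low-rank update, so only a small fraction of the parameters is tuned.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # the pre-trained weights stay frozen
            p.requires_grad = False
        # Only r * (in_features + out_features) parameters are trainable.
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r             # B is zero-initialized, so the model
                                           # matches the base model at the start

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(768, 768), r=8)
out = layer(torch.randn(2, 768))           # shape: (2, 768)
```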
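The core arithmetic of post-training quantization can likewise be illustrated with a simplified symmetric per-tensor int8 scheme; practical methods add calibration data, per-channel scales, and outlier handling on top of this basic idea.

```python
import torch

def quantize_int8(w: torch.Tensor):
    """Symmetric per-tensor quantization: a single scale maps floats to int8."""
    scale = w.abs().max() / 127.0
    q = torch.clamp(torch.round(w / scale), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(768, 768)
q, s = quantize_int8(w)
err = (w - dequantize(q, s)).abs().max()   # rounding error is at most ~s/2
```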
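Knowledge distillation, the third compression technique above, is commonly instantiated with the classic response-based loss in the spirit of Hinton et al. (2015): the student matches the teacher's temperature-softened output distribution in addition to the ground-truth labels. A minimal sketch, assuming PyTorch logit tensors:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 4.0, alpha: float = 0.5):
    # KL divergence between temperature-softened distributions,
    # rescaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```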
Collaborative planning is a new paradigm that has gained attention with the increasing study of and promising progress in LLM-centric agents (LLM agents). This paradigm often involves multiple LLM agents that compete or cooperate to complete a challenging task. It typically leverages emergent capabilities of LLMs, such as in-context learning and chain-of-thought reasoning, to automatically divide a complicated task into several subtasks; by completing and assembling the subtasks, the global task is conducted collaboratively. This scheme finds diverse applications, such as game development and the simulation of human societies. However, it may suffer from drawbacks inherent in LLMs, including hallucination and adversarial vulnerability, so more robust and reliable collaborative planning schemes remain to be investigated. In summary, this work surveys large-small model collaboration techniques from the perspectives of generative foundation models, model compression, and heterogeneous model collaboration via LLM agents. It also compares the advantages and disadvantages of international and domestic technology developments in this research realm. We conclude that, although the gaps between domestic and advanced international studies in this area are narrowing, particularly for newly emerging LLM agents, original and major breakthroughs may still be lacking. Notable advantages of domestic progress lie in industrial applications, owing to rich industrial data resources; the development of domain-specific LLMs is therefore advanced. In addition, this study envisions the applications of large-small model collaboration and discusses key challenges and promising directions. 1) The design of efficient model architectures, including new architectures that achieve low-complexity inference while retaining long-sequence modeling abilities comparable to Transformers, and further improvements to the scalability of mixture-of-experts-based architectures. 2) Current model compression methods are mainly designed for vision models; developing techniques specifically for LLMs and large multimodal models is thus important for preserving their emergent abilities during compression. 3) Existing personalization methods mainly focus on discriminative models, and due attention needs to be paid to efficient personalization of generative foundation models. 4) Generative intelligence often suffers from fraudulent content (e.g., generated fake imagery, deepfake videos, and fake news) and various attacks (e.g., adversarial attacks, jailbreaking attacks, and Byzantine attacks), raising security and trustworthiness issues in practical applications. This study therefore advocates a deeper investigation of these emerging security threats and the development of effective defenses against them, so that large-small model collaboration can empower vertical domains more safely.
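The decompose-solve-assemble loop behind collaborative planning can be sketched as follows. Here `call_llm` is a hypothetical placeholder for any chat-completion backend, and the roles and prompts are purely illustrative:

```python
def call_llm(role: str, prompt: str) -> str:
    # Hypothetical stand-in for an actual LLM API call; it returns a canned
    # string so the control flow can be run end to end.
    return f"[{role} reply to: {prompt.splitlines()[0]}]"

def collaborative_planning(task: str, n_workers: int = 3) -> str:
    # 1) A planner agent divides the global task into subtasks
    #    (a real planner would emit one subtask per line).
    plan = call_llm("planner", f"Divide into {n_workers} subtasks:\n{task}")
    subtasks = [s for s in plan.splitlines() if s.strip()][:n_workers]
    # 2) Worker agents solve the subtasks independently.
    results = [call_llm("worker", f"Solve this subtask:\n{s}") for s in subtasks]
    # 3) An assembler agent combines the partial results into a global answer.
    return call_llm("assembler",
                    f"Task: {task}\nCombine these partial results:\n"
                    + "\n".join(results))

print(collaborative_planning("Design and test a sorting function"))
```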