Survey of Decision-making Risks and Safety in Embodied Artificial Intelligence

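Dong Shiquan (董诗泉); Fang Dongliang (方栋梁); Zheng Yaowen (郑尧文); Wang Yuncheng (王允成); Lyu Shichao (吕世超); Li Zhi (李志); Chen Yongle (陈永乐); Sun Limin (孙利民)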

doi:10.20009/j.cnki.21-1106/TP.2025-0384

Journal of Chinese Computer Systems (小型微型计算机系统), 2026, Vol. 47, Issue 5: 1245-1255.

Computer Networks and Information Security


Abstract

As large language models and vision-language models become deeply embedded in mobile robots and automated devices, embodied intelligence (an AI paradigm that relies on continual interaction with the environment and a closed-loop coupling of perception, cognition and action) has evolved from rule-driven to knowledge-driven approaches. This shift exposes the decision layer, whose semantics are open-ended and whose reasoning process is opaque, to new attack surfaces. Existing surveys emphasize perceptual robustness or ethical governance; a unified framework focused on the decision-making security of embodied systems is still missing. This paper first categorizes decision vulnerabilities into two sources: exogenous threats (physical attacks, network intrusions, adversarial perturbations) and endogenous threats (model hallucination, policy over-fitting, hardware failure), and explains how risk propagates through the perception-planning-execution chain. We then systematically analyze representative attacks (adversarial perturbations, sensor spoofing, backdoor triggers, jailbreak prompts and hallucination amplification), highlighting their cross-modal and cross-temporal manipulation paths and their impact on task reliability. Next, we synthesize defense strategies such as formal safety constraints, reachability verification, multi-modal feedback rejection and risk-sensitive shutdown, evaluating each with respect to real-time performance, resource constraints and task complexity. Finally, in light of practical deployment requirements, we distill three open challenges: semantic-physical alignment, cross-layer coordination and standardized evaluation. We also outline future directions, including end-to-end verifiable frameworks, prior-risk-aware pre-training and natural-language rule specification. Collectively, this work provides a systematic reference for building trustworthy, controllable and deployable embodied intelligent systems.

Key words

embodied artificial intelligence / decision security / large language model / deep learning / end-to-end model

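Cite this article

Dong Shiquan, Fang Dongliang, Zheng Yaowen, Wang Yuncheng, Lyu Shichao, Li Zhi, Chen Yongle, Sun Limin. Survey of Decision-making Risks and Safety in Embodied Artificial Intelligence[J]. Journal of Chinese Computer Systems, 2026, 47(5): 1245-1255. DOI:10.20009/j.cnki.21-1106/TP.2025-0384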

