Causal Spatiotemporal Semantic-Driven Deep Reinforcement Learning Abstraction Modeling Method

doi:10.13328/j.cnki.jos.007354

Volume 36, Issue 8, 2025

CLC Number: TP311
Abstract:

With the rapid development of intelligent cyber-physical systems (ICPS), intelligent techniques are increasingly applied in components such as perception, decision-making, and control. Among them, deep reinforcement learning (DRL) is widely used in ICPS control components because it handles complex, dynamic environments efficiently. However, the openness of the operating environment and the complexity of ICPS force DRL to explore a highly dynamic state space during learning, which can lead to inefficient learning and poor generalization in decision-making. A common remedy is to abstract a large-scale, fine-grained Markov decision process (MDP) into a smaller, coarse-grained MDP, which reduces computational complexity and improves solution efficiency. Existing methods, however, do not address how to ensure semantic consistency among the spatiotemporal semantics of the original states, the clustered abstract system space, and the real system space. To address this problem, this paper proposes a causal spatiotemporal semantics-driven abstraction modeling method for DRL. First, causal spatiotemporal semantics, which reflect the distribution of value changes over time and space, are introduced; based on them, a two-stage semantic abstraction is performed on the states to construct an abstract MDP model of the DRL process. Next, abstraction optimization techniques are applied to refine the abstract model, reducing the semantic error between abstract states and their corresponding concrete states. Finally, extensive experiments are conducted on lane-keeping, adaptive cruise control, and intersection-crossing case studies, and the model is evaluated and analyzed with the PRISM model checker. The results show that the proposed abstraction modeling technique performs well in terms of abstraction capability, accuracy, and semantic equivalence.
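To make the idea of abstracting a fine-grained MDP into a coarse-grained one concrete, the following is a minimal sketch of plain state aggregation. It is not the paper's causal spatiotemporal method: the function name `abstract_mdp`, the state mapping `phi`, the uniform weighting of concrete states inside each abstract state, and the toy lane-keeping-style numbers are all assumptions made for illustration.

```python
from collections import defaultdict


def abstract_mdp(transitions, rewards, phi):
    """Aggregate a concrete MDP into an abstract MDP under a state mapping.

    transitions: {(state, action): {next_state: probability}}
    rewards:     {(state, action): float}
    phi:         {concrete_state: abstract_state}  (hypothetical clustering)

    Assumption: concrete states that share an abstract state are weighted
    uniformly when their transitions and rewards are averaged.
    """
    # Group concrete states by the abstract state they map to.
    members = defaultdict(list)
    for s, z in phi.items():
        members[z].append(s)

    abs_transitions, abs_rewards = {}, {}
    for z, states in members.items():
        actions = {a for (s, a) in transitions if s in states}
        for a in actions:
            # Only average over members where the action is defined.
            group = [s for s in states if (s, a) in transitions]
            w = 1.0 / len(group)
            dist = defaultdict(float)
            reward = 0.0
            for s in group:
                reward += w * rewards.get((s, a), 0.0)
                for s_next, p in transitions[(s, a)].items():
                    dist[phi[s_next]] += w * p
            abs_transitions[(z, a)] = dict(dist)
            abs_rewards[(z, a)] = reward
    return abs_transitions, abs_rewards


# Toy example: four concrete states collapsed into two abstract states.
transitions = {
    (0, "steer"): {1: 1.0},
    (1, "steer"): {1: 0.5, 2: 0.5},
    (2, "steer"): {3: 1.0},
    (3, "steer"): {3: 1.0},
}
rewards = {(0, "steer"): 0.0, (1, "steer"): 0.0,
           (2, "steer"): 1.0, (3, "steer"): 1.0}
phi = {0: "off_center", 1: "off_center", 2: "centered", 3: "centered"}

abs_T, abs_R = abstract_mdp(transitions, rewards, phi)
for key in sorted(abs_T):
    print(key, abs_T[key], abs_R[key])
```

In this sketch the quality of the abstract model depends entirely on `phi`. If states with different dynamics or values are merged, the averaged transitions no longer describe any concrete state well, which is the kind of semantic error between abstract and concrete states that the abstract says the refinement step aims to reduce.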

Citation:

田丽丽, 杜德慧, 聂基辉, 陈逸康, 李荥达. Causal Spatiotemporal Semantic-Driven Deep Reinforcement Learning Abstraction Modeling Method. Journal of Software (软件学报), 2025, 36(8): 0


History

Received: August 26, 2024
Revised: October 14, 2024
Online: December 10, 2024

Copyright: Institute of Software, Chinese Academy of Sciences