Safe Reinforcement Learning Algorithm and Its Application in Intelligent Control for CPS

doi:10.13328/j.cnki.jos.006588

Volume 33, Issue 7, 2022, pp. 2538-2561


Clc Number:TP311

Abstract:

Safe controller design for cyber-physical systems (CPS) is an active research topic. Existing safe controller design approaches based on formal methods rely heavily on system models and scale poorly. Intelligent control based on deep reinforcement learning can handle high-dimensional, nonlinear, complex, and uncertain systems and is becoming a promising CPS control technology, but it lacks safety guarantees. This study addresses the safety of reinforcement learning control through a case study of a typical industrial oil pump control system, designing a new safe reinforcement learning algorithm and applying it in an intelligent control scenario. First, the safe reinforcement learning problem of the industrial oil pump is formulated, and a simulation environment of the oil pump is built. Then, by designing the structure and activation function of the output layer, a neural network oil pump controller is constructed that satisfies the linear inequality constraints on the oil pump switching times. Finally, to better balance the safety and optimality control objectives, a new safe reinforcement learning algorithm is designed based on the augmented Lagrange multiplier method. Comparative experiments on the industrial oil pump show that the controller generated by the proposed algorithm surpasses existing algorithms in the same category in both safety and optimality. In further evaluation, the neural network controllers generated in this study pass rigorous formal verification with a probability of 90%. Meanwhile, compared with the theoretically optimal controller, the neural network controllers incur a loss in the optimal objective value as low as 2%. The proposed method is expected to extend to more application scenarios, and the case study scheme is expected to serve as a reference for other researchers in intelligent control and formal verification.
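
The abstract names the augmented Lagrange multiplier method as the basis for balancing safety against optimality. As a rough illustration of that general technique only (not the algorithm proposed in the paper, whose details are not given here), the sketch below applies the textbook augmented Lagrangian update to a toy scalar problem: minimize (x - 2)^2 subject to x - 1 <= 0. The function name `augmented_lagrangian`, its parameters, and the toy objective are all assumptions made for this example.

```python
def augmented_lagrangian(f_grad, g, g_grad, x0, rho=1.0,
                         outer_iters=50, inner_iters=200, lr=0.1):
    """Minimize f(x) subject to g(x) <= 0 for a scalar x.

    Hypothetical helper for illustration; uses the standard augmented
    Lagrangian for one inequality constraint:
        L(x, lam) = f(x) + (max(0, lam + rho*g(x))**2 - lam**2) / (2*rho)
    """
    x, lam = x0, 0.0
    for _ in range(outer_iters):
        # Inner loop: approximately minimize L(., lam) by gradient descent.
        for _ in range(inner_iters):
            shifted = max(0.0, lam + rho * g(x))
            x -= lr * (f_grad(x) + shifted * g_grad(x))
        # Outer step: multiplier update, projected to stay non-negative.
        lam = max(0.0, lam + rho * g(x))
    return x, lam


# Toy problem (an assumption, unrelated to the oil pump model):
# f(x) = (x - 2)^2, g(x) = x - 1.
x_opt, lam_opt = augmented_lagrangian(
    f_grad=lambda x: 2.0 * (x - 2.0),
    g=lambda x: x - 1.0,
    g_grad=lambda x: 1.0,
    x0=0.0,
)
print(x_opt, lam_opt)
```

On this toy problem the unconstrained minimum x = 2 violates the constraint, so the iterates settle on the boundary x = 1, and the multiplier approaches 2, the value required by the stationarity condition 2(x - 2) + lam = 0. In a reinforcement learning setting the inner minimization would presumably be replaced by policy updates and g by an estimated safety cost, but that mapping is an assumption rather than something the abstract specifies.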

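Citation:

Zhao Hengjun, Li Quanzhong, Zeng Xia, Liu Zhiming. Safe Reinforcement Learning Algorithm and Its Application in Intelligent Control for CPS. Journal of Software (软件学报), 2022, 33(7): 2538-2561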

History

Received: September 5, 2021
Revised: October 14, 2021
Online: January 28, 2022
Published: July 6, 2022
