Search Advanced Search
Total result 69
    Select All
    Display Type:|
    • Causal-spatiotemporal-semantics-driven Abstraction Modeling Method for Deep Reinforcement Learning

      2025, 36(8):1-18.DOI: 10.13328/j.cnki.jos.007354

      Keywords:deep reinforcement learning (DRL)abstraction modelingcausal spatiotemporal semanticsintelligent cyber-physical system (ICPS)Markov decision process (MDP)
      Abstract (677)HTML (0)PDF 7.01 M (186)Favorites

      Abstract:With the rapid advancement of intelligent cyber-physical system (ICPS), intelligent technologies are increasingly utilized in components such as perception, decision-making, and control. Among these, deep reinforcement learning (DRL) has gained wide application in ICPS control components due to its effectiveness in managing complex and dynamic environments. However, the openness of the operating environment and the inherent complexity of ICPS necessitate the exploration of highly dynamic state spaces during the learning process. This often results in inefficiencies and poor generalization in decision-making. A common approach to address these issues is to abstract large-scale, fine-grained Markov decision processes (MDPs) into smaller-scale, coarse-grained MDPs, thus reducing computational complexity and enhancing solution efficiency. Nonetheless, existing methods fail to adequately ensure consistency between the spatiotemporal semantics of the original states, the abstracted system space, and the real system space. To address these challenges, this study proposes a causal spatiotemporal semantic-driven abstraction modeling method for deep reinforcement learning. First, causal spatiotemporal semantics are introduced to capture the distribution of value changes across time and space. Based on these semantics, a two-stage semantic abstraction process is applied to the states, constructing an abstract MDP model for the deep reinforcement learning process. Subsequently, abstraction optimization techniques are employed to fine-tune the abstract model, minimizing semantic discrepancies between the abstract states and their corresponding detailed states. Finally, extensive experiments are conducted on scenarios including lane-keeping, adaptive cruise control, and intersection crossing. The proposed model is evaluated and analyzed using the PRISM verifier. The results indicate that the proposed abstraction modeling technique demonstrates superior performance in abstraction expressiveness, accuracy, and semantic equivalence.

    • Verification for Neural Network Based on Error Divide and Conquer

      2024, 35(5):2307-2324.DOI: 10.13328/j.cnki.jos.006967

      Keywords:neural network (NN)model abstractionsymbolic propagationlinear approximationdivide and conquer
      Abstract (445)HTML (976)PDF 10.88 M (1544)Favorites

      Abstract:With the rapid development of neural network technology, neural networks have been widely applied in safety-critical fields such as autonomous driving, intelligent manufacturing, and medical diagnosis. Thus, it is crucial to ensure the trustworthiness of neural networks. However, due to the vulnerability of neural networks, slight perturbation often leads to wrong results. Therefore, it is vital to use formal verification methods to ensure the safety and trustworthiness of neural networks. Current verification methods for neural networks are mainly concerned with the accuracy of the analysis, while apt to ignore operational efficiency. When verifying the safety properties of complex networks, the large-scale state space may lead to problems such as infeasibility or unsolvability. To reduce the state space of neural networks and improve the verification efficiency, this study presents a formal verification method for neural networks based on divide and conquer considering over-approximation errors. The method uses the reachability analysis technique to calculate the upper and lower bounds of nonlinear nodes and uses an improved symbolic linear relaxation method to reduce over-approximation errors during the boundary calculation of nonlinear nodes. The constraints of nodes are refined by calculating the direct and indirect effects of their over-approximation errors. Thereby, the original verification problem is split into a set of sub-problems whose mixed integer linear programming (MILP) formulation has a smaller number of constraints. The method is implemented as a tool named NNVerifier, whose properties are verified and evaluated through experiments on four ReLU-based fully-connected benchmark networks trained on three classic datasets. The experimental results show that the verification efficiency of the NNVerifier is 37.18% higher than that of the existing complete verification methods.

    • Software Change Prediction Based on Hybrid Graph Representation

      2024, 35(8):3824-3842.DOI: 10.13328/j.cnki.jos.006947

      Keywords:software change predictiongraph neural network (GNN)abstract syntax tree (AST)control flow graph (CFG)data flow graph (DFG)
      Abstract (482)HTML (603)PDF 2.85 M (1680)Favorites

      Abstract:Software change prediction, aimed at identifying change-prone modules, can help software managers and developers allocate resources efficiently and reduce maintenance overhead. Extracting effective features from the code plays a vital role in the construction of accurate prediction models. In recent years, researchers have shifted from traditional hand-crafted features to semantic features with powerful representation capabilities for prediction. They extracted semantic features from abstract syntax tree (AST) node sequences to build models. However, existing studies have ignored the structural information in the AST and the rich semantic information in the code. How to extract the semantic features of the code is still a challenging problem. For this reason, the study proposes a change prediction method based on hybrid graph representation. To start with, the model combines AST, control flow graph (CFG), data flow graph (DFG), and other structural information to construct the program graph representation of the code. Then, it uses the graph neural network to learn the semantic features of the program graph and the features obtained to predict change-proneness. The model can integrate various semantic information to represent the code better. The effectiveness of the proposed method is verified by comparing it with the latest change prediction methods on various change datasets.

    • Obfuscation-resilient Android Malware Detection Based on Graph Convolutional Networks

      2023, 34(6):2526-2542.DOI: 10.13328/j.cnki.jos.006848

      Keywords:Android malwareobfuscation-resilientfunction call graphabstract APIgraph convolutional network (GCN)
      Abstract (1599)HTML (3017)PDF 1.97 M (4768)Favorites

      Abstract:Since the release of Android, it has become the most widely used mobile phone operating system in the world due to its advantages such as open source, rich hardware, and diverse application markets. At the same time, the explosive growth of Android devices and Android applications (app for short) has made it a target of 96% of mobile malware. Among current detection methods, the direct extraction of simple program features, ignoring the program semantics is fast but less accurate, and the conversion of semantic information of programs into graph models for analysis improves accuracy but has high runtime overhead and is not very scalable. To address these challenges, the program semantics of an App is distilled into a function call graph and the API call is abstracted to convert the call graph into a simpler graph. Finally, these vectors are fed into a graph convolution network (GCN) model to train a classifier with triplet loss (i.e., SriDroid). After conducting experimental analysis on 20 246 Android apps, it is found that SriDroid can achieve 99.17% malware detection accuracy with sound robustness.

    • Collaborative Verification Method of Uninterpreted Programs

      2023, 34(7):3116-3133.DOI: 10.13328/j.cnki.jos.006860

      Keywords:collaborative verificationuninterpreted programcounterexample-guided abstraction refinementtrace abstractionreuse
      Abstract (709)HTML (1944)PDF 2.82 M (3208)Favorites

      Abstract:The verification of an uninterpreted program is undecidable in general. Recently, a decidable fragment (called coherent) of uninterpreted programs is discovered and the verification of coherent uninterpreted programs is PSPACE complete. Based on the results of coherent uninterpreted programs, a trace abstraction-based verification method in CEGAR (counterexample-guided abstraction refinement) style is proposed for general uninterpreted programs, and is very effective. Although that, the verification of uninterpreted programs sometimes needs many refinements. Especially when verify multiple programs with this method, the verifications of different programs are independent of each other and has high complexity. However, it is observed that those abstract models of infeasible counter-example traces are reusable and can benefit from each other’s verification when the programs to be verified are similar. In this work, a collaborative verification framework is proposed that accumulates the abstract models of infeasible traces during the programs’ verification. When a new program is to be verified, the program abstraction is refined with the accumulated abstract model first to wipe off those infeasible traces to improve the verification efficiency. Besides, an optimized congruence-based trace abstraction method is also proposed that compacting the states during the verification to enlarge the scope of the abstractions of the infeasible traces. The collaborative verification framework and the optimized trace abstraction method have been implemented, achieving on average 2.70x and 1.49x speedups on two representative benchmarks.

    • Survey on FPGA-based High-performance Programmable Data Plane

      2023, 34(11):5330-5354.DOI: 10.13328/j.cnki.jos.006669

      Keywords:programmable data plane (PDP)field-programmable gate array (FPGA)programming abstractionhigh-level synthesis (HLS)
      Abstract (901)HTML (3073)PDF 5.90 M (4008)Favorites

      Abstract:The programmable data plane (PDP), allowing offloading and accelerating network applications, creates revolutionary development opportunities for such applications. Also, it promotes the innovation and evolution of the network by supporting the rapid implementation and deployment of new protocols and services. It has thus been a research hotspot in the field of the network in recent years. With its general computing architecture, rich on-chip resources and extended interfaces, field-programmable gate array (FPGA) provides a variety of implementations of PDP for a wider range of application scenarios. It also offers the possibility to explore more general PDP abstraction. Therefore, FPGA-based PDP (F-PDP) has been widely concerned by the academic and industrial communities. In this study, F-PDP abstraction is described by category. Then, the research progress of key technologies for building network applications with F-PDP is outlined, and programmable network devices based on F-PDP are presented. After that, the application research based on F-PDP in recent years is reviewed in detail from three aspects: improving network performance, building a network measurement framework, and deploying network security applications. Finally, the possible future research trends of F-PDP are discussed.

    • Code Smell Detection Approach Based on Pre-training Model and Multi-level Information

      2022, 33(5):1551-1568.DOI: 10.13328/j.cnki.jos.006548

      Keywords:code smelldeep learningpre-trained modelabstract syntax treemulti-level information
      Abstract (1803)HTML (2823)PDF 6.41 M (5361)Favorites

      Abstract:Most of the existing code smell detection approaches rely on code structure information and heuristic rules, while pay little attention to the semantic information embedded in different levels of code, and the accuracy of code smell detection approaches is not high. To solve this problem, this study proposes a novel approach DeepSmell based on a pre-trained model and multi-level metrics. Firstly, the static analysis tool is used to extract code smell instances and multi-level code metric information in the source program and mark these instances. Secondly, the level information that relate to code smells in the source code are parsed and obtained through the abstract syntax tree. The textual information composed of the level information is combined with code metric information to generate the data set. Finally, text information is converted into word vectors using the BERT pre-training model. The GRU-LSTM model is applied to obtain the potential semantic relationship among the identifiers, and the CNN model is combined with attention mechanism to code smell detection. The experiment tested four kinds of code smells including feature envy, long method, data class, and god class on 24 open source programs such as JUnit, Xalan, and SPECjbb2005. The results show that DeepSmell improves the average recall and F1 by 9.3% and 10.44% respectively compared with existing detection methods, and maintains a high level of precision at the same time.

    • Compiler Fuzzing Test Case Generation with Feed-forward Neural Network

      2022, 33(6):1996-2011.DOI: 10.13328/j.cnki.jos.006565

      Keywords:software defectcompiler fuzzingdeep learningfeed-forward neural networkabstract syntax network
      Abstract (1945)HTML (3008)PDF 1.69 M (4633)Favorites

      Abstract:Compiler fuzzing is one of the commonly used techniques to test the functionality and safety of compilers. The fuzzer produces grammatically correct test cases to test the deep parts of the compiler. Recently, recurrent neural networks-based deep learning methods have been introduced to the test case generation process. Aiming at the problems of insufficient grammatical accuracy and low generation efficiency when generating test cases, a method for generating compiler fuzzing test cases is proposed based on feed-forward neural networks, and the prototype tool FAIR is designed and implemented. Different from the method based on token sequence learning, FAIR extracts code fragments from the abstract syntax tree, and uses a self-attention-based feed-forward neural network to capture the grammatical associations between code fragments. After learning a generative model of the programming language, fair automatically produce diverse test cases. Experimental results show that FAIR is superior to its competitors in terms of grammatical accuracy and generation efficiency of generating test cases. The proposed method has significantly improved the ability to detect compiler software defects, and has successfully detected 20 software defects in GCC and LLVM. In addition, the method has soundportability. The simple ported FAIR-JS has detected 2 defects in the JavaScript engine.

    • Cross-project Defect Prediction Method Using Adversarial Learning

      2022, 33(6):2097-2112.DOI: 10.13328/j.cnki.jos.006571

      Keywords:cross-project defect predictiongenerative adversarial network (GAN)bag-of-words modelabstract syntax tree (AST)
      Abstract (2032)HTML (3032)PDF 1.57 M (4787)Favorites

      Abstract:Cross-project defect prediction (CPDP) has become an important research direction in data mining of software engineering, which uses the defective codes of other projects to build prediction models and solves the problem of insufficient data in the process of model construction. Nevertheless, there is difference in data distribution between the code files of source and target projects, which leads to poor cross-project prediction results. Based on the adversarial learning idea of generative adversarial network (GAN), under the action of discriminator, the distribution of target project features can be changed to make it close to the distribution of source project features, so as to improve the performance of cross-project defect prediction. Specifically, the process of the proposed abstract continuous GAN (AC-GAN) method consists of two stages:Data processing and model construction. First, the source and target project codes are converted into the form of abstract syntax trees (ASTs), and then the ASTs are traversed in a depth-first manner to derive the token sequences. The continuous bag-of-words model (CBOW) is used to generate word vectors, and the token sequences are transformed into numeric vectors based on the word vector table. Second, the processed numeric vectors are fed into a GAN structure-based model for feature extraction and data migration. Finally, a binary classifier is used to determine whether the target project code files are defective or not. The AC-GAN method conducted comparison experiments on 15 sets of source-target project pairs, and the experimental results demonstrate the effectiveness of this method.

    • Multi-path Back-propagation Method for Neural Network Verification

      2022, 33(7):2464-2481.DOI: 10.13328/j.cnki.jos.006585

      Keywords:neural network verificationsymbolic propagationabstract interpretationmulti-path back-propagation
      Abstract (1371)HTML (2876)PDF 2.00 M (39278)Favorites

      Abstract:Symbolic propagation methods based on linear abstraction play a significant role in neural network verification. This study proposes the notion of multi-path back-propagation for these methods. Existing methods are viewed as using only a single back-propagation path to calculate the upper and lower bounds of each node in a given neural network, being specific instances of the proposed notion. Leveraging multiple back-propagation paths effectively improves the accuracy of this kind of method. For evaluation, the proposed method is quantitatively compared using multiple back-propagation paths with the state-of-the-art tool DeepPoly on benchmarks ACAS Xu, MNIST, and CIFAR10. The experiment results show that the proposed method achieves significant accuracy improvement while introducing only a low extra time cost. In addition, the multi-path back-propagation method is compared with the Optimized LiRPA based on global optimization, on the dataset MNIST. The results show that the proposed method still has an accuracy advantage.

    Prev123456
    Page 7 Result 69 Jump toPageGO

You are the first2035328Visitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063