• Online First

    • Diffusion-model-guided Root Cause Analysis

      Online: September 28, 2025 DOI: 10.13328/j.cnki.jos.007473

      Abstract:Root cause analysis refers to identifying the underlying factors that lead to abnormal failures in complex systems. Causal-based backward reasoning methods, founded on structural causal models, are among the most effective approaches for implementing root cause analysis. Most current causality-driven root cause analysis methods require the causal structure to be discovered from data beforehand, making the effectiveness of the analysis heavily dependent on the success of this causal discovery task. Recently, score function-based intervention identification has gained significant attention. By comparing the variance of score function derivatives before and after interventions, this approach detects the set of intervened variables, showing potential to overcome the constraints of causal discovery in root cause analysis. However, mainstream score function-based intervention identification is often limited by the score function estimation step. The analytical solutions used in existing methods struggle to effectively model the real distribution of high-dimensional complex data. In light of recent advances in data generation, this study proposes a diffusion model-guided root cause analysis strategy. Specifically, the proposed method first estimates the score functions corresponding to data distributions before and after the anomaly using diffusion models. It then identifies the set of root cause variables by observing the variance of the first-order derivatives of the overall score function after weighted fusion. Furthermore, to reduce the computational overhead introduced by the pruning operation, an acceleration strategy is proposed to estimate the score function from the initially trained diffusion model, avoiding the cost of retraining the diffusion model after each pruning operation. Experimental results on simulated and real-world datasets demonstrate that the proposed method accurately identifies the set of root cause variables. In addition, ablation studies show that the guidance provided by the diffusion model is critical to the improved performance.

    • Failure Reproducing Test Case Generation Method Based on Large Language Model

      Online: September 28, 2025 DOI: 10.13328/j.cnki.jos.007474

      Abstract:GitHub is one of the most popular open-source project management platforms. To support team collaboration, GitHub introduced an issue tracking function that lets project users submit and track problems or new feature requests. When resolving issues, contributors of open-source projects typically need to execute failure reproducing test cases to reproduce the problems mentioned in the issue and verify whether the issue has been resolved. However, empirical research conducted on the SWE-bench Lite dataset reveals that nearly 90% of issues are submitted without failure reproducing test cases, so contributors must write these test cases themselves when resolving the issues, which adds to their workload. Existing failure reproducing test case generation methods usually rely on stack trace information, but GitHub issues do not explicitly require such information. Therefore, this study proposes a failure reproducing test case generation method based on a large language model, aimed at automatically generating failure reproducing test cases for GitHub issues, assisting issue contributors in reproducing, understanding, and verifying issues, and improving the efficiency of issue resolution. This method first retrieves diverse code context information related to the issue, including error root functions, import statements, and test case examples, then constructs precise prompts to guide the large language model in generating effective failure reproducing test cases. This study conducts comparative and ablation experiments to verify the effectiveness of this method in generating failure reproducing test cases for GitHub issues.

    • Efficient Privacy-preserving Inference Based on Secret Sharing for Convolutional Neural Network

      Online: September 28, 2025 DOI: 10.13328/j.cnki.jos.007475

      Abstract:In privacy-preserving inference using convolutional neural network (CNN) models, previous research has employed methods such as homomorphic encryption and secure multi-party computation to protect client data privacy. However, these methods typically suffer from excessive prediction time overhead. To address this issue, an efficient privacy-preserving CNN prediction scheme is proposed. This scheme exploits the different computational characteristics of the linear and non-linear layers in CNNs and designs a matrix decomposition computation protocol and a parameterized quadratic polynomial approximation for the ReLU activation function. This enables efficient and secure computation of both the linear and non-linear layers, while mitigating the prediction accuracy loss caused by the approximations. The computations in both the linear and non-linear layers can be performed using lightweight cryptographic primitives, such as secret sharing. Theoretical analysis and experimental results show that, while ensuring security, the proposed scheme improves prediction speed by a factor of 2 to 15, with only about a 2% loss in prediction accuracy.
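      The abstract above gives no protocol details, so the following is only a minimal sketch of the general idea: evaluating a quadratic polynomial a*x^2 + b*x + c on additively secret-shared data. It assumes two parties, a trusted dealer that supplies Beaver multiplication triples, and small integer coefficients. A real ReLU approximation would use fractional coefficients with fixed-point encoding. The function names (`share`, `mul`, `quadratic_activation`) are hypothetical and are not taken from the paper.

```python
import secrets

P = 2**61 - 1  # prime modulus of the illustrative field (assumption)

def share(x):
    """Split integer x into two additive shares modulo P."""
    r = secrets.randbelow(P)
    return [r, (x - r) % P]

def reconstruct(shares):
    """Recombine shares and decode the result as a signed integer."""
    v = sum(shares) % P
    return v if v <= P // 2 else v - P

def beaver_triple():
    """Trusted-dealer stand-in: shares of random a, b and of c = a*b."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)

def mul(x_sh, y_sh):
    """Multiply two shared values using one Beaver triple."""
    a_sh, b_sh, c_sh = beaver_triple()
    # In a real protocol each party publishes x_i - a_i and y_i - b_i;
    # here the opened values d and e are computed directly for brevity.
    d = (sum(x_sh) - sum(a_sh)) % P
    e = (sum(y_sh) - sum(b_sh)) % P
    z = [(c_sh[i] + d * b_sh[i] + e * a_sh[i]) % P for i in range(2)]
    z[0] = (z[0] + d * e) % P  # the public term is added by one party only
    return z

def quadratic_activation(x_sh, a, b, c):
    """Return shares of a*x^2 + b*x + c for public coefficients a, b, c."""
    x2_sh = mul(x_sh, x_sh)
    y = [(a * x2_sh[i] + b * x_sh[i]) % P for i in range(2)]
    y[0] = (y[0] + c) % P  # the public constant is added by one party only
    return y

if __name__ == "__main__":
    out = quadratic_activation(share(-3), a=1, b=2, c=5)
    print(reconstruct(out))
```

      Only the multiplication requires interaction between the parties. Scaling by public coefficients and adding shares are local operations, which is why a low-degree polynomial is far cheaper to evaluate on shares than an exact comparison-based ReLU.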

    • Vulnerability Scanner Enhancement Framework Based on JavaScript Code Analysis

      Online: September 28, 2025 DOI: 10.13328/j.cnki.jos.007476

      Abstract:The black-box vulnerability scanner is an essential tool for Web application vulnerability detection, capable of identifying potential security threats effectively before a Web application is launched, thus enhancing the overall security of the application. However, most current black-box scanners primarily collect the attack surface through user operation simulation and regular expression matching. The simulation of user operations is vulnerable to interception by input validation mechanisms and struggles with handling complex event operations, while regular expression matching is ineffective in processing dynamic content. As a result, the scanner cannot effectively address hidden attack surfaces within JavaScript code or dynamically generated attack surfaces, leading to suboptimal vulnerability detection in some Web applications. To resolve these issues, this study proposes a JavaScript Exposure Scanner (JSEScan), a vulnerability scanner enhancement framework based on JavaScript code analysis. The framework integrates static and dynamic code analysis techniques, bypassing form validation and event-triggering restrictions. By extracting attack surface features from JavaScript code, JSEScan identifies attack surfaces and synchronizes them across multiple scanners, enhancing their vulnerability detection capabilities. The experimental results demonstrate that JSEScan increases coverage by 81.02% to 242.15% compared to using a single scanner and uncovers an additional 239 security vulnerabilities when compared to multiple scanners working concurrently, showing superior attack surface collection and vulnerability detection capabilities.

    • Survey on Vulnerability Detection Techniques for Smart Contract and DeFi Protocol

      Online: September 24, 2025 DOI: 10.13328/j.cnki.jos.007413

      Abstract:As core programmable components of blockchain, smart contracts are responsible for asset management and the execution of complex business logic, forming the foundation of decentralized finance (DeFi) protocols. However, with the rapid advancement of blockchain technology, security issues related to smart contracts and DeFi protocols have become increasingly prominent, attracting numerous attackers seeking to exploit vulnerabilities for illicit gains. In recent years, several major security incidents involving smart contracts and DeFi protocols have highlighted the importance of vulnerability detection research, making it a critical area for security defense. This study systematically reviews existing literature and proposes a comprehensive framework for research on vulnerability detection in smart contracts and DeFi protocols. Specifically, vulnerabilities and detection techniques are categorized and analyzed for both domains. For smart contracts, the study focuses on the application of large language models (LLM) as primary detection engines and their integration with traditional methods. For DeFi protocols, it categorizes and details various protocol-level vulnerabilities and their detection methods, analyzing the strengths and limitations of detection strategies before and after attacks, addressing gaps in existing reviews on DeFi vulnerability detection. Finally, this study summarizes the challenges faced by current detection approaches and outlines future research directions, aiming to provide new insights and theoretical support for the security detection of smart contracts and DeFi protocols.

    • Automatic Migration of AI Source Code Between Frameworks Based on Domain Knowledge Graph

      Online: September 24, 2025 DOI: 10.13328/j.cnki.jos.007451

      Abstract:As the foundation of AI, deep learning frameworks play a vital role in driving the rapid progress of AI technologies. However, due to the lack of unified standards, compatibility across different frameworks remains limited. Faithful model transformation enhances interoperability by converting a source model into an equivalent model in the target framework. However, the large number and diversity of deep learning frameworks, combined with the increasing demand for custom frameworks, lead to high conversion costs. To address this issue, this study proposes an automatic AI source code migration method between frameworks based on a domain knowledge graph. The method integrates domain knowledge graphs and abstract syntax trees to systematically manage migration challenges. First, the source code is transformed into a framework-specific abstract syntax tree, from which general dependency information and operator-specific details are extracted. By applying the operator and parameter mappings stored in the domain knowledge graph, the code is migrated to the target framework, generating equivalent target model code while significantly reducing engineering complexity. Compared with existing code migration tools, the proposed method supports mutual migration among widely used deep learning frameworks, such as PyTorch, PaddlePaddle, and MindSpore. The approach has proven to be both mature and reliable, with part of its implementation open-sourced in Baidu’s official migration tool, PaConvert.
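      As a rough illustration of the AST-plus-mapping idea described above, the sketch below rewrites operator names in Python source using the standard `ast` module. The `OPERATOR_MAP` dictionary is a hypothetical stand-in for the domain knowledge graph, and the class and function names are illustrative only. This is not the PaConvert implementation, and it does not handle import statements or parameter renaming.

```python
import ast

# Hypothetical operator mapping standing in for the domain knowledge graph.
OPERATOR_MAP = {
    "torch.nn.Linear": "paddle.nn.Linear",
    "torch.nn.ReLU": "paddle.nn.ReLU",
}

def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None

class OperatorMigrator(ast.NodeTransformer):
    """Replace mapped source-framework operators with their target names."""

    def visit_Attribute(self, node):
        name = dotted_name(node)
        if name in OPERATOR_MAP:
            # Parse the target name as an expression and splice in its AST.
            new_node = ast.parse(OPERATOR_MAP[name], mode="eval").body
            return ast.copy_location(new_node, node)
        return self.generic_visit(node)

def migrate(source):
    """Rewrite mapped operators in the source and return the new code."""
    tree = OperatorMigrator().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9 or later

if __name__ == "__main__":
    print(migrate("layer = torch.nn.Linear(4, 2)"))
```

      Working on the syntax tree instead of raw text means that only genuine attribute references are rewritten, so an occurrence of the same name inside a string literal or comment is left alone.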

    • Customized Review Generation Integrating Multimodal Information

      Online: September 24, 2025 DOI: 10.13328/j.cnki.jos.007465

      Abstract:With the rapid development of merchant review websites, the volume of content on these websites has increased significantly, making it challenging for users to quickly find valuable reviews. This study introduces a new task, “multimodal customized review generation”. The task aims to generate customized reviews for specific users about products they have not yet reviewed, thus providing valuable insights into these products. To achieve this goal, this study explores a multimodal review generation framework based on a pre-trained language model. Specifically, a multimodal pre-trained language model is employed, which takes product images and user preferences as inputs. The visual and textual features are then fused to generate customized reviews. Experimental results demonstrate that the proposed model is effective in generating high-quality customized reviews.

    • Survey on Graph Contrastive Learning Methods

      Online: September 17, 2025 DOI: 10.13328/j.cnki.jos.007417

      Abstract:Contrastive learning is a self-supervised learning technique widely used in various fields such as computer vision and natural language processing. Graph contrastive learning (GCL) refers to methods that apply contrastive learning techniques to graph data. A review is presented on the basic concepts, methods, and applications of graph contrastive learning. First, the background and significance of GCL, as well as its basic concepts on graph data, are introduced. Then, the mainstream GCL methods are elaborated in detail, including methods with different graph data augmentation strategies, methods with different graph neural network (GNN) encoder structures, and methods with different contrastive loss objectives. Finally, three research ideas for GCL are proposed. Research findings demonstrate that graph contrastive learning is an effective approach for addressing various downstream tasks, including node classification and graph classification.

    • Key Class Identification Based on Dynamic Analysis and Gravitational Formula

      Online: September 17, 2025 DOI: 10.13328/j.cnki.jos.007453

      Abstract:Key classes are a crucial starting point for understanding complex software, contributing to the optimization of documentation and the compression of reverse-engineered class diagrams. Although many effective key class identification methods have been proposed, three major limitations remain: 1) software networks, which are graphs representing software elements and their dependencies, often include elements that are never or rarely executed at runtime; 2) networks constructed through dynamic analysis are frequently incomplete, potentially omitting truly key classes; and 3) most existing approaches consider only the effect of direct coupling between classes, while ignoring the influence of indirect (non-contact) coupling and the diversity of degree distribution among neighboring nodes. To address these issues, a key class identification approach is proposed that integrates dynamic analysis with a gravitational formula. First, a class coupling network (CCN) is constructed using static analysis to represent classes and their coupling relationships. Second, a gravitational entropy (GEN) metric is introduced to quantify class importance by jointly considering direct and indirect couplings in the CCN and the degree-distribution diversity of neighboring nodes. Third, classes are ranked in descending order based on their GEN values to obtain a preliminary ranking. Finally, dynamic analysis is performed to capture actual runtime interactions between classes, which are used to refine the preliminary results. A threshold is applied to filter out non-key classes, producing a final set of candidate key classes. Experimental results on eight open-source Java projects demonstrate that the proposed method significantly outperforms eleven baseline approaches when considering no more than the top 15% (or top 25) of nodes. The integration of dynamic analysis notably improves the performance of the proposed method. Moreover, the choice of weighting schemes for coupling types has a minimal impact on performance, and the overall computational efficiency is acceptable.
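      The abstract does not state the GEN formula, so the sketch below shows only a generic gravity-style centrality that may help convey the intuition: each node is scored by summing degree(i) * degree(j) / distance(i, j)^2 over all nodes j within a small radius, so both direct and indirect neighbors contribute. It assumes an unweighted, undirected graph stored as an adjacency dictionary. The entropy term, coupling-type weights, and dynamic-analysis refinement described above are not modeled, and all names are hypothetical.

```python
from collections import deque

def bfs_distances(graph, source, max_dist):
    """Shortest-path hop counts from source, limited to max_dist hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_dist:
            continue  # do not expand beyond the radius
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def gravity_scores(graph, max_dist=2):
    """Score each node by the summed 'attraction' to nodes within max_dist."""
    degree = {node: len(neighbors) for node, neighbors in graph.items()}
    scores = {}
    for node in graph:
        dist = bfs_distances(graph, node, max_dist)
        scores[node] = sum(
            degree[node] * degree[other] / d**2
            for other, d in dist.items()
            if other != node
        )
    return scores

if __name__ == "__main__":
    # Toy coupling network: class A is coupled with B, C, and D.
    ccn = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
    scores = gravity_scores(ccn)
    for name in sorted(scores, key=scores.get, reverse=True):
        print(name, scores[name])
```

      Ranking nodes by this score in descending order corresponds to the preliminary ranking step, under the simplifications stated above.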

    • Code Comment Generation Method Based on Semantic Reranking

      Online: September 17, 2025 DOI: 10.13328/j.cnki.jos.007470

      Abstract:Code comments serve as natural-language descriptions of the source code functionality, helping developers quickly understand the code’s semantics and functionality, thus improving software development and maintenance efficiency. However, writing and maintaining code comments is time-consuming and labor-intensive, often leading to missing, inconsistent, or outdated comments. Therefore, the automatic generation of comments for source code has attracted significant attention. Existing methods typically use information retrieval techniques or deep learning techniques for automatic code comment generation, but both have their limitations. Some research has integrated these two techniques, but such approaches often fail to effectively leverage the advantages of both methods. To address these issues, this study proposes a semantic reranking-based code comment generation method, SRBCS. SRBCS employs a semantic reranking model to rank and select comments generated by various approaches, thus integrating multiple methods and maximizing their respective strengths in the comment generation process. SRBCS is compared with 11 code comment generation approaches on two subject datasets. Experimental results demonstrate that SRBCS effectively integrates different approaches and outperforms existing methods in code comment generation.
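      The rank-and-select step can be sketched as follows. The scoring function here is a simple token-overlap (Jaccard) measure used purely as a placeholder for the learned semantic reranking model, which the abstract does not specify. The function names and the sample candidates are hypothetical.

```python
import re

def tokens(text):
    """Lowercased word pieces, splitting snake_case and camelCase names."""
    return {piece.lower() for piece in re.findall(r"[A-Za-z][a-z]*", text)}

def overlap_score(code, comment):
    """Jaccard similarity between code tokens and comment tokens.

    Placeholder for a learned semantic relevance model (assumption).
    """
    code_tokens, comment_tokens = tokens(code), tokens(comment)
    union = code_tokens | comment_tokens
    if not union:
        return 0.0
    return len(code_tokens & comment_tokens) / len(union)

def rerank(code, candidates):
    """Order candidate comments from most to least relevant to the code."""
    return sorted(candidates, key=lambda c: overlap_score(code, c), reverse=True)

if __name__ == "__main__":
    code = "def read_file_lines(path): return open(path).read().splitlines()"
    # Candidates as they might come from different generators (made up here).
    candidates = [
        "Sorts a list of numbers",
        "Read the lines of a file at the given path",
    ]
    print(rerank(code, candidates)[0])
```

      Because the reranker only scores finished candidates, any number of retrieval-based or neural generators can feed it without being modified.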

Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address: 4# South Fourth Street, Zhong Guan Cun, Beijing 100190, Postal Code: 100190
Phone: 010-62562563 Fax: 010-62562533 Email: jos@iscas.ac.cn