2024, 35(1):1-18. DOI: 10.13328/j.cnki.jos.006908
Abstract:Quantum computing is theoretically expected to solve many classic hard problems, and the rapid development of quantum computers in recent years is pushing the theory into practice. However, numerous errors in current hardware can cause incorrect computational results, which severely limits the ability of quantum computers to solve practical problems. Quantum computing system software lies between applications and hardware, and tapping its full potential to mitigate hardware errors is crucial to realizing practical quantum computing in the near future. As a result, many research works on quantum computing system software have emerged recently. This study classifies them into three categories: compilers, runtime systems, and debuggers. Through an in-depth analysis of these works, the study sorts out the research status of quantum computing system software, reveals the important roles it plays in mitigating hardware errors, and looks forward to future research directions.
ZHAN Qi , PAN Sheng-Yi , HU Xing , BAO Ling-Feng , XIA Xin
2024, 35(1):19-37. DOI: 10.13328/j.cnki.jos.006935
Abstract:As the scale of modern software expands, software vulnerabilities pose a great threat to the security and reliability of computer systems and software, causing huge damage to people’s production and life. In recent years, as open source software (OSS) is widely used, the vulnerability issues of OSS have received much attention. Vulnerability awareness techniques can effectively help OSS users identify vulnerabilities at an early stage for timely defense. Different from vulnerability detection techniques for traditional software, the transparency and cooperativity of OSS vulnerabilities bring great challenges to vulnerability awareness. Therefore, various techniques have been proposed by scholars and developers to perceive potential vulnerabilities and risks in OSS from the code and the open source community, so as to find OSS vulnerabilities as early as possible and reduce the losses they cause. To boost the development of OSS vulnerability awareness techniques, this study conducts a systematic literature review of existing research. It selects 45 high-quality papers on open source vulnerability awareness techniques, covering code-based, open source community discussion-based, and patch-based vulnerability awareness techniques, and systematically summarizes their results. In particular, based on the most recent publications, this study proposes for the first time a categorization of techniques based on the OSS vulnerability life cycle, which supplements and improves the existing taxonomy of vulnerability awareness techniques. Finally, the study discusses the challenges in the field and predicts future research directions.
DONG Wei-Liang , LIU Zhe , LIU Kui , LI Li , GE Chun-Peng , HUANG Zhi-Qiu
2024, 35(1):38-62. DOI: 10.13328/j.cnki.jos.006810
Abstract:As trusted decentralized applications, smart contracts have attracted widespread attention, whereas their security vulnerabilities threaten their reliability. To this end, researchers have employed various advanced technologies (such as fuzz testing, machine learning, and formal verification) to study vulnerability detection and have achieved sound results. This study collects 84 related papers published by July 2021 to systematically sort out and analyze existing vulnerability detection technologies for smart contracts. First of all, vulnerability detection technologies are categorized according to their core methodologies and analyzed in terms of implementation methods, vulnerability categories, and experimental data. Additionally, the differences between domestic and international research in these aspects are compared. Finally, after summarizing the existing technologies, the study discusses the challenges of vulnerability detection technologies and potential research directions.
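As a minimal illustration of the pattern-matching end of the detection spectrum surveyed above, the sketch below flags the classic `tx.origin` authorization vulnerability in Solidity source. The contract snippet and regular expression are purely illustrative and far simpler than the fuzzing, machine learning, and formal verification approaches the survey covers.

```python
import re

def find_tx_origin_auth(source):
    """Flag lines that use tx.origin for authorization checks -- a classic
    Solidity vulnerability pattern: an attacker contract can relay calls so
    that tx.origin still refers to the victim account."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), 1):
        if re.search(r"tx\.origin\s*==|==\s*tx\.origin|require\s*\(\s*tx\.origin", line):
            hits.append((lineno, line.strip()))
    return hits

contract = """\
function withdraw() public {
    require(tx.origin == owner);  // vulnerable: should use msg.sender
    payable(msg.sender).transfer(balance);
}"""
print(find_tx_origin_auth(contract))  # flags line 2
```

Real detectors in the surveyed work operate on bytecode or intermediate representations rather than raw source text; this merely shows the rule-based baseline they improve upon.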
ZHOU Shu-Lin , LI Shan-Shan , DONG Wei , WANG Ji , LIAO Xiang-Ke
2024, 35(1):63-86. DOI: 10.13328/j.cnki.jos.006835
Abstract:Runtime configuration brings flexibility and customizability to users in the utilization of software systems. However, its enormous scale and complex mechanisms also pose significant challenges. A large number of scholars and research institutions have probed into runtime configuration to improve the availability and adaptability of software systems in complex environments. This study develops an analytical framework of runtime configuration to provide a systematic overview of state-of-the-art research from three different stages, namely configuration analysis and comprehension, configuration defect detection and misconfiguration diagnosis, and configuration utilization. The study also summarizes the limitations and challenges faced by current research and outlines the research trend of runtime configuration, which is of guiding significance for future work.
DU Xue-Ying , LIU Ming-Wei , SHEN Li-Wei , PENG Xin
2024, 35(1):87-117. DOI: 10.13328/j.cnki.jos.006902
Abstract:As an important cornerstone of artificial intelligence, knowledge graphs can extract and represent a priori knowledge from massive data on the Internet, which greatly alleviates the bottleneck of poor interpretability in the cognitive decisions of intelligent systems and plays a key role in the construction and application of intelligent systems. As the application of knowledge graph technology continues to deepen, knowledge graph completion, which aims to address the incompleteness of graphs, becomes increasingly urgent. Link prediction is the task of predicting the missing entities and relations in the knowledge graph, which is indispensable in the construction and completion of the knowledge graph. Fully exploiting the hidden relations in the knowledge graph and computing over massive entities and relations require converting symbolic representations of information into numerical form, i.e., knowledge graph representation learning. Hence, link prediction-oriented knowledge graph representation learning has become a popular research topic in the field of knowledge graphs. This study systematically introduces the latest research progress of link prediction-oriented knowledge graph representation learning methods, starting from the basic concepts of link prediction and representation learning. Specifically, the research progress is discussed in detail in terms of knowledge representation forms and algorithmic modeling methods. The development of knowledge representation forms is used as a clue to introduce the mathematical modeling of link prediction tasks under the knowledge representation forms of binary relations, multi-relations, and hyper-relations. On the basis of representation learning modeling, the existing methods are classified into four types of models: translation distance models, tensor decomposition models, traditional deep learning models, and graph neural network models.
The implementation methods of each type are described in detail together with representative models for solving link prediction tasks with different relational metrics. The common datasets and criteria for link prediction are then introduced, and on this basis, the link prediction effects of the four types of knowledge representation learning models under the knowledge representation forms of binary relations, multi-relations, and hyper-relations are presented in a comparative analysis. Finally, the future development trends are given in terms of model optimization, knowledge representation forms, and problem scope.
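To make the translation distance family concrete, a TransE-style scoring function (a representative model in that family) can be sketched as follows; the two-dimensional embeddings below are hand-picked toy values, not learned ones.

```python
import numpy as np

def transe_score(h, r, t, norm=1):
    """TransE plausibility: a smaller ||h + r - t|| means a more plausible triple."""
    return np.linalg.norm(h + r - t, ord=norm)

def rank_tails(h, r, entity_embs, true_tail_idx):
    """Rank all candidate tail entities for (h, r, ?) by TransE score."""
    scores = np.array([transe_score(h, r, e) for e in entity_embs])
    order = np.argsort(scores)  # best (lowest score) first
    return int(np.where(order == true_tail_idx)[0][0]) + 1  # 1-based rank

# Toy embeddings chosen so that h + r lands exactly on entity 2.
entities = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
relation = np.array([1.0, 1.0])
rank = rank_tails(entities[0], relation, entities, true_tail_idx=2)
print(rank)  # -> 1 (entity 2 is the exact translation h + r)
```

Ranking metrics such as mean rank and Hits@k, used in the comparative analysis above, are computed from exactly this kind of per-triple ranking.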
ZHANG Qi-Xun , WU Yi-Fan , YANG Yong , JIA Tong , LI Ying , WU Zhong-Hai
2024, 35(1):118-135. DOI: 10.13328/j.cnki.jos.006827
Abstract:Microservice architectures have been widely deployed and applied, as they can greatly improve the efficiency of software system development, reduce the cost of system update and maintenance, and enhance the extendibility of software systems. However, microservices are characterized by frequent changes and heterogeneous fusion, which result in frequent faults, fast fault propagation, and wide impact. Meanwhile, complex call dependency or logical dependency between microservices makes it difficult to locate and diagnose faults timely and accurately, which poses a challenge to the intelligent operation and maintenance of microservice architecture systems. Service dependency discovery technology identifies and deduces the call dependency or logical dependency between services from runtime data and constructs a service dependency graph, which helps to discover and locate faults and diagnose their causes timely and accurately during system running and supports intelligent operation and maintenance tasks such as resource scheduling and change management. This study first analyzes the problem of service dependency discovery in microservice systems and then summarizes the state of the art of service dependency discovery from the perspective of three types of runtime data, namely monitoring data, system log data, and trace data. Then, the study discusses the application of service dependency discovery technology to intelligent operation and maintenance, covering fault cause localization, resource scheduling, and change management based on the service dependency graph. Finally, the study discusses how service dependency discovery technology can accurately discover call dependency or logical dependency and use service dependency graphs for change management, and it predicts future research directions.
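The core of trace-based dependency discovery described above can be sketched in a few lines: given spans with parent links, service call edges follow from joining each span to its parent span. The span schema and service names below are hypothetical.

```python
def dependency_edges(spans):
    """Derive service-to-service call edges from distributed-trace spans.
    Each span is (trace_id, span_id, parent_span_id, service); an edge
    caller -> callee exists when a span's parent belongs to another service."""
    service_of = {(t, s): svc for t, s, _, svc in spans}
    edges = set()
    for t, _, parent, svc in spans:
        caller = service_of.get((t, parent))
        if caller is not None and caller != svc:
            edges.add((caller, svc))
    return edges

spans = [
    ("t1", "s1", None, "frontend"),
    ("t1", "s2", "s1", "cart"),
    ("t1", "s3", "s2", "db"),
]
print(sorted(dependency_edges(spans)))  # [('cart', 'db'), ('frontend', 'cart')]
```

Discovery from monitoring or log data, also surveyed above, must instead infer such edges statistically, since explicit parent links are only present in trace data.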
XU Tong-Tong , LIU Kui , XIA Xin
2024, 35(1):136-158. DOI: 10.13328/j.cnki.jos.006828
Abstract:Software vulnerabilities are security defects of computer software systems, and they threaten the integrity, security, and reliability of modern software and application data. Manual vulnerability management is time-consuming and error-prone. Therefore, in order to better deal with the challenges of vulnerability management, researchers have proposed a variety of automated vulnerability management schemes, among which automated vulnerability repair has recently attracted wide attention. Automated vulnerability repair consists of three main functions: vulnerability cause localization, patch generation, and patch validation, and it aims to assist developers in repairing vulnerabilities. Existing work lacks a systematic classification and discussion of vulnerability repair technology. To this end, this study gives a comprehensive insight into the theory, practice, applicable scenarios, advantages, and disadvantages of existing vulnerability repair methods and technologies and presents a research review of automated vulnerability repair technologies, so as to promote the development of vulnerability repair technologies and deepen researchers’ cognition and understanding of vulnerability repair problems. The main contents of the study include: (1) sorting out and summarizing the repair methods for specific and general vulnerabilities according to different vulnerability types; (2) classifying and summarizing different repair methods based on technical principles; (3) summarizing the main challenges of vulnerability repair; (4) looking into future development directions of vulnerability repair.
DOU Hui , ZHANG Ling-Ming , HAN Feng , SHEN Fu-Rao , ZHAO Jian
2024, 35(1):159-184. DOI: 10.13328/j.cnki.jos.006758
Abstract:With the increasingly powerful performance of neural network models, they are widely used to solve various computer-related tasks and show excellent capabilities. However, a clear understanding of the operation mechanism of neural network models is lacking. Therefore, this study reviews and summarizes the current research on the interpretability of neural networks. A detailed discussion is rendered on the definition, necessity, classification, and evaluation of research on model interpretability. With emphasis on the focal points of interpretable algorithms, a new classification method for the interpretable algorithms of neural networks is proposed, which provides a novel perspective for the understanding of neural networks. According to the proposed method, this study sorts out the current interpretable methods for convolutional neural networks and comparatively analyzes the characteristics of interpretable algorithms falling within different categories. Moreover, it introduces the evaluation principles and methods of common interpretable algorithms and expounds on the research directions and applications of interpretable neural networks. Finally, the problems confronted in this regard are discussed, and possible solutions are given.
ZHOU Tao , GAN Ran , XU Dong-Wei , WANG Jing-Yi , XUAN Qi
2024, 35(1):185-219. DOI: 10.13328/j.cnki.jos.006834
Abstract:As an important technology in the field of artificial intelligence (AI), deep neural networks are widely used in various image classification tasks. However, existing studies have shown that deep neural networks have security vulnerabilities and are vulnerable to adversarial examples. At present, a systematic analysis of adversarial example detection for images is still lacking. To improve the security of deep neural networks, this study, based on existing research work, comprehensively introduces adversarial example detection methods in the field of image classification. First, the detection methods are divided into supervised detection and unsupervised detection according to how the detector is constructed, and they are then classified into subclasses according to detection principles. Finally, the study summarizes the problems in adversarial example detection and provides suggestions and an outlook in terms of generalization and lightweight design, aiming to assist in AI security research.
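As one concrete instance of the unsupervised detection category above, a confidence-threshold baseline can be sketched as follows; the logits and threshold are illustrative, and practical detectors in the surveyed literature are considerably more sophisticated.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def confidence_detector(logits, threshold=0.9):
    """Unsupervised baseline: flag inputs whose top-class softmax confidence
    falls below a threshold, since adversarial examples often sit near
    decision boundaries and yield unusually uncertain predictions."""
    return softmax(logits).max(axis=-1) < threshold  # True = suspected adversarial

logits = np.array([[10.0, 0.0, 0.0],   # confidently classified clean input
                   [1.0, 0.9, 0.8]])   # near-boundary, possibly adversarial
print(confidence_detector(logits))
```

Supervised methods, by contrast, train the detector itself on labeled clean and adversarial inputs rather than thresholding a fixed statistic.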
ZOU Yue , LAI Jia-Yang , ZHANG Yong-Gang
2024, 35(1):220-235. DOI: 10.13328/j.cnki.jos.006844
Abstract:The integration of machine learning and automatic reasoning is a new trend in artificial intelligence. Constraint satisfaction is a classic problem in artificial intelligence. A large number of scheduling, planning, and configuration problems in the real world can be modeled as constraint satisfaction problems, and efficient solving algorithms have always been a research hotspot. In recent years, many new methods of applying machine learning to solve constraint satisfaction problems have emerged. These methods based on “learn to reason” open up new directions for solving constraint satisfaction problems and show great development potential, featuring better adaptability, strong scalability, and online optimization. This study divides the current “learn to reason” methods into three categories: message-passing neural network-based, sequence-to-sequence-based, and optimization-based methods. Additionally, the characteristics of the various methods and their solution effects on different problem sets are analyzed in detail, and relevant work within each type of method is comparatively analyzed from multiple perspectives. Finally, the constraint solving methods based on “learn to reason” are summarized, and their prospects are discussed.
FAN Yi-Fan , ZOU Bo-Wei , XU Qing-Ting , LI Zhi-Feng , HONG Yu
2024, 35(1):236-265. DOI: 10.13328/j.cnki.jos.006913
Abstract:Commonsense question answering is an essential natural language understanding task that aims to solve natural language questions automatically by using commonsense knowledge to obtain accurate answers. It has broad application prospects in areas such as virtual assistants and social chatbots and involves crucial scientific issues such as knowledge mining and representation, language understanding and computation, and answer reasoning and generation. Therefore, it has received wide attention from industry and academia. This study first introduces the main datasets in commonsense question answering. Secondly, it summarizes the distinctions between different sources of commonsense knowledge in terms of construction methods, knowledge sources, and presentation forms. Meanwhile, the study focuses on the analysis and comparison of the state-of-the-art commonsense question answering models, as well as the characteristic methods fusing commonsense knowledge. Particularly, based on the commonalities and characteristics of commonsense knowledge in different question answering task scenarios, this study establishes a commonsense knowledge classification system covering attribute, semantic, causal, contextual, abstract, and intentional knowledge. On this basis, it conducts prospective research on the construction of commonsense knowledge datasets, the collaboration mechanism of perceptual knowledge fusion and pre-trained language models, and corresponding commonsense knowledge pre-classification techniques. Furthermore, the study reports specifically on the performance changes of the above models under cross-dataset migration scenarios and their potential contributions to commonsense answer reasoning.
On the whole, this study gives a comprehensive review of existing data and state-of-the-art technologies, as well as preliminary research on cross-dataset knowledge system construction, technology migration, and generalization, so as to provide references for the further development of theories and technologies while reporting on the existing technologies in the field.
JIANG Chang-Lin , LI Qing , WANG Yu , ZHAO Dan , ZHAO Da-Yi , JIANG Yong , XU Ming-Wei
2024, 35(1):266-287. DOI: 10.13328/j.cnki.jos.006753
Abstract:As a complement and extension of the terrestrial network, the satellite network helps accelerate the bridging of the digital divide between regions and can expand the coverage and service range of the terrestrial network. However, the satellite network features highly dynamic topology, long transmission delay, and limited on-board computing and storage capacity. Hence, various technical challenges, including routing scalability and transmission stability, are encountered in the organic integration of the satellite network and the terrestrial network and the construction of a global space-ground integrated network (SGIN). Considering these research challenges, this study describes the international and domestic research progress of SGIN in terms of network architecture, routing, transmission, and multicast-based content delivery, and then discusses the research trends.
FAN Lin-Na , LI Cheng-Long , WU Yi-Chao , DUAN Chen-Xin , WANG Zhi-Liang , LIN Hai , YANG Jia-Hai
2024, 35(1):288-308. DOI: 10.13328/j.cnki.jos.006818
Abstract:With the development of Internet of Things (IoT) technology, IoT devices are widely applied in many areas of production and life. However, IoT devices also bring severe challenges to equipment asset management and security management. Firstly, due to the diversity of IoT device types and access modes, it is often difficult for network administrators to know the types and operating status of the IoT devices in the network. Secondly, IoT devices are becoming the focus of cyber attacks due to their limited computing and storage resources, which makes it difficult to deploy traditional defense measures. Therefore, it is important to identify the IoT devices in the network through device identification and detect anomalies based on the identification results, so as to ensure the normal operation of IoT devices. In recent years, academia has carried out substantial research on the above issues. This study systematically reviews the work related to IoT device identification and anomaly detection. In terms of device identification, existing research can be divided into passive identification methods and active identification methods according to whether data packets are sent to the network. The passive identification methods are further investigated according to the identification method, identification granularity, and application scenarios, and the active identification methods are investigated according to the identification method, identification granularity, and detection granularity. In terms of anomaly detection, existing work can be divided into detection methods based on machine learning algorithms and rule-matching methods based on behavioral norms. On this basis, challenges in IoT device identification and anomaly detection are summarized, and future development directions are proposed.
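A toy version of the rule-based passive identification discussed above matches features observed in captured traffic against a fingerprint table; the feature strings, device names, and scoring rule below are entirely hypothetical, not drawn from a real fingerprint database.

```python
def identify_device(observed, fingerprints):
    """Passive rule-based identification sketch: score each known device
    fingerprint by the fraction of its features seen in the observed traffic
    and return the best match (or 'unknown' if nothing matches)."""
    best_device, best_score = "unknown", 0.0
    for device, fp in fingerprints.items():
        score = len(observed & fp) / len(fp)  # fraction of fingerprint matched
        if score > best_score:
            best_device, best_score = device, score
    return best_device, best_score

fingerprints = {
    "ip_camera":  {"rtsp:554", "http:80", "dhcp_vendor:cam-os"},
    "smart_plug": {"http:80", "mqtt:1883"},
}
observed = {"rtsp:554", "http:80", "dhcp_vendor:cam-os", "dns:camera.example"}
print(identify_device(observed, fingerprints))  # ('ip_camera', 1.0)
```

Machine-learning identification methods in the surveyed work replace this hand-written table with classifiers trained on such traffic features.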
ZHANG Man , YAO Jian-Kang , LI Hong-Tao , DONG Ke-Jun , YAN Zhi-Wei
2024, 35(1):309-332. DOI: 10.13328/j.cnki.jos.006898
Abstract:As critical Internet infrastructure, DNS brings many privacy and security risks due to its plaintext transmission. Many encryption technologies for DNS channel transmission, such as DoH, DoT, and DoQ, are committed to preventing DNS data from being leaked or tampered with and to ensuring the reliability of DNS message sources. Firstly, this study analyzes the privacy and security problems of plaintext DNS from six aspects, including the DNS message format, data storage and management, and system architecture and deployment, and then summarizes the existing related technologies and protocols. Secondly, the implementation principles and application statuses of the encryption protocols for DNS channel transmission are analyzed, and the performance of each encryption protocol under different network conditions is discussed with multi-angle evaluation indicators. Meanwhile, the study discusses the privacy protection effects of the encryption technologies for DNS channel transmission in light of the limitations of the padding mechanism, encrypted traffic identification, and fingerprint-based analysis of encrypted activities. In addition, the problems and challenges faced by encryption technologies for DNS channel transmission are summarized from the aspects of deployment specifications, the abuse of encryption technologies by malicious traffic and attacks against them, the contradiction between privacy and network security management, and other factors affecting privacy and security after encryption, and relevant solutions are presented. Finally, the study summarizes promising directions for future research, such as the discovery of encrypted DNS services, server-side privacy protection, encryption between recursive resolvers and authoritative servers, and DNS over HTTP/3.
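For reference, the payload that DoH encrypts is an ordinary RFC 1035 wire-format message; a minimal builder is sketched below. The POST shown in the comments follows the RFC 8484 pattern but is not executed here, and no real resolver is contacted.

```python
import struct

def build_dns_query(name, qtype=1, qclass=1):
    """Build a minimal DNS query in wire format (RFC 1035) -- the exact
    payload a DoH client POSTs with Content-Type: application/dns-message."""
    header = struct.pack(">HHHHHH",
                         0,       # ID 0, as RFC 8484 recommends for cacheability
                         0x0100,  # flags: standard query, recursion desired
                         1, 0, 0, 0)  # QDCOUNT=1, AN/NS/AR counts = 0
    qname = b"".join(bytes([len(p)]) + p.encode() for p in name.split(".")) + b"\x00"
    return header + qname + struct.pack(">HH", qtype, qclass)

query = build_dns_query("example.com")
# A DoH client would then send (sketch only, not executed):
#   POST https://<resolver>/dns-query
#   Content-Type: application/dns-message
#   body = query
print(len(query))  # -> 29 bytes for example.com
```

DoT and DoQ carry the same wire-format message over TLS and QUIC streams respectively; only the transport channel differs.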
HOU Jian , LU Hui , LIU Fang-Ai , WANG Xing-Wei , TIAN Zhi-Hong
2024, 35(1):333-355. DOI: 10.13328/j.cnki.jos.006891
Abstract:Network traffic encryption not only protects corporate data and user privacy but also brings new challenges to malicious traffic detection. According to how encrypted traffic is processed, encrypted malicious traffic detection technology can be divided into active and passive detection. Active detection technology includes detection after traffic decryption and detection based on searchable encryption technology. Its research focuses on privacy protection and detection efficiency, and this study mainly analyzes the application of trusted execution environments and controllable transmission protocols. Passive detection technology identifies encrypted malicious traffic without user perception and without performing any encryption or decryption operations. Its research focuses on the selection and construction of features. This study analyzes relevant detection methods in terms of three types of features, namely side-channel features, plaintext features, and raw traffic, and then presents the experimental evaluation conclusions of relevant models. Finally, the feasibility of countermeasures against encrypted malicious traffic detection is analyzed from the perspectives of obfuscating traffic characteristics, interfering with learning algorithms, and hiding relevant information.
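The side-channel feature category above relies only on metadata that remains observable after encryption; a minimal flow-level feature extractor might look as follows (the particular feature set is illustrative).

```python
import statistics

def side_channel_features(pkt_sizes, pkt_times):
    """Flow-level side-channel features: packet sizes and timing stay visible
    even for encrypted traffic and can feed downstream ML detectors."""
    iats = [b - a for a, b in zip(pkt_times, pkt_times[1:])]  # inter-arrival times
    return {
        "n_packets": len(pkt_sizes),
        "total_bytes": sum(pkt_sizes),
        "mean_size": statistics.mean(pkt_sizes),
        "std_size": statistics.pstdev(pkt_sizes),
        "mean_iat": statistics.mean(iats) if iats else 0.0,
        "duration": pkt_times[-1] - pkt_times[0],
    }

f = side_channel_features([100, 300, 200], [0.0, 0.1, 0.4])
print(f["n_packets"], f["total_bytes"])  # 3 600
```

Plaintext features (e.g., unencrypted TLS handshake fields) and raw-traffic representations, the other two categories above, require progressively less manual feature engineering.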
LUO Yu-Yu , QIN Xue-Di , XIE Yu-Peng , LI Guo-Liang
2024, 35(1):356-404. DOI: 10.13328/j.cnki.jos.006911
Abstract:How to quickly and effectively mine valuable information from massive data to better guide decision-making is an important goal of big data analysis. Visual analysis is an important big data analysis method: it takes advantage of the characteristics of human visual perception, utilizes visualization charts to present the laws contained in complex data intuitively, and supports human-centered interactive data analysis. However, visual analysis still faces several challenges, such as the high cost of data preparation, high latency of interaction response, high threshold for visual analysis, and low efficiency of interaction modes. To address these challenges, researchers have proposed a series of methods to optimize the human-computer interaction mode of visual analysis systems and improve the intelligence of such systems by leveraging data management and artificial intelligence techniques. This study systematically sorts out, analyzes, and summarizes these methods and puts forward the basic concept and key technical framework of intelligent data visualization analysis. Then, under this framework, the research progress of data preparation for visual analysis, intelligent data visualization, efficient visual analysis, and intelligent visual analysis interfaces both in China and abroad is reviewed and analyzed. Finally, this study looks forward to the future development trend of intelligent data visualization analysis.
WANG Song-Li , JING Yi-Nan , HE Zhen-Ying , ZHANG Kai , WANG Xiao-Yang
2024, 35(1):405-429. DOI: 10.13328/j.cnki.jos.006916
Abstract:Database management systems are divided into transactional (OLTP) systems and analytical (OLAP) systems according to application scenarios. With the growing demand for real-time data analysis and the increasing popularity of mixed OLTP and OLAP tasks, the industry has begun to focus on database management systems that support hybrid transactional/analytical processing (HTAP). An HTAP database system not only needs to meet the requirements of high-performance transaction processing but also supports real-time analysis for data freshness. Therefore, it poses new challenges to the design and implementation of database systems. In recent years, some prototypes and products with diverse architectures and technologies have emerged in industry and academia. This study reviews the background and development status of HTAP databases and classifies current HTAP databases from the perspective of storage and computing. On this basis, this study summarizes the key technologies used in the storage and computing of HTAP systems from bottom to top. Under this framework, the design ideas, advantages and disadvantages, and applicable scenarios of various systems are introduced. In addition, according to the evaluation benchmarks and metrics of HTAP databases, this study also analyzes the relationship between the design of various HTAP databases and their performance as well as data freshness. Finally, this study combines cloud computing, artificial intelligence, and new hardware technologies to provide ideas for future research and development of HTAP databases.
XU Zhi-Zhen , XU Chen , DING Guang-Yao , CHEN Zi-Hao , ZHOU Ao-Ying
2024, 35(1):430-454. DOI: 10.13328/j.cnki.jos.006917
Abstract:In order to perform knowledge mining and management, information systems need to process various forms of data, including stream data. Stream data feature a large data scale, fast generation speed, and strong timeliness of the knowledge contained in them. Therefore, developing stream processing technology that supports real-time stream processing applications is very important for the knowledge management of information systems. Stream processing systems (SPSs) can be traced back to the 1990s and have undergone significant development since then. However, current diverse knowledge management needs and the new generation of hardware architectures have brought new challenges and opportunities for SPSs, and a series of technical studies on stream processing has ensued. This study introduces the basic requirements and development history of SPSs and then analyzes relevant technologies in the SPS field in terms of four aspects: programming interface, execution plan, resource scheduling, and fault tolerance. Finally, this study predicts the research directions and development trends of stream processing technology in the future.
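As a minimal illustration of the kind of operator an SPS execution plan runs, a tumbling-window aggregation over timestamped events can be sketched as follows; the event schema and window size are illustrative.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_size):
    """Count events per key in non-overlapping (tumbling) time windows --
    one of the simplest windowed aggregations stream processing systems offer."""
    counts = defaultdict(int)
    for ts, key in events:
        counts[(ts // window_size, key)] += 1  # window index = ts // window_size
    return dict(counts)

events = [(1, "a"), (2, "a"), (5, "b"), (6, "a")]
print(tumbling_window_counts(events, window_size=5))
# {(0, 'a'): 2, (1, 'b'): 1, (1, 'a'): 1}
```

Production SPSs additionally handle out-of-order arrival, state checkpointing, and parallel execution of such operators, which is where the execution-plan, scheduling, and fault-tolerance research surveyed above comes in.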
HUANG Chun-Yue , PENG Qi , ZHANG Fu-Xiao , WANG Sheng-Yi , LUO Cheng , ZHANG Yan-Feng , YU Ge
2024, 35(1):455-480. DOI: 10.13328/j.cnki.jos.006822
Abstract:Data replication is an important way to improve the availability of distributed databases. By placing multiple database replicas in different regions, the response speed of local read and write operations can be increased. Furthermore, increasing the number of replicas can improve the linear scalability of read throughput. In view of these advantages, a number of multi-replica distributed database systems have emerged in recent years, including mainstream industrial systems such as Google Spanner, CockroachDB, TiDB, and OceanBase, as well as excellent academic systems such as Calvin, Aria, and Berkeley Anna. However, while providing many benefits, these multi-replica databases bring a series of challenges such as consistency maintenance, cross-node transactions, and transaction isolation. This study summarizes existing replication architectures, consistency maintenance strategies, cross-node transaction concurrency control, and other technologies. It also analyzes the differences and similarities between several representative multi-replica database systems in terms of distributed transaction processing. Finally, the study builds a cross-region distributed cluster environment on Alibaba Cloud and conducts multiple experiments to study the distributed transaction processing performance of these representative systems.
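A core idea behind many consistency maintenance strategies compared above is quorum intersection: with N replicas, choosing a write quorum W and read quorum R such that W + R > N guarantees that every read overlaps the latest write. A toy sketch (replica contents are illustrative):

```python
def quorum_read(replica_values, r):
    """Read-quorum sketch: contact r replicas and return the freshest value.
    If the write quorum W and read quorum R satisfy W + R > N, every read
    quorum intersects every write quorum, so at least one contacted replica
    holds the latest committed (version, value) pair."""
    contacted = replica_values[:r]  # any r replicas would do
    return max(contacted)           # pair with the highest version wins

# N = 3 replicas; a write with W = 2 reached the first two (version 2 = "new").
replicas = [(2, "new"), (2, "new"), (1, "old")]
assert 2 + 2 > 3                    # W + R > N holds for W = R = 2
print(quorum_read(replicas, r=2))   # (2, 'new')
```

Consensus-based systems such as those built on Paxos or Raft enforce a related majority-intersection property through leader election and log replication rather than per-read version comparison.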
LIU Shu-Sen , HE Xiao-Wei , WANG Wen-Cheng , WU En-Hua
2024, 35(1):481-512. DOI: 10.13328/j.cnki.jos.006777
Abstract:Smoothed particle hydrodynamics (SPH) is a key technology for fluid simulation. With the growing demand for SPH fluid simulation in production practice, many relevant studies have emerged in recent years, improving the visual realism, efficiency, and stability of simulations of physical properties including fluid incompressibility, viscosity, and surface tension. Additionally, some researchers focus on high-quality simulation in complex scenarios and unified simulation frameworks covering multiple scenarios and materials, thereby enhancing the applicability of SPH fluid simulation technology. This study discusses and summarizes related research on SPH fluid simulation technology from the above aspects and offers an outlook on the technology.
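The particle-based discretization at the heart of SPH estimates any field quantity as a kernel-weighted sum over neighboring particles; the density estimate with the widely used poly6 kernel of Müller et al. can be sketched as follows (the particle data are illustrative).

```python
import math

def poly6(r, h):
    """Poly6 smoothing kernel (Müller et al.), with compact support radius h."""
    if r > h:
        return 0.0
    return 315.0 / (64.0 * math.pi * h**9) * (h * h - r * r) ** 3

def density(positions, masses, h):
    """SPH density estimate: rho_i = sum_j m_j * W(|x_i - x_j|, h)."""
    return [
        sum(m * poly6(math.dist(pi, pj), h) for pj, m in zip(positions, masses))
        for pi in positions
    ]

# Two particles half a smoothing length apart: each feels itself and its neighbor.
rho = density([(0.0, 0.0, 0.0), (0.05, 0.0, 0.0)], [0.001, 0.001], h=0.1)
```

The incompressibility, viscosity, and surface tension solvers surveyed above all build on this same kernel-interpolation step, differing mainly in how pressure forces are computed from the estimated densities.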