Volume 35, Issue 8, 2024 Table of Contents

1 Preface

XIANG Jian-Wen , CHEN Ting , WANG Hao-Yu , LUO Xia-Bu , YANG Min

2024, 35(8):3551-3552. DOI: 10.13328/j.cnki.jos.007125

2 Dual Offline Anonymous E-payment Scheme for Mobile Devices Based on TEE and SE

YANG Bo , FENG Wei , QIN Yu , ZHANG Yan-Chao , TONG Dong

2024, 35(8):3553-3576. DOI: 10.13328/j.cnki.jos.007115

Abstract:
In recent years, many major economies have paid close attention to central bank digital currency (CBDC). As an optional attribute of CBDC, dual offline transaction is considered to have great practical value for payment without a network connection. This study proposes OAPM, a dual offline anonymous e-payment scheme for CBDC based on trusted execution environment (TEE) and secure element (SE), in which a mobile device user can act as either a payer or a payee. OAPM is suitable for mobile devices with limited resources. It allows a payer to safely pay digital currency to a payee without a network connection, without disclosing personal privacy information to payees or commercial banks, and without the payer's payment behaviors being linkable. Meanwhile, it allows payees’ devices to be offline. Regulators, such as central banks, can identify anonymous payers if necessary. The scheme satisfies a number of important attributes of digital currency transactions, including correctness, unlinkability, traceability, non-frame-up, confidentiality, authenticity, anti-double-cross, and controllable anonymity. Finally, a prototype system is implemented and tested under possible parameter settings. Security analysis and experimental results show that the scheme can meet the practical needs of offline CBDC transactions for mobile users in terms of both security and efficiency.

3 UEFI Fuzz Testing Method Based on Heuristic Reverse Analysis

LIN Xin-Kang , GU Kuang-Yu , ZHAO Lei

2024, 35(8):3577-3590. DOI: 10.13328/j.cnki.jos.007116

Abstract:
As a next-generation firmware interface standard, the unified extensible firmware interface (UEFI) has been widely used in modern computer systems. However, UEFI vulnerabilities have also brought serious security threats. To avoid security problems caused by UEFI vulnerabilities as much as possible, vulnerability detection is needed, for which fuzzing in third-party security testing scenarios is the main approach. Nevertheless, the absence of symbolic information reduces testing efficiency. This study proposes a heuristic UEFI reverse analysis method that recovers the symbolic information within the firmware to improve fuzz testing, and implements a prototype system, ReUEFuzzer. Tests on 525 EFI files from four manufacturers demonstrate the effectiveness of the reverse analysis method. ReUEFuzzer can enhance the function test coverage and has identified an unknown vulnerability during the testing process, which has been reported to the China National Vulnerability Database and the Common Vulnerabilities and Exposures (CVE) system. Empirical evidence shows that the proposed method is effective for UEFI vulnerability detection and can provide a certain degree of security guarantee for UEFI.

4 FirmDep: Embedded Application Rehosting Assisted with Dynamic Analysis

WU Hua-Mao , JIANG Mu-Hui , ZHOU Ya-Jin , LI Jin-Ku

2024, 35(8):3591-3609. DOI: 10.13328/j.cnki.jos.007117

Abstract:
By providing a virtual environment modeled on embedded devices, firmware rehosting enables dynamic analysis of embedded device firmware. Existing full-emulation firmware rehosting solutions can only preemptively fix known hardware and software dependencies and cannot address dependencies that remain undetected during the rehosting process. This study proposes FirmDep, an embedded application rehosting solution assisted with dynamic analysis. During the rehosting process, FirmDep records the execution trace and system state of the embedded application to be analyzed. If FirmDep fails to rehost the application, it extracts information and recovers system states from the execution trace, then uses several algorithms to identify and arbitrate the unresolved dependency problems. A prototype system of FirmDep is implemented based on PANDA and angr and is tested with embedded Web applications from 217 real-world firmware images. The results show that FirmDep can effectively identify unresolved dependencies of embedded applications and improve the success rate of rehosting.

5 Network-side Alert Prioritization Method Based on Multivariate Data Fusion

WANG Wei-Jing , CHEN Jun-Jie , YANG Lin , HOU De-Jun , WANG Xing-Kai , WU Fu-Di , ZHANG Run-Zi , WANG Zan

2024, 35(8):3610-3625. DOI: 10.13328/j.cnki.jos.007118

Abstract:
The network security monitoring systems deployed on network nodes generate a large number of network-side alerts every day. Under this pressure, security engineers lose sensitivity to high-risk alerts and fail to detect network attacks in time. Due to the complexity and variability of cyber attacks and the limited information in network-side alerts, existing alert prioritization/classification methods for IT operations are unsuitable for network-side alerts. Thus, network-side alert prioritization (NAP), the first network-side alert prioritization method, is proposed based on multivariate data fusion. NAP first designs a multi-strategy context encoder based on source IP address and destination IP address to capture the context information of network-side alerts. Then, NAP designs a text encoder based on the attention-based bidirectional GRU model and the ChineseBERT model to learn the semantic information of network-side alerts from text data such as alert messages. Finally, NAP builds a ranking model to obtain alert ranking values and sorts the alerts in descending order of these values, so that high-risk alerts with cyber attack intention are ranked at the front and the network-side alert management process is optimized. The experiments on three groups of network attack and defense data from NSFOCUS show that NAP achieves effective and stable prioritization results and significantly outperforms the compared methods. For example, the average NDCG@k (k∈[1,10]) (i.e., normalized discounted cumulative gain of the first 1 to 10 ranking results) ranges from 0.893 1 to 0.958 3, outperforming the state-of-the-art method by more than 64.73%. Besides, NAP has been applied to a real-world network-side alert dataset from Tianjin University, further confirming its practicability.
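
The NDCG@k metric quoted in this abstract can be illustrated with a minimal sketch. It assumes the common linear-gain form of DCG with a log2 position discount; the abstract does not say which variant the paper uses, and the function names `dcg_at_k`, `ndcg_at_k`, and `mean_ndcg` are hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items of a ranked list.

    Assumption: linear gain (the relevance itself) and a log2 discount.
    """
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mean_ndcg(relevances, ks=range(1, 11)):
    """Average NDCG@k over k = 1..10, mirroring the k range in the abstract."""
    return sum(ndcg_at_k(relevances, k) for k in ks) / len(ks)
```

Here `relevances` holds the ground-truth risk level of each alert in the order the ranker produced; a value of 1.0 means high-risk alerts are already sorted to the front.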

6 DGA Domain Name Detection Method Based on Double Branch Feature Extraction and Adaptive Capsule Network

YANG Hong-Yu , ZHANG Tao , ZHANG Liang , CHENG Xiang , HU Ze

2024, 35(8):3626-3646. DOI: 10.13328/j.cnki.jos.007119

Abstract:
The existing domain name detection methods for domain generation algorithm (DGA) generally have the characteristics of weak feature extraction ability and high feature information compression ratio, which lead to feature information loss, feature structure destruction, and poor domain name detection performance. Aiming at the above problems, a DGA domain name detection method based on double branch feature extraction and adaptive capsule network is proposed. Firstly, the original samples are reconstructed through sample cleaning and dictionary construction, and the reconstructed sample set is generated. Secondly, the reconstructed samples are processed by a double branch feature extraction network, in which the local features of domain name are extracted by using a sliced pyramid network, the global features of domain name are extracted by using a transformer, and the features at different levels are fused by using lightweight attention. Then, an adaptive capsule network is used to calculate the importance coefficient of the domain name feature map, convert domain name text features into vector domain name features, and calculate the domain name classification probability based on text features by feature transfer. Meanwhile, multilayer perceptron is used to process domain name statistical features to calculate the domain name classification probability based on statistical features. Finally, domain name detection is performed by combining the domain name classification probabilities from two different perspectives. A large number of experiments show that the method proposed in this study achieves leading detection results in DGA domain name detection and DGA domain name family detection and classification, where the F1-score in DGA domain name detection increased by 0.76% to 5.57%, and the F1-score (macro average) in DGA domain name family detection classification increased by 1.79% to 3.68%.
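
As a small illustration of the macro-averaged F1-score reported for DGA family classification, the sketch below computes per-class F1 and averages the classes with equal weight. This is the standard definition; the function name `macro_f1` and the example labels are hypothetical and are not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because every family counts equally, a rare DGA family that is misclassified lowers the macro average as much as a common one would.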

7 Reinforcement-learning-based Adversarial Attacks Against Vulnerability Detection Models

CHEN Si-Ran , WU Jing-Zheng , LING Xiang , LUO Tian-Yue , LIU Jia-Yu , WU Yan-Jun

2024, 35(8):3647-3667. DOI: 10.13328/j.cnki.jos.007120

Abstract:
Deep learning-based code vulnerability detection models have gradually become an important method for detecting software vulnerabilities due to their advantages of high detection efficiency and accuracy, and play an important role in the code auditing service of the code hosting platform GitHub. However, deep neural networks have been proved to be susceptible to adversarial attacks, which exposes deep learning-based vulnerability detection models to the risk of being attacked and having their detection accuracy reduced. Therefore, building adversarial attacks against vulnerability detection models can not only uncover the security flaws of such models, but also help to evaluate the robustness of the models, and then improve the performance of the models through corresponding methods. However, the existing adversarial attack methods for vulnerability detection models rely on generalized code transformation tools and do not propose targeted code perturbation operations and decision algorithms, so it is difficult to generate effective adversarial samples, and the legitimacy of the adversarial samples relies on manual checking. To address the above problems, a reinforcement-learning-based adversarial attack method for vulnerability detection models is proposed. The method first designs a series of semantically constrained and vulnerability-preserving code perturbation operations as the perturbation set; second, code samples with vulnerabilities are used as inputs, and the reinforcement learning model selects specific sequences of perturbation operations; finally, potential perturbation locations are searched for in each code sample according to the types of nodes in the syntax tree, and code transformations are carried out there, thus generating the adversarial samples.
Based on SARD and NVD, two experimental datasets with a total of 14 278 code samples are constructed, and four vulnerability detection models with different characteristics are trained as attack targets. For each target model, a reinforcement learning network is trained to carry out the adversarial attack. The results show that the attack method leads to a 74.34% decrease in the recall of the models and achieves a 96.71% attack success rate, which is an average increase of 68.76% compared to the baseline method. The experiments show that current vulnerability detection models are at risk of being attacked, and further research is needed to improve the robustness of the models.

8 Compliance Detection Method for Mobile Application Privacy Policy Statement

WANG Yin , FAN Ming , TAO Jun-Jie , LEI Jing-Yi , JIN Wu-Xia , HAN De-Qiang , LIU Ting

2024, 35(8):3668-3683. DOI: 10.13328/j.cnki.jos.007121

Abstract:
The privacy policy statement of a mobile application serves as a crucial document that must be disclosed to users before collecting their information. However, current privacy policy statements face various issues, such as missing key disclosure items, omitting information collection purposes, and using vague descriptions. With an increasing number of legal provisions, the requirements for privacy policy statements vary, making compliance verification more burdensome. This study proposes a multi-label classification method for mobile application privacy policy statements. This method compares the requirements of four core laws and regulations regarding privacy policy statements, summarizes and organizes 31 categories of core item labels and features. Under this label system, the study designs and implements a classification model for privacy policy statement sentences, which achieves a 94% accuracy rate in item classification. Using this model, compliance verification was conducted in Android applications and mini-program scenarios, revealing issues such as missing items (79%), omitted purposes (63%), and vague descriptions (94%) in privacy policy statements.

9 Underground Application Collection Method Based on Spiking Traffic Analysis

CHEN Pei , HONG Geng , WU Meng-Ying , CHEN Jin-Song , DUAN Hai-Xin , YANG Min

2024, 35(8):3684-3697. DOI: 10.13328/j.cnki.jos.007122

Abstract:
In recent years, with the rise of the mobile Internet, underground mobile applications primarily involved in scams, gambling, and pornography have become more rampant, requiring effective control measures. Currently, research on underground applications is lacking. Due to the continuous crackdown by law enforcement agencies on traditional distribution channels for these applications, the existing collection methods based on search engines and app stores have proven to be ineffective. The lack of large-scale and representative datasets of real-world underground applications has become a major constraint for in-depth research. Therefore, this study aims to address the challenge of collecting large-scale real-world underground applications, providing data support for a comprehensive in-depth analysis of these applications and their ecosystem. A method is proposed to capture underground applications based on traffic analysis. By focusing on the key distribution channels of underground applications and leveraging their characteristics of mutation and accompanying traffic, underground applications can be discovered in the propagation stage. In the test, the proposed method successfully obtained 3 439 application download links and 3 303 distinct applications. Among these applications, 91.61% of the samples were labeled as malware by antivirus engines, while 98.14% of the samples were zero-days. The results demonstrate the effectiveness of the proposed method in the collection of underground applications.

10 Privacy-preserving Graph Neural Network Recommendation System Based on Negative Database

ZHAO Dong-Dong , XU Hu , PENG Si-Yun , ZHOU Jun-Wei

2024, 35(8):3698-3720. DOI: 10.13328/j.cnki.jos.007124

Abstract:
Graph data consists of nodes and edges: entities are modeled as nodes, and an edge between two nodes indicates a relationship between the corresponding entities. By analyzing and mining such data, a lot of valuable information can be revealed. Meanwhile, this also brings the risk of private information disclosure for every entity in the graph. To address this issue, a graph data publishing method is proposed based on the negative database (NDB). This method transforms the structural characteristics of the graph data into the encoding format of a negative database. Based on this, a generation method for perturbed graphs (NDB-Graph) is designed. Since NDB is a privacy-preserving technique that does not explicitly store the original data and is difficult to reverse, the published graph data ensures the security of the original graph data. Besides, because graph neural networks process relational features in graph data efficiently, they are widely used for modeling various tasks on graph data, such as recommendation systems. A graph neural network recommendation system is therefore also proposed based on NDB technology to protect the privacy of each user's graph data. Compared with the publishing method PBCN, the proposed method performs better in most cases in experiments on the Karate and Facebook datasets. For example, on the Facebook dataset, the smallest L1-error of degree distribution is only 6, which is about 2.6% lower than that of the PBCN method under the same privacy level; in the worst case, it is about 1 400, which is about 46.5% lower than that of the PBCN method under the same privacy level. The experiment of collaborative filtering based on LightGCN also demonstrates that the proposed privacy protection method has high precision.
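
To make the "L1-error of degree distribution" concrete, here is a minimal sketch under one plausible reading: the sum, over all degrees, of the absolute difference between the number of nodes with that degree in the original graph and in the published graph. The abstract does not give the exact definition (it may, for instance, be normalized), so this formula and the names `degree_histogram` and `l1_degree_error` are assumptions.

```python
from collections import Counter


def degree_histogram(edges):
    """Map degree -> number of nodes with that degree (undirected edge list).

    Nodes that appear in no edge are not counted in this simplified sketch.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def l1_degree_error(original_edges, perturbed_edges):
    """Assumed definition: L1 distance between the two degree histograms."""
    h0 = degree_histogram(original_edges)
    h1 = degree_histogram(perturbed_edges)
    return sum(abs(h0[d] - h1[d]) for d in set(h0) | set(h1))
```

A smaller value means the published graph preserves the degree structure of the original more closely, which is the utility side of the privacy trade-off.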

11 Hexagonal Loop Tiling for Jacobi Computation Optimization Method

QU Bin , LIU Song , ZHANG Zeng-Yuan , MA Jie , WU Wei-Guo

2024, 35(8):3721-3738. DOI: 10.13328/j.cnki.jos.006945

Abstract:
Jacobi computation is a kind of stencil computation, which has been widely applied in the field of scientific computing. The performance optimization of Jacobi computation is a classic topic, where loop tiling is an effective optimization method. The existing loop tiling methods mainly focus on the impact of tiling on parallel communication and program locality and fail to consider other factors such as load balancing and vectorization. This study analyzes and compares several tiling methods based on multi-core computing architecture and chooses an advanced hexagonal tiling as the main method to accelerate Jacobi computation. For tile size selection, this study proposes a hexagonal tile size selection algorithm called Hexagon_TSS by comprehensively considering the impact of tiling on load balancing, vectorization efficiency, and locality. The experimental results show that the L1 data cache miss rate can be reduced to 5.46% of original serial program computation in the best case by Hexagon_TSS, and the maximum speedup reaches 24.48. The proposed method also has excellent scalability.
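
For readers unfamiliar with the computation being tiled, the sketch below shows an untiled 1D three-point Jacobi stencil with double buffering. It is only the baseline loop nest; hexagonal tiling and the Hexagon_TSS algorithm are not reproduced here, and the function name `jacobi_1d` and the averaging weights are illustrative assumptions.

```python
def jacobi_1d(values, steps):
    """Run `steps` Jacobi sweeps; each interior point becomes the mean of
    itself and its two neighbours from the previous time step."""
    cur = list(values)
    nxt = list(values)  # boundary cells stay fixed in both buffers
    for _ in range(steps):
        for i in range(1, len(cur) - 1):
            # Reads only the old buffer, so iterations of i are independent.
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur, nxt = nxt, cur
    return cur
```

Each time step depends only on the previous one, so the (time, space) iteration space can be partitioned into tiles; the choice of tile shape and size is what affects locality, load balancing, and vectorization.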

12 Anchor-based Unsupervised Cross-modal Hashing

HU Peng , PENG Xi , PENG De-Zhong

2024, 35(8):3739-3751. DOI: 10.13328/j.cnki.jos.006960

Abstract:
Thanks to its low storage cost and high retrieval speed, graph-based unsupervised cross-modal hash learning has attracted much attention from academic and industrial researchers and has become an indispensable tool for cross-modal retrieval. However, the high computational complexity of graph structures prevents its use in large-scale multi-modal applications. This study attempts to solve two important challenges facing graph-based unsupervised cross-modal hash learning: 1) How to efficiently construct graphs in unsupervised cross-modal hash learning? 2) How to handle the discrete optimization in cross-modal hash learning? To address these two problems, this study presents anchor-based cross-modal learning and a differentiable hash layer. To be specific, the study first randomly samples some image-text pairs from the training set as anchor sets and uses the anchor sets as the agent to compute the graph matrix of each batch of data. The graph matrix is used to guide cross-modal hash learning, thus remarkably reducing the space and time cost. Second, the proposed differentiable hash layer directly adopts binary coding for computation during network forward propagation and produces gradients to update the network without continuous-value relaxation during backpropagation, thus achieving better hash encoding performance. Finally, the study introduces a cross-modal ranking loss to consider the ranking results in the training process and improve the cross-modal retrieval accuracy. To verify the effectiveness of the proposed algorithm, the study compares the algorithm with 10 cross-modal hash algorithms on three general data sets.

13 Survey on Testing of Deep Learning Frameworks

MA Xiang-Yue , DU Xiao-Ting , CAI Qing , ZHENG Yang , HU Zheng , ZHENG Zheng

2024, 35(8):3752-3784. DOI: 10.13328/j.cnki.jos.007059

Abstract:
As big data and computing power rapidly develop, deep learning has made significant breakthroughs and rapidly become a field with numerous practical application scenarios and active research topics. In response to the growing demand for the development of deep learning tasks, deep learning frameworks have arisen. Acting as an intermediate component between application scenarios and hardware platforms, deep learning frameworks facilitate the development of deep learning applications, enable users to efficiently construct diverse deep neural network (DNN) models, and adapt deeply to various computing hardware, meeting the computational needs across different computing architectures and environments. Any issues that arise within deep learning frameworks, which serve as the fundamental software in the realm of artificial intelligence, can have severe consequences. Even a single bug in the code can trigger widespread failures within models built upon the framework, thereby posing a serious threat to the safety of deep learning systems. As a review focusing exclusively on the testing of deep learning frameworks, this study first introduces the developmental history and basic architectures of deep learning frameworks. Subsequently, by examining 55 academic papers directly related to the testing of deep learning frameworks, the study systematically analyzes and summarizes bug characteristics, key technologies for testing, and methods based on various input forms for testing. The study explores how to combine key technologies to address research problems. Lastly, it summarizes the unresolved difficulties in the testing of deep learning frameworks and provides insights into promising research directions for the future. This study can offer valuable references and guidance to individuals involved in the research field of deep learning framework testing, ultimately promoting the sustained development and maturity of deep learning frameworks.

14 Survey on Few-shot for Malware Detection

LIU Hao , TIAN Zhi-Hong , QIU Jing , LIU Yuan , FANG Bin-Xing

2024, 35(8):3785-3808. DOI: 10.13328/j.cnki.jos.007080

Abstract:
Malware detection, such as Windows malware detection and Android malware detection, is a hotspot of cyberspace security research. With the development of machine learning and deep learning, some outstanding algorithms in the fields of image recognition and natural language processing have been applied to malware detection. These algorithms have shown excellent learning performance with a large amount of data. However, there are some challenging problems in malware detection that have not been solved effectively. For instance, conventional learning methods cannot achieve effective detection when only a few samples of novel malware are available. Therefore, few-shot learning (FSL) is adopted to solve the few-shot for malware detection (FSMD) problems. This study derives the problem definition and the general process of FSMD from the related research. According to the principle of the method, FSMD methods are divided into methods based on data augmentation, methods based on meta-learning, and hybrid methods combining multiple technologies. Then, the study discusses the characteristics of each FSMD method. Finally, the background, technology, and application prospects of FSMD are discussed.

15 Code-search-oriented Function Multigraph Embedding

XU Yang , CHEN Xiao-Jie , TANG De-You , HUANG Han

2024, 35(8):3809-3823. DOI: 10.13328/j.cnki.jos.006940

Abstract:
How to improve the accuracy of matching between natural language query input and highly structured programming language source code is a fundamental concern in code search. Accurate extraction of code features is one of the key challenges to improving matching accuracy. The semantics expressed by statements in codes is not only relevant to themselves but also to their contexts. The structural model of the code provides rich contextual information for understanding code functions. This study proposes a code search method based on function multigraph embedding. By using an early fusion strategy, the study fuses the data dependencies of code statements into a control flow graph and constructs a function multigraph to represent the code. The multigraph explicitly expresses the dependency relationships of indirect predecessor and successor nodes that are lacking in the control flow graph through data dependencies and enhances the contextual information of statement nodes. At the same time, in view of the edge heterogeneity of the multigraph, this study uses the relational graph convolutional network to extract the features of the code from the function multigraph. Experiments on a public dataset show that the proposed method can improve the MRR by more than 5% compared with the existing methods based on code text and structural models. The ablation experiments also show that the control flow graph contributes more to the search accuracy than the data dependence graph.
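
The MRR (mean reciprocal rank) figure in this abstract can be illustrated with a minimal sketch. It assumes one correct code snippet per query and a score of 0 when that snippet is absent from the returned list, which is a common convention; the function name `mean_reciprocal_rank` and the sample data are hypothetical.

```python
def mean_reciprocal_rank(ranked_results, relevant):
    """Average of 1/rank of the correct answer over all queries.

    ranked_results[i] is the ordered result list for query i and
    relevant[i] is the single correct item for that query.
    """
    total = 0.0
    for results, target in zip(ranked_results, relevant):
        if target in results:
            total += 1.0 / (results.index(target) + 1)
    return total / len(ranked_results)
```

For example, three queries whose correct snippet is ranked first, second, and not returned at all give (1 + 1/2 + 0) / 3 = 0.5.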

16 Software Change Prediction Based on Hybrid Graph Representation

YANG Xin-Yue , LIU An , ZHAO Lei , CHEN Lin , ZHANG Xiao-Fang

2024, 35(8):3824-3842. DOI: 10.13328/j.cnki.jos.006947

Abstract:
Software change prediction, aimed at identifying change-prone modules, can help software managers and developers allocate resources efficiently and reduce maintenance overhead. Extracting effective features from the code plays a vital role in the construction of accurate prediction models. In recent years, researchers have shifted from traditional hand-crafted features to semantic features with powerful representation capabilities for prediction. They extracted semantic features from abstract syntax tree (AST) node sequences to build models. However, existing studies have ignored the structural information in the AST and the rich semantic information in the code. How to extract the semantic features of the code is still a challenging problem. For this reason, the study proposes a change prediction method based on hybrid graph representation. To start with, the model combines AST, control flow graph (CFG), data flow graph (DFG), and other structural information to construct the program graph representation of the code. Then, it uses a graph neural network to learn the semantic features of the program graph and uses the learned features to predict change-proneness. The model can integrate various semantic information to represent the code better. The effectiveness of the proposed method is verified by comparing it with the latest change prediction methods on various change datasets.

17 Survey on Quantum Machine Learning

WANG Jian , ZHANG Rui , JIANG Nan

2024, 35(8):3843-3877. DOI: 10.13328/j.cnki.jos.007042

Abstract:
In recent years, machine learning has been a research hotspot and has played an important role in various fields. However, as the amount of data continues to increase, the training time of machine learning algorithms is getting longer. Meanwhile, quantum computers demonstrate a powerful computing ability. Therefore, researchers try to use quantum computing to solve the problem of long machine learning training time, which has led to the emergence of quantum machine learning. Quantum machine learning algorithms have been proposed, including quantum principal component analysis, quantum support vector machine, and quantum deep learning. Additionally, experiments have shown that quantum machine learning algorithms have a significant acceleration effect, leading to a gradual upward trend in research on quantum machine learning. This study reviews research on quantum machine learning algorithms. First, the fundamental concepts of quantum computing are introduced. Then, five categories of quantum machine learning algorithms are presented: quantum supervised learning, quantum unsupervised learning, quantum semi-supervised learning, quantum reinforcement learning, and quantum deep learning. Next, related applications of quantum machine learning are demonstrated with the algorithm experiments provided. Finally, a summary and the prospects for future study are discussed.

18 Over-parameterized Graph Neural Network Towards Robust Graph Structure Defending

CHU Xu , MA Xin-Yu , LIN Yang , WANG Xin , WANG Ya-Sha , ZHU Wen-Wu , MEI Hong

2024, 35(8):3878-3896. DOI: 10.13328/j.cnki.jos.007065

Abstract:
Graph data is ubiquitous in real-world applications, and graph neural networks (GNNs) have been widely used in graph data analysis. However, the performance of GNNs can be severely impacted by adversarial attacks on graph structures. Existing defense methods against adversarial attacks generally rely on low-rank graph structure reconstruction based on graph community preservation priors. However, existing graph structure adversarial defense methods cannot adaptively seek the true low-rank value for graph structure reconstruction, and low-rank graph structures are semantically mismatched with downstream tasks. To address these problems, this study proposes the over-parameterized graph neural network (OPGNN) method based on the implicit regularization effect of over-parameterization. In addition, it formally proves that this method can adaptively solve the low-rank graph structure problem and also proves that over-parameterized residual links on node deep representations can effectively address semantic mismatch. Experimental results on real datasets demonstrate that the OPGNN method is more robust than existing baseline methods, and the OPGNN framework is notably effective on different graph neural network backbones such as GCN, APPNP, and GPRGNN.

19 Review on Temporal Graph Neural Networks for Financial Risk Prediction

SONG Ling-Yun , MA Zhuo-Yuan , LI Zhan-Huai , SHANG Xue-Qun

2024, 35(8):3897-3922. DOI: 10.13328/j.cnki.jos.007087

Abstract:
Financial risk prediction plays an important role in financial market regulation and financial investment, and has become a research hotspot in artificial intelligence and financial technology in recent years. Because financial entities are linked by complex relationships such as investment and supply, existing research on financial risk prediction often employs various static and dynamic graph structures to model the relationships among financial entities. Meanwhile, convolutional graph neural networks and other methods are adopted to embed relevant graph structure information into the feature representation of financial entities, which enables the representation of both semantic and structural information related to financial risks. However, previous reviews of financial risk prediction focus only on studies based on static graph structures and ignore the fact that the relationships among entities in financial events change dynamically over time, which reduces the accuracy of risk prediction results. With the development of temporal graph neural networks, a growing number of studies have begun to pay attention to financial risk prediction based on dynamic graph structures, and a systematic and comprehensive review of these studies will help learners form a complete understanding of financial risk prediction research. This study first reviews three types of neural network models for temporal graphs, grouped by the method they use to extract temporal information from dynamic graphs. Then, based on different graph learning tasks, it introduces the research on financial risk prediction in four areas: stock price trend risk prediction, loan default risk prediction, fraud transaction risk prediction, and money laundering and tax evasion risk prediction. Finally, the difficulties and challenges facing existing temporal graph neural network models in financial risk prediction are summarized, and potential directions for future research are outlined.

20 Research Progress and Trend of Temporal Knowledge Graph Representation and Reasoning

WANG Yu-Han , CHEN Zi-Yang , ZHAO Xiang , TAN Zhen , XIAO Wei-Dong , CHENG Xue-Qi

2024, 35(8):3923-3951. DOI: 10.13328/j.cnki.jos.007093

[Abstract](1198) [HTML](1033) [PDF 9.25 M](4358)

Abstract:
As a research hotspot in artificial intelligence in recent years, knowledge graphs have been applied in many real-world fields. However, as the application scenarios of knowledge graphs become increasingly diverse, it has become clear that static knowledge graphs, which do not change over time, cannot fully adapt to scenarios with high-frequency knowledge updates. To this end, researchers have proposed the concept of temporal knowledge graphs, which contain temporal information. This study organizes all existing temporal knowledge graph representation and reasoning models and constructs a theoretical framework that summarizes them. On this basis, it briefly introduces and analyzes the current research progress of temporal knowledge graph representation and reasoning, and predicts future trends to help researchers develop and design better models.

21 Survey on Network Congestion Control Algorithms

JIANG Wan-Chun , LI Hao-Yang , CHEN Han-Yu , WANG Jie , WANG Jian-Xin , RUAN Chang

2024, 35(8):3952-3979. DOI: 10.13328/j.cnki.jos.007045

[Abstract](815) [HTML](645) [PDF 3.64 M](4741)

Abstract:
Network congestion control algorithms are the key factor in determining network transport performance. In recent years, expanding networks, growing network bandwidth, and increasing user requirements for network performance have brought challenges to the design of congestion control algorithms. To adapt to different network environments, many novel design ideas for congestion control algorithms have been proposed recently, which have greatly improved network performance and user experience. This study reviews innovative congestion control algorithm design ideas and classifies them into four major categories: reservation scheduling, direct measurement, machine learning-based learning, and iterative detection. It introduces the corresponding representative congestion control algorithms, and further compares and analyzes the advantages and disadvantages of the various congestion control ideas and methods. Finally, the study outlines future development directions for congestion control to inspire research in this field.

22 Survey on Key Techniques of Encrypted Computing in Fully Encrypted Databases

BI Shu-Ren , NIU Ze-Ping , LI Guo-Liang , LI Qi

2024, 35(8):3980-4010. DOI: 10.13328/j.cnki.jos.007095

[Abstract](718) [HTML](1170) [PDF 7.69 M](2727)

Abstract:
In recent years, with the popularity of cloud services, a growing number of enterprises and individuals have stored their data in cloud databases. However, the convenience of cloud services also brings data security issues. One of the crucial problems is data confidentiality protection, which is to safeguard the sensitive data of users from being spied on or leaked. Fully encrypted databases have emerged to address this challenge. Compared with traditional databases, fully encrypted databases can keep data encrypted throughout its entire lifecycle of transmission, storage, and computation, thereby ensuring data confidentiality. Currently, there are still many challenges in encrypting data while supporting all SQL functionalities and maintaining high performance. This study comprehensively investigates the key techniques of encrypted computing in fully encrypted databases, groups the techniques by type, and compares and summarizes them in terms of functionality, security, and performance. Firstly, it introduces the architecture of fully encrypted databases, including crypto-based architecture, trusted execution environment (TEE)-based architecture, and hybrid architecture. Then, the key techniques of each architecture are summarized. Finally, the challenges and opportunities of current research are discussed, with some open problems provided for future research.
