CHEN Jin-Bao, ZHANG Yu, LI Qing-Wei, DING Bo-Yao
2024, 35(6):2585-2607. DOI: 10.13328/j.cnki.jos.007096
Abstract:The Go programming language, also known as Golang, has become popular with developers in recent years due to its simple syntax, native support for concurrency, and automatic memory management. The language is designed so that developers need not know whether a variable or object is allocated on the stack or in the heap: the escape analysis of the Go compiler determines the allocation location, and the garbage collector then automatically reclaims unreachable heap objects. Escape analysis must determine allocation locations correctly to keep the memory state sound. However, escape-analysis-related problems currently occur frequently in the Go community and can cause fatal failures such as program crashes, yet research on this aspect is still lacking. To effectively detect whether compiler-generated code contains illegal memory references that may cause runtime crashes, and to fill this research gap, this study builds an abstract model of Go programs and proposes two rules for verifying the validity of store instructions. Based on these two rules, it overcomes the challenges that Go binaries lack high-level semantics and that runtime information is inconvenient to access, and designs a lightweight analysis tool, DBI-Go. DBI-Go combines static analysis with dynamic binary instrumentation, is implemented on top of Pin, a dynamic binary analysis framework, and identifies illegal store instructions in Go binaries. Evaluation results show that DBI-Go detects all known escape-related issues in the Go community and also discovered an issue previously unknown to the community, which has since been confirmed. Applications to real projects show that DBI-Go can help developers find bugs in escape analysis algorithms. The evaluation also shows that the measures adopted by DBI-Go reduce the false positive rate, and that the extra runtime overhead introduced by DBI-Go is less than 2x the original runtime in 93.3% of the cases. In addition, DBI-Go can be adapted to different Go versions without modifying Go's compiler or runtime, giving it wide applicability.
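For intuition only, the sketch below illustrates the general kind of check a dynamic-instrumentation pass might apply to store instructions in this setting: flagging a store that writes a stack address into a heap location, since such a reference becomes dangling once the stack frame is popped. These are not DBI-Go's actual rules, and all address ranges and helper names are hypothetical.

```python
# Hypothetical illustration of a store-validity check in the spirit of
# escape-analysis bug detection: a heap object should not end up holding
# a pointer into a (shorter-lived) goroutine stack frame.

def is_suspicious_store(dest_addr: int, value: int,
                        heap_range: range, stack_range: range) -> bool:
    """Flag a store whose destination lies in the heap but whose stored
    value looks like a pointer into the goroutine stack."""
    return dest_addr in heap_range and value in stack_range

# Toy memory layout (addresses are made up).
HEAP = range(0x00C000000000, 0x00C000100000)
STACK = range(0x00C000400000, 0x00C000402000)

# A store instruction writes a stack address into a heap slot:
print(is_suspicious_store(dest_addr=0x00C000000040,
                          value=0x00C000401F80,
                          heap_range=HEAP, stack_range=STACK))  # True
```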
SHEN Tian-Qi, WANG Xi-Zao, BIN Xiang-Rong, BU Lei
2024, 35(6):2608-2630. DOI: 10.13328/j.cnki.jos.007100
Abstract:Pointer analysis is a core and fundamental technique for compiler optimization and bug detection. Classic pointer analysis frameworks such as Doop transform the programs to be analyzed and the analysis algorithms into Datalog evaluation problems and solve them. When the program size is large, the time overhead of a single analysis can be high, and the cost can hardly be afforded, especially when programs are changed and released frequently. In recent years, incremental analysis, which effectively reuses existing analysis results and improves analysis efficiency under frequent code changes, has attracted increasing attention. However, since current incremental pointer analysis techniques are usually designed for specific algorithms, the pointer analysis options they support are limited, and their usability is significantly restricted. To this end, this study designs and implements DDoop (Differential Doop), an incremental pointer analysis framework based on differential Datalog evaluation. DDoop implements incremental input-fact generation and automatic rewriting of analysis rules for incrementalization, expressing the incremental analysis of multi-version programs as differential Datalog evaluation problems, so that a mature differential Datalog engine such as DDlog can be fully utilized to achieve end-to-end incremental pointer analysis. This maximizes compatibility with and reuse of the existing pointer analysis implementations in Doop and provides transparent support for incrementalization. DDoop is evaluated experimentally on widely adopted real-world programs, and the results show that, compared with the non-incremental Doop framework, DDoop has a significant performance advantage while remaining highly compatible with the diverse pointer analysis rules available in Doop.
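As a rough illustration of why incremental evaluation pays off here, the toy sketch below maintains an Andersen-style points-to relation as new facts arrive, propagating only the delta instead of recomputing from scratch. It handles insertions only (deletions, which differential Datalog also supports, require more machinery), and the relation names are simplified stand-ins rather than Doop's actual rules.

```python
# Toy incremental evaluation of two Andersen-style rules:
#   VarPointsTo(v, o) :- Alloc(v, o).
#   VarPointsTo(to, o) :- Assign(to, from), VarPointsTo(from, o).
# Insert-only delta propagation; relation names are illustrative.
from collections import defaultdict

class IncrementalPointsTo:
    def __init__(self):
        self.assign = defaultdict(set)       # from-var -> {to-vars}
        self.points_to = defaultdict(set)    # var -> {objects}

    def add_alloc(self, var, obj):
        self._propagate([(var, obj)])

    def add_assign(self, to_var, from_var):
        self.assign[from_var].add(to_var)
        # New edge: everything from_var already points to flows to to_var.
        self._propagate([(to_var, o) for o in self.points_to[from_var]])

    def _propagate(self, delta):
        while delta:
            var, obj = delta.pop()
            if obj in self.points_to[var]:
                continue                     # already known, no new work
            self.points_to[var].add(obj)
            delta.extend((succ, obj) for succ in self.assign[var])

pt = IncrementalPointsTo()
pt.add_alloc("a", "o1")
pt.add_assign("b", "a")            # b = a
pt.add_alloc("a", "o2")            # a later also points to o2
print(sorted(pt.points_to["b"]))   # ['o1', 'o2'] -- the new fact reaches b via the delta
```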
GAO Meng, ZHAO Jia-Cheng, CUI Hui-Min, FENG Xiao-Bing
2024, 35(6):2631-2647. DOI: 10.13328/j.cnki.jos.007097
Abstract:Register binding is a fundamental optimization problem in high-level synthesis, whose primary aim is to minimize the number of registers used while preserving circuit functionality. Conventional approaches apply register allocation algorithms from compilers to register binding, but they often overlook the inherent differences between the allocation and binding problems. As a result, they introduce additional resource constraints during binding and employ compilation optimizations that are unsuitable for digital circuit design, causing resource waste. To this end, this study transforms the register binding problem into a continuous multicoloring problem and proposes a heuristic solving method that combines bit widths and vertex degrees. The method characterizes and models variables at a finer granularity using information such as bit widths and live intervals, which further reduces register resource overhead compared with existing methods without inserting additional instructions. The proposed algorithm is compared with two typical algorithms. Experimental results show that it yields a theoretically optimal solution in 96.72% of the test cases in the MiBench benchmark, improving on the other two methods by 31.5% and 25.1%, respectively. On the Rosetta benchmark, the algorithm achieves the optimal solution in all test cases, outperforming the other two methods by 7.41% and 7.39%, respectively.
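To make the binding problem concrete, the following hedged sketch greedily binds variables with non-overlapping live intervals to the same register, ordering them by bit width and conflict degree. It only illustrates the general flavor of such heuristics, not the paper's continuous multicoloring algorithm; the data layout is invented.

```python
# Hedged sketch of a register-binding heuristic: variables with overlapping
# live intervals conflict; order them by (bit width, conflict degree) and
# greedily bind each to the first compatible register.

def overlaps(a, b):
    """Live intervals are (start, end) half-open ranges."""
    return a[0] < b[1] and b[0] < a[1]

def bind_registers(variables):
    """variables: list of dicts with 'name', 'width', 'live' keys."""
    # Conflict degree = number of other variables alive at the same time.
    degree = {v['name']: sum(overlaps(v['live'], u['live'])
                             for u in variables if u is not v)
              for v in variables}
    order = sorted(variables,
                   key=lambda v: (-v['width'], -degree[v['name']]))
    registers = []          # each register: list of variables bound to it
    binding = {}
    for v in order:
        for idx, reg in enumerate(registers):
            if all(not overlaps(v['live'], u['live']) for u in reg):
                reg.append(v)
                binding[v['name']] = idx
                break
        else:
            registers.append([v])
            binding[v['name']] = len(registers) - 1
    return binding

vars_ = [{'name': 'a', 'width': 32, 'live': (0, 4)},
         {'name': 'b', 'width': 8,  'live': (2, 6)},
         {'name': 'c', 'width': 32, 'live': (5, 9)}]
print(bind_registers(vars_))   # {'a': 0, 'c': 0, 'b': 1}
```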
FANG Yan-Fei, LI Yan-Bing, DONG En-Ming, WANG Yun-Fei, LIU Qi
2024, 35(6):2648-2667. DOI: 10.13328/j.cnki.jos.007098
Abstract:The on-chip memory hierarchy of Sunway many-core processors is an important structure for alleviating the many-core "memory access wall". The software-managed SPM structure and on-chip RMA communication mechanism bring many opportunities for improving application performance, but also pose great challenges for application development, optimization, and porting. To fully exploit the hierarchical features of on-chip memory, improve application performance, and reduce the burden of programming optimization on users, this study proposes a compiler optimization method that integrates multi-level memory access and communication. The method first designs a fusion compiler directive to convey high-level program information to the compiler. It then builds a compiler optimization revenue model and designs a heuristic iterative framework for solving the loop optimization scheme; the compiler completes the solution of the loop optimization scheme and the corresponding code transformation. DMA and RMA batch data transfer operations are generated at compile time, which buffer, in batches, core data from lower levels of the memory hierarchy with high access latency into higher levels with low access latency. Optimization experiments and analysis on three typical test cases show that the performance of programs optimized by this method is comparable to that of manual optimization and significantly better than that of the unoptimized versions.
ZHANG Hong-Bin, ZHOU Xu-Lin, XING Ming-Jie, WU Yan-Jun, ZHAO Chen
2024, 35(6):2668-2686. DOI: 10.13328/j.cnki.jos.007102
Abstract:Deep learning compilers have been widely employed with the rapid development of deep learning models and hardware architectures. At present, compilation optimization and tuning of deep learning models mainly rely on high-performance operator libraries and automatic compiler tuning. However, facing various target operators and the adaptation requirements of multiple hardware platforms, high-performance operator libraries must provide separate implementations for different architectures, and existing auto-tuning schemes suffer from substantial search overhead and limited interpretability. To this end, this study proposes AutoConfig, an automatic configuration mechanism for deep learning compilation optimization. Targeting different deep learning workloads and multiple hardware platforms, AutoConfig builds interpretable performance analysis models, performs a thorough assessment via static information extraction and dynamic overhead measurement, and automates algorithm selection and configuration tuning for code generation. The key innovation is combining an optimization analysis model with a configurable code generation strategy, which guarantees the acceleration effect and reduces repeated development overhead by simplifying the tuning process. Furthermore, this study integrates AutoConfig into the deep learning compiler Buddy Compiler, builds analysis models for convolution and matrix multiplication optimizations, and evaluates the optimizations on multiple SIMD hardware platforms. Experimental results indicate that AutoConfig effectively completes parameter configuration and algorithm selection in the code generation strategy, and that the code it generates achieves performance comparable to manually or automatically optimized code without incurring either the repeated manual-tuning implementation overhead or the auto-tuning search overhead.
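The toy sketch below shows the general shape of such a configuration decision: combine statically known workload parameters with a small measured cost table to pick a code generation option analytically rather than by search. The strategies, cost figures, and function names are made up for illustration and are not Buddy Compiler's or AutoConfig's actual interface.

```python
# Hypothetical sketch of analytical configuration selection: estimate the
# cost of each candidate SIMD width for a matmul workload and pick the
# cheapest, instead of searching empirically. All numbers are illustrative.

def estimate_cost(m, n, k, vector_width, vec_op_cost, scalar_op_cost):
    """Estimated cost of an m*n*k matmul with a given SIMD width."""
    vectorized = (n // vector_width) * vector_width
    vec_ops = m * k * (vectorized // vector_width)
    scalar_ops = m * k * (n - vectorized)        # remainder loop
    return vec_ops * vec_op_cost + scalar_ops * scalar_op_cost

def pick_config(m, n, k, platform):
    candidates = []
    for width in platform['vector_widths']:
        cost = estimate_cost(m, n, k, width,
                             platform['vec_op_cost'],
                             platform['scalar_op_cost'])
        candidates.append((cost, width))
    return min(candidates)       # (estimated cost, chosen vector width)

avx2_like = {'vector_widths': [4, 8], 'vec_op_cost': 1.0, 'scalar_op_cost': 0.9}
print(pick_config(64, 62, 64, avx2_like))
```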
XIE Wen-Bing, TIAN Xue, QI Feng-Bin, WU Cheng-Gang, WANG Jun, LUO Qiao-Ling
2024, 35(6):2687-2723. DOI: 10.13328/j.cnki.jos.007099
Abstract:With the rapid development of information technology, a variety of new processor architectures have emerged. New architectures bring opportunities for processor diversification but also pose great challenges: existing software must run compatibly on them to preserve a rich software ecosystem, yet it is difficult to recompile the large body of existing software from source in a short time. As a technique that migrates executable code directly at the binary level, binary translation supports cross-platform compatible execution of software, which not only expands the software ecosystem but also reduces the coupling between applications and hardware. In recent years, research on binary translation has made great progress. To summarize the existing achievements and remaining shortcomings, this study first introduces the classification of binary translation technology and typical binary translation systems, then analyzes and summarizes instruction translation methods, key issues, and optimization techniques, and expounds the core application fields of binary translation. Finally, potential research directions of binary translation technology are discussed.
CHEN Wen-Jie, YANG Qi-Liang, JIANG Zi-Yan, XING Jian-Chun, ZHOU Qi-Zhen, ZOU Rong-Wei, FENG Bo-Wei
2024, 35(6):2724-2752. DOI: 10.13328/j.cnki.jos.007101
Abstract:Swarm intelligence systems realize group-level application tasks through information interaction among neighboring individuals and have good robustness and flexibility. However, most developers struggle to describe the distributed and parallel interaction mechanisms among individuals. Some high-level languages allow users to program parallel swarm intelligence computing tasks in a serial mindset and from a global system perspective, without considering low-level interaction details such as communication protocols and data distribution. However, the huge semantic gap between user-oriented, globally declarative swarm intelligence applications and the parallel execution logic of individuals makes the compilation process complex and application development inefficient. Thus, this study proposes a compilation system, together with its supporting tools, that converts high-level swarm intelligence applications into secure and efficient distributed implementations. Through parallel information identification, computing division, and interaction information generation, the compilation system compiles globally and serially programmed swarm intelligence applications into parallel object code executed by individuals, so that users do not have to understand the complex interaction mechanisms among individuals. In addition, a standardized intermediate representation is designed to convert complex swarm intelligence computing tasks into a standardized sequence of semantic modules composed of swarm intelligence operators and input and output variables, which represents source program information in a platform-independent form and shields the heterogeneity of target hardware platforms. The system is deployed and tested on a case platform of swarm intelligence systems. The results show that the compilation system can compile swarm intelligence applications into platform-executable object code and improve application development efficiency, and that the generated code outperforms that of existing compilers on a series of benchmarks.
XIE Rui-Lin, CUI Zhan-Qi, CHEN Xiang, ZHENG Li-Wei
2024, 35(6):2753-2774. DOI: 10.13328/j.cnki.jos.006836
Abstract:Autonomous driving software based on deep neural networks (DNNs) has become the most popular solution. Like traditional software, DNNs can produce incorrect outputs or unexpected behaviors, and DNN-based autonomous driving software has already caused serious accidents that threaten life and property safety. Therefore, how to effectively test DNN-based autonomous driving software has become an urgent problem. Since it is difficult to predict and understand the behavior of DNNs, traditional software testing methods are no longer applicable. Existing testing methods for autonomous driving software generate test data by adding pixel-level perturbations to original images or by modifying the whole image. The generated test data differ considerably from real images, and the perturbation-based methods are hard to interpret. To solve these problems, this study proposes a test data generation method named interpretability-analysis-based test data generation (IATG). It first uses interpretation methods for DNNs to generate visual explanations of the decisions made by the autonomous driving software and chooses objects in the original images that have a significant impact on those decisions. It then generates test data by replacing the chosen objects with other objects with the same semantics. The generated test data are more similar to real images, and the generation process is more understandable. The steering angle prediction model, an important part of the decision-making module of autonomous driving software, is used in the experiments. Experimental results show that introducing the interpretation method effectively enhances IATG's ability to mislead the steering angle prediction model. Furthermore, at the same misleading angle, the test data generated by IATG are more similar to real images than those generated by DeepTest; IATG also has a stronger misleading ability than semSensFuzz, and its interpretation-analysis-based selection of important objects can effectively improve the misleading ability of semSensFuzz.
YANG Lan-Xin, ZHANG He, XU Jin-Wei, ZHANG Yi-Fan, WANG Zi-Kuan, ZHOU Xin, LI Jing-Yue, RONG Guo-Ping
2024, 35(6):2775-2794. DOI: 10.13328/j.cnki.jos.006904
Abstract:Code review is one of the best practices widely used in modern software development and is crucial for ensuring software quality and strengthening engineering capability. Code review comments (CRCs) are one of the main and most important outputs of code reviews: they reflect reviewers' perceptions of code quality and serve as references for authors to fix code defects and improve quality. Nowadays, although many software organizations have developed guidelines for performing code reviews, there are still few effective methods for evaluating the quality of CRCs. To provide an explainable and automated quality evaluation of CRCs, this study conducts a series of empirical studies, including literature reviews and case analyses. Based on the results, it proposes a multi-label learning-based approach for evaluating CRC quality. Experiments are carried out on a dataset from a large software enterprise comprising 17 000 CRCs from 34 commercial projects. The results indicate that the proposed approach can effectively evaluate the quality attributes and grades of CRCs. The study also reports modeling experience, such as CRC labeling and verification, to help software organizations struggling with code reviews better apply the proposed approach.
QIAN Zhong-Sheng, YU Qing-Yuan, ZHANG Ding, YAO Chang-Sen, QIN Lang-Yue, CHENG Yi-Wei
2024, 35(6):2795-2820. DOI: 10.13328/j.cnki.jos.006905
Abstract:Machine learning methods can be combined with software testing to enhance testing effectiveness, but few researchers have applied them to test data generation. To further improve the efficiency of test data generation, a chained model combining support vector machine (SVM) and extreme gradient boosting (XGBoost) is proposed, and multi-path test data generation is realized with a genetic algorithm built on this chained model. First, the study uses selected samples to train several sub-models (SVM and XGBoost) that predict the states of path nodes, filters out the best sub-models according to their prediction accuracy, and links the best sub-models in sequence according to the order of the path nodes, forming a chained model called chained SVM and XGBoost (C-SVMXGBoost). When the genetic algorithm generates test cases, the trained chained model is used instead of instrumentation to obtain the path covered by the test data (the predicted path); the study then finds the set of predicted paths similar to the target path, verifies those predicted paths by instrumentation to obtain the accurate paths, and calculates fitness values. During crossover and mutation, excellent test cases with large path-level depth in the sample set are reused to generate test data covering the target path. Finally, individuals with higher fitness across the evolutionary generations are saved and C-SVMXGBoost is updated to further improve testing efficiency. Experiments show that C-SVMXGBoost is better suited to path prediction and improves testing efficiency more than other chained models. Moreover, compared with existing classical methods, the proposed method increases coverage by up to 15% and reduces the mean number of evolutionary generations, by up to 65% on programs of large size.
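A minimal sketch of the chained-prediction idea follows: one predictor per path node guesses which branch an input takes, so the predicted path can be compared with the target path when computing GA fitness, without running the instrumented program. The per-node predictors here are plain callables standing in for the trained SVM/XGBoost sub-models; the fitness definition is a common prefix-matching choice, not necessarily the paper's.

```python
# Simplified sketch of chained path prediction for GA-based test generation.

def predict_path(x, node_predictors):
    """Chain the per-node predictors in path order."""
    return [predict(x) for predict in node_predictors]

def fitness(x, node_predictors, target_path):
    """Fitness = number of leading nodes whose predicted branch matches
    the target path (a longer common prefix means x is closer to covering it)."""
    predicted = predict_path(x, node_predictors)
    score = 0
    for p, t in zip(predicted, target_path):
        if p != t:
            break
        score += 1
    return score

# Toy program with three branch nodes: x > 0, x % 2 == 0, x < 100.
node_predictors = [lambda x: x > 0, lambda x: x % 2 == 0, lambda x: x < 100]
target_path = [True, True, True]
print(fitness(7, node_predictors, target_path))    # 1 (diverges at the 2nd node)
print(fitness(42, node_predictors, target_path))   # 3 (predicted to cover the target path)
```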
XIANG Yi, HUANG Han, LUO Chuan, YANG Xiao-Wei
2024, 35(6):2821-2843. DOI: 10.13328/j.cnki.jos.006906
Abstract:Software product line testing is challenging. The similarity-based testing method can improve testing coverage and fault detection rate by increasing the diversity of test suites. Due to its excellent scalability and satisfactory testing effects, the method has become one of the most important test methods for software product lines. How to generate diverse test cases and how to maintain the diversity of test suites are two key issues in this test method. To handle the above issues, this study proposes a software product line test algorithm based on diverse SAT solvers and novelty search (NS). Specifically, the algorithm simultaneously uses two types of diverse SAT solvers to generate diverse test cases. In particular, in order to improve the diversity of stochastic local search SAT solvers, the study proposes a general strategy that is based on a probability vector to generate candidate solutions. Furthermore, two archiving strategies inspired by the idea of the NS algorithm are designed and applied to maintain both the global and local diversity of the test suites. Ablation and comparison experiments on 50 real software product lines verify the effectiveness of both the diverse SAT solvers and the two archiving strategies, as well as the superiority of the proposed algorithm over other state-of-the-art algorithms.
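To illustrate the probability-vector idea in the simplest possible form, the sketch below tracks how often each feature was selected in earlier test cases and samples new candidate assignments biased toward the under-used value, which a SAT solver would then repair into a valid configuration. This conveys only the general diversification idea, not the paper's exact strategy; the data layout is invented.

```python
# Hedged sketch of probability-vector-guided candidate generation for
# diversifying a test suite of feature selections.
import random

def update_vector(prob, test_case):
    """prob['freq'][i] = fraction of previous test cases selecting feature i."""
    n = prob['count']
    for i, bit in enumerate(test_case):
        prob['freq'][i] = (prob['freq'][i] * n + bit) / (n + 1)
    prob['count'] = n + 1

def sample_candidate(prob):
    """Pick each feature with probability 1 - its historical frequency."""
    return [1 if random.random() < 1.0 - f else 0 for f in prob['freq']]

random.seed(0)
num_features = 6
prob = {'freq': [0.5] * num_features, 'count': 0}
suite = []
for _ in range(4):
    cand = sample_candidate(prob)   # would be handed to a SAT solver to be
    suite.append(cand)              # repaired into a valid configuration
    update_vector(prob, cand)
print(suite)
```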
SUN Chang-Ai, WU Si-Yi, ZHANG Shou-Feng, FU An
2024, 35(6):2844-2862. DOI: 10.13328/j.cnki.jos.006907
Abstract:Business process execution language (BPEL) is an executable web service composition language. Compared with traditional programs, BPEL programs differ significantly in programming model and execution mode. These features make it challenging to locate and fix faults detected in BPEL programs during testing, and fault fixing techniques developed for traditional software cannot be applied to BPEL programs directly. From the perspective of mutation analysis, this study proposes a template-matching-based fault fixing technique for BPEL programs, named BPELRepair. To overcome the high computational overhead of mutation-analysis-based fault fixing, a set of optimization strategies is proposed from three perspectives: patch generation, test case selection, and termination conditions. A supporting tool is developed to improve the automation and efficiency of fault fixing for BPEL programs. An empirical study evaluates the effectiveness of the proposed technique and optimization strategies. The results show that the technique can successfully fix about 53% of BPEL program faults, and the optimization strategies can significantly reduce the overhead of search matching, patch program verification, test case execution, and fault fixing.
ZHOU Guang-You, XIE Qi, YU Xiao
2024, 35(6):2863-2879. DOI: 10.13328/j.cnki.jos.006910
Abstract:Code search is an important research topic in natural language processing and software engineering. Developing efficient code search algorithms can significantly improve code reuse and the working efficiency of software developers. The task of code search is to retrieve code fragments that meet the requirements from a massive code repository, taking a natural language description of the desired functionality as input. Although DeepCS, a sequence-model-based code search method, has achieved promising results, it cannot capture the deep semantics of code. GraphSearchNet, a code search method based on graph embedding, alleviates this problem, but it does not perform fine-grained matching between code and text and ignores the global relationship between code graphs and text graphs. To address these limitations, this study proposes a code search method based on a relational graph convolutional network, which encodes the constructed text graphs and code graphs, performs fine-grained matching between text queries and code fragments at the node level, and applies a neural tensor network to capture their global relationship. Experimental results on two public datasets show that the proposed method achieves higher search accuracy than the state-of-the-art baseline models DeepCS and GraphSearchNet.
2024, 35(6):2880-2902. DOI: 10.13328/j.cnki.jos.006918
Abstract:Third-party library (TPL) detection is an upstream task in Android application security analysis, and its accuracy has a significant impact on downstream tasks such as malware detection, repackaged application detection, and privacy leakage detection. To improve detection accuracy and efficiency, this study proposes a package structure and signature-based TPL detection method named LibPass, built on the idea of pairwise comparison. LibPass combines primary module identification, TPL candidate identification, and fine-grained detection in a streamlined way. Primary module identification distinguishes the binary code of the main program from that of introduced TPLs to improve detection efficiency. On this basis, a two-stage detection method consisting of TPL candidate identification and fine-grained detection is proposed. TPL candidate identification exploits the stability of package structure features to resist application obfuscation and improve detection accuracy, and it identifies candidate TPLs by rapidly comparing package structure signatures, reducing the number of pairwise comparisons and thus improving detection efficiency. Fine-grained detection then accurately identifies the specific TPL version through a finer-grained but more costly pairwise comparison among the candidate TPLs. To validate the performance and efficiency of the method, three benchmark datasets are built to evaluate different detection capabilities, and experiments are conducted on them. The experimental results are analyzed in depth in terms of detection performance, detection efficiency, and obfuscation resistance, and they show that LibPass achieves high detection accuracy and efficiency and can handle various common obfuscation operations.
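For intuition, the sketch below computes a signature from only the shape of a package tree (how many classes and subpackages appear at each level), so that simple identifier-renaming obfuscation does not change it, and compares an in-app package against a known library. This is a hypothetical illustration of structure-based signatures, not LibPass's actual signature format.

```python
# Hypothetical package-structure signature for TPL candidate matching:
# hash the tree shape, not the (obfuscatable) names.
import hashlib

def structure_signature(pkg):
    """pkg = {'classes': int, 'children': [subpackages...]}; names omitted."""
    child_sigs = sorted(structure_signature(c) for c in pkg['children'])
    payload = f"{pkg['classes']}|{','.join(child_sigs)}"
    return hashlib.sha256(payload.encode()).hexdigest()

lib_in_app = {'classes': 3, 'children': [
    {'classes': 5, 'children': []},
    {'classes': 2, 'children': [{'classes': 1, 'children': []}]},
]}
known_tpl = {'classes': 3, 'children': [           # same structure, even though
    {'classes': 2, 'children': [{'classes': 1, 'children': []}]},
    {'classes': 5, 'children': []},                # package names were renamed
]}
print(structure_signature(lib_in_app) == structure_signature(known_tpl))  # True
```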
2024, 35(6):2903-2922. DOI: 10.13328/j.cnki.jos.006920
Abstract:The broad-learning-based dynamic fuzzy inference system (BL-DFIS) can automatically assemble simplified fuzzy rules and achieve high accuracy in classification tasks. However, when BL-DFIS works on large and complex datasets, it may generate too many fuzzy rules to achieve satisfactory accuracy, which adversely affects its interpretability. To circumvent this bottleneck, this study proposes a fuzzy neural network called feature-augmented random vector functional-link neural network (FA-RVFLNN) that achieves a good trade-off between classification performance and interpretability. In the proposed network, an RVFLNN with the original data as input is taken as the primary structure, and BL-DFIS is taken as a performance supplement, which means FA-RVFLNN contains direct links that boost the performance of the whole system. The inference mechanism of the primary structure can be explained by a fuzzy logic operator (I-OR), owing to the use of Sigmoid activation functions in its enhancement nodes, and the original input data with clear meaning also help explain the inference rules of the primary structure. With the support of direct links, FA-RVFLNN can learn more useful information through enhancement nodes, feature nodes, and fuzzy nodes. Experimental results indicate that FA-RVFLNN indeed eases the rule explosion caused by excessive enhancement nodes in the primary structure and improves the interpretability of BL-DFIS (the average number of fuzzy rules is reduced by about 50%), while remaining competitive in generalization performance and network size.
YAN Jing-Hui, ZONG Cheng-Qing, XU Jin-An
2024, 35(6):2923-2935. DOI: 10.13328/j.cnki.jos.006927
Abstract:Entity recognition is a key technology for information extraction. Compared with ordinary text, Chinese medical text often contains a large number of nested entities, yet previous entity recognition methods typically ignore the entity nesting rules specific to medical text and directly apply sequence labeling. Therefore, a Chinese entity recognition method that incorporates entity nesting rules is proposed. During training, the method transforms entity recognition into a joint task of entity boundary recognition and boundary head-tail relation recognition; during decoding, it filters the results with entity nesting rules summarized from real medical text, so that the recognition results conform to the composition patterns of nested inner and outer entities in actual text. The method achieves good results on public medical-text entity recognition tasks. Experiments on the dataset show that it significantly outperforms existing methods on nested entity recognition, and its overall accuracy is 0.5% higher than that of the state-of-the-art methods.
ZHOU Zhi-Yang, DOU Wen-Sheng, LI Shuo, KANG Liang-Yi, WANG Shuai, LIU Jie, YE Dan
2024, 35(6):2936-2950. DOI: 10.13328/j.cnki.jos.006928
Abstract:Detecting out-of-distribution (OOD) samples outside the training distribution is crucial for deploying deep neural network (DNN) classifiers in open environments. OOD detection is a binary classification problem that classifies an input as in-distribution (ID) or OOD, but the detector itself can be bypassed by malicious adversarial attacks; OOD samples with such malicious perturbations are called adversarial OOD samples. Building robust OOD detectors that can also detect adversarial OOD samples is more challenging. Existing methods usually train the DNN on adversarial OOD samples generated within the neighborhood of auxiliary clean OOD samples, so that it learns representations that are separable and robust to malicious perturbations. However, due to the distribution difference between the auxiliary OOD training set and the original ID training set, training on adversarial OOD samples is not effective enough to ensure the robustness of the ID boundary against adversarial perturbations. Adversarial ID samples, generated within the neighborhood of clean ID samples, are closer to the ID boundary and are also effective in improving its adversarial robustness. This study proposes a semi-supervised adversarial training approach, DiTing, to build robust OOD detectors that detect both clean and adversarial OOD samples. DiTing treats adversarial ID samples as auxiliary "near OOD" samples and trains on them jointly with other auxiliary clean and adversarial OOD samples to improve the robustness of OOD detection. Experiments show that DiTing has a significant advantage in detecting adversarial OOD samples generated by strong attacks while maintaining state-of-the-art performance in classifying clean ID samples and detecting clean OOD samples.
HU Zi-Rui, WENG Si-Yang, WANG Qing-Shuai, YU Rong, XU Jin-Kai, ZHANG Rong, ZHOU Xuan
2024, 35(6):2951-2973. DOI: 10.13328/j.cnki.jos.006901
Abstract:Hybrid transactional/analytical processing (HTAP) database systems have gained wide acceptance because they handle mixed workloads, i.e., transactions and analytical queries, within one system. To accomplish online analytical processing (OLAP) without degrading the write performance of online transactional processing (OLTP), most HTAP database systems maintain multiple data versions or additional replicas, which leads to a consistency problem between the TP and AP versions of the data. Meanwhile, HTAP database systems face the core challenge of achieving efficient data sharing under resource isolation, and the data-sharing model embodies the trade-off between business requirements for performance and for data freshness. To systematically explain the data-sharing models and optimization strategies of existing HTAP database systems, this study first uses consistency models to define the data-sharing model and, according to the differences between TP-generated versions and the versions read by AP queries, classifies the consistency models for HTAP data sharing into three categories: linear consistency, sequential consistency, and session consistency. It then dissects the whole data-sharing process around three core issues, namely data version number distribution, data version synchronization, and data version tracking, and presents implementation methods for the different consistency models. Furthermore, a dozen classic and popular HTAP database systems are taken as examples for an in-depth interpretation of these implementation methods. Finally, the study summarizes and analyzes optimization strategies for the version synchronization, tracking, and recycling modules involved in data sharing and predicts the directions in which data-sharing models may be optimized. It concludes that adaptivity of the data synchronization scope, self-tuning of the data synchronization cycle, and freshness-bound constraint control under sequential consistency are possible means toward better performance and higher freshness in HTAP database systems.
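As a toy illustration of how the consistency category affects which version an AP query may read, the sketch below selects a read version under three simplified requirements: linear consistency must see every committed TP write, session consistency must include at least the session's own writes, and sequential consistency may read any prefix of the commit order the replica has synced. The interface and version numbers are invented for illustration only.

```python
# Toy read-version selection for an AP query over a versioned replica.

def choose_read_version(latest_tp_version, replica_synced_version,
                        session_last_write, mode):
    if mode == "linear":        # must reflect every committed TP write
        return latest_tp_version
    if mode == "session":       # must include this session's own writes
        return max(replica_synced_version, session_last_write)
    if mode == "sequential":    # any synced prefix of the commit order is acceptable
        return replica_synced_version
    raise ValueError(mode)

# TP has committed up to version 120; the AP replica has synced to 100;
# this session's last write committed at version 110.
for mode in ("linear", "sequential", "session"):
    print(mode, choose_read_version(120, 100, 110, mode))
```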
QIAN Zhong-Sheng, YE Zu-Lai, YAO Chang-Sen, ZHANG Ding, HUANG Heng, QIN Lang-Yue
2024, 35(6):2974-2998. DOI: 10.13328/j.cnki.jos.006915
Abstract:Driven by mature data mining technologies, recommendation systems can efficiently exploit explicit and implicit information such as rating data and behavior traces, and combine it with complex and advanced deep learning techniques to achieve good results. At the same time, application requirements are driving deeper mining and utilization of basic data and lighter-weight techniques, both of which have become research hotspots. On this basis, a lightweight recommendation model, LG_APIF, is proposed, which uses graph convolutional networks (GCN) to deeply integrate information. Following behavior memory, the model employs the Ebbinghaus forgetting curve to simulate the change of users' interests over time and adopts linear regression and other relatively lightweight traditional methods to mine the adaptive period and other in-depth information of items. It then analyzes the current distribution of users' interests and calculates the interest value of each item to obtain users' potential interest types, constructs a graph structure over user-type-item triples, and uses a load-reduced GCN to generate the final item recommendation list. Experiments verify the effectiveness of the proposed method: compared with eight classical models on the Last.fm, Douban, Yelp, and MovieLens datasets, it improves Precision, Recall, and NDCG, with average improvements of 2.11% in Precision, 1.01% in Recall, and 1.48% in NDCG.
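For reference, a common form of the Ebbinghaus forgetting curve is R(t) = exp(-t/S), where t is the elapsed time and S a memory-strength parameter; older interactions then contribute less to the current interest score. The sketch below applies this weighting to a toy interaction history; the curve form and parameter value are standard choices and not necessarily those used by LG_APIF.

```python
# Interest decay with an Ebbinghaus-style forgetting curve, R(t) = exp(-t / S).
import math

def interest_score(interactions, now, strength=7.0):
    """interactions: list of (timestamp_in_days, weight) for one item type."""
    return sum(w * math.exp(-(now - t) / strength) for t, w in interactions)

clicks_on_jazz = [(0, 1.0), (3, 1.0), (9, 2.0)]   # (day, interaction weight)
print(round(interest_score(clicks_on_jazz, now=10), 3))  # recent clicks dominate
```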
XU Lan-Tian, LI Rong-Hua, DAI Yong-Heng, WANG Guo-Ren
2024, 35(6):2999-3012. DOI: 10.13328/j.cnki.jos.006926
Abstract:Hypergraphs are a generalization of ordinary graphs and are common in many application areas, including the Internet, bioinformatics, and social networks. The independent set problem is a fundamental research problem in graph analysis. Most traditional independent set algorithms target ordinary graphs, and how to mine maximum independent sets efficiently on hypergraph data is an urgent problem. To address it, this study gives a definition of hypergraph independent sets. It first analyzes two properties of hypergraph independent set search and then proposes a basic algorithm based on a greedy strategy. It further proposes a pruning framework for approximate maximum independent set search on hypergraphs that combines exact pruning with approximate pruning: exact pruning strategies reduce the size of the graph, while approximate pruning strategies speed up the search. In addition, four efficient pruning strategies are proposed, each with a theoretical proof. Finally, experiments on 10 real hypergraph datasets show that the pruning algorithm can efficiently find approximate maximum independent sets that are close to the true results.
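A baseline greedy sketch follows, under one common definition in which a vertex set is independent if it contains no hyperedge entirely (the paper gives its own definition, which may differ): try vertices in order of hyperdegree and add each one only if the set stays independent. This illustrates the greedy baseline idea only, not the paper's pruning framework.

```python
# Baseline greedy sketch for hypergraph independent sets.

def greedy_independent_set(vertices, hyperedges):
    degree = {v: sum(v in e for e in hyperedges) for v in vertices}
    chosen = set()
    for v in sorted(vertices, key=lambda v: degree[v]):
        tentative = chosen | {v}
        # Keep v only if no hyperedge becomes fully contained in the set.
        if not any(set(e) <= tentative for e in hyperedges):
            chosen = tentative
    return chosen

V = [1, 2, 3, 4, 5]
E = [{1, 2, 3}, {3, 4}, {2, 5}]
print(greedy_independent_set(V, E))   # e.g. {1, 4, 5}
```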
ZHU Guang-Hui, CHEN Wen-Zhong, ZHU Zhen-Nan, YUAN Chun-Feng, HUANG Yi-Hua
2024, 35(6):3013-3035. DOI: 10.13328/j.cnki.jos.006894
Abstract:Deep learning has achieved great success in image classification, natural language processing, and speech recognition. Data augmentation can effectively increase the scale and diversity of training data and thereby improve the generalization of deep learning models. However, for a given dataset, designing a good augmentation strategy relies heavily on expert experience and domain knowledge and requires repeated trials, which is time-consuming and labor-intensive. In recent years, automated data augmentation, which designs augmentation strategies automatically, has attracted wide attention from academia and industry. To address the problem that existing automated data augmentation algorithms cannot strike a good balance between prediction accuracy and search efficiency, this study proposes SGES AA, an efficient automated data augmentation algorithm based on a self-guided evolution strategy. First, an effective continuous vector representation is designed for augmentation strategies, and the automated data augmentation problem is converted into a search problem over continuous strategy vectors. Second, a strategy vector search method based on the self-guided evolution strategy is presented; by introducing historical gradient estimates to guide the sampling and updating of exploration points, it effectively avoids local optima while improving the convergence of the search. Extensive experiments on image, text, and speech datasets show that the proposed algorithm matches or outperforms the current best automated data augmentation methods without significantly increasing the time consumption of the search.
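To convey the flavor of a history-guided evolution strategy step, the sketch below mixes isotropic Gaussian perturbations with the direction of the previous gradient estimate, estimates a new gradient from antithetic evaluations, and ascends on a toy objective. All constants, the mixing scheme, and the objective are illustrative only and are not SGES AA's actual settings.

```python
# Minimal sketch of a guided evolution-strategy (ES) ascent step.
import math
import random

def guided_es_step(theta, objective, prev_grad, sigma=0.1, alpha=0.5,
                   pop=8, lr=0.05):
    """Perturbations mix an isotropic Gaussian with the previous gradient
    estimate's direction, so the search is guided by its own history."""
    dim = len(theta)
    norm = math.sqrt(sum(g * g for g in prev_grad)) or 1.0
    guide = [g / norm for g in prev_grad]
    grad_est = [0.0] * dim
    for _ in range(pop):
        scale = random.gauss(0, 1)
        eps = [math.sqrt(1 - alpha) * random.gauss(0, 1)
               + math.sqrt(alpha) * scale * g for g in guide]
        # Antithetic evaluation around theta.
        plus = objective([t + sigma * e for t, e in zip(theta, eps)])
        minus = objective([t - sigma * e for t, e in zip(theta, eps)])
        for d in range(dim):
            grad_est[d] += (plus - minus) * eps[d] / (2 * sigma * pop)
    return [t + lr * g for t, g in zip(theta, grad_est)], grad_est

random.seed(1)
objective = lambda x: -(x[0] - 1) ** 2 - (x[1] + 2) ** 2   # maximum at (1, -2)
theta, grad = [0.0, 0.0], [1.0, 1.0]
for _ in range(60):
    theta, grad = guided_es_step(theta, objective, grad)
print([round(t, 2) for t in theta])   # drifts toward (1, -2)
```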
HU Kai, JIANG Shuai, LIU Dong, GAO Xie-Ping
2024, 35(6):3036-3051. DOI: 10.13328/j.cnki.jos.006895
Abstract:Morphological changes in retinal boundaries are important indicators of retinal diseases, and such subtle changes can be captured by optical coherence tomography (OCT) images. Retinal layer boundary segmentation based on OCT images can assist the clinical assessment of related diseases. In OCT images, because the retinal boundaries vary greatly in morphology, key boundary-related information such as context and salient boundaries is crucial for judging and segmenting layer boundaries; however, existing segmentation methods do not take this information into account, which results in incomplete and discontinuous boundaries. To solve these problems, this study proposes a coarse-to-fine method for retinal layer boundary segmentation in OCT images that combines an end-to-end deep neural network with graph search (GS) and avoids the "fault" phenomenon common in non-end-to-end methods. In the coarse stage, an end-to-end attention global residual network (AGR-Net) is proposed to extract the above key information more fully and effectively. Specifically, a global feature module (GFM) captures the global context of an OCT image by scanning it from four directions; the channel attention module (CAM) and GFM are then combined in sequence and embedded into the backbone network to model the saliency of the retina and its boundaries in context. This effectively addresses incorrect segmentation caused by retinal deformation and insufficient information extraction in OCT images. In the fine stage, a GS algorithm removes isolated regions and holes from the coarse segmentation produced by AGR-Net, so that the boundaries keep a fixed topology and remain continuous and smooth, further improving the overall results and providing a more complete reference for clinical diagnosis. Finally, the method is evaluated from different perspectives on two public datasets and compared with the latest methods. The comparative experiments show that it outperforms existing methods in segmentation accuracy and stability.
ZHANG Hao-Nan, GUO Jie, QIN Hao-Yu, FU Xi-Hao, GUO Yan-Wen
2024, 35(6):3052-3068. DOI: 10.13328/j.cnki.jos.006921
Abstract:With the development of modern information technology, people's demand for high-resolution, realistic visual perception on display devices has increased, which places higher requirements on computer hardware and software and poses many challenges to rendering technology in terms of performance and workload. Using machine learning technologies such as deep neural networks to improve the quality and performance of rendered images has become a popular research direction in computer graphics, and upsampling low-resolution images through network inference to obtain clearer high-resolution images is an important way to improve image generation performance while preserving high-resolution details. The geometry buffers (G-buffers) produced by the rendering engine during rendering contain much semantic information, which helps the network learn scene information and features effectively and thus improves the quality of the upsampled results. This study designs a deep-neural-network-based super-resolution method for rendered content in low resolution. In addition to the color image of the current frame, the method uses high-resolution G-buffers to assist the computation and reconstruct high-resolution details. It also adopts a new strategy to fuse the features of the high-resolution buffers and the low-resolution images, performing multi-scale fusion of different feature information in a dedicated fusion module. Experiments demonstrate the effectiveness of the proposed fusion strategy and module, and the method shows clear advantages over other image super-resolution methods, especially in preserving high-resolution details.