• Volume 34, Issue 11, 2023 Table of Contents
    • Parallel Structured Sparse Triangular Solver for GPU Platform

      2023, 34(11):4941-4951. DOI: 10.13328/j.cnki.jos.006720

      Abstract:The sparse triangular solver (SpTRSV) is a vital operation in preconditioners. In particular, in scientific computing programs that solve partial differential equation systems iteratively, structured SpTRSV is a common type of problem and often a performance bottleneck that needs to be addressed. The commercial mathematical libraries tailored to the graphics processing unit (GPU) platform, represented by cuSPARSE, parallelize SpTRSV operations by level-scheduling methods. However, this method suffers from time-consuming preprocessing and severe GPU thread idleness when it is employed to deal with structured SpTRSV problems. This study proposes a parallel algorithm tailored to structured SpTRSV problems. The proposed algorithm leverages the special non-zero element distribution pattern of structured SpTRSV problems during task allocation to skip the preprocessing and analysis of the non-zero element structure of the input problem. Furthermore, the element-wise operation strategy used in the existing level-scheduling methods is modified. As a result, the problem of GPU thread idleness is effectively alleviated, and the memory access latency of some non-zero elements in the matrix is concealed. This study also adopts a state variable compression technique according to the task allocation characteristics of the proposed algorithm, significantly improving the cache hit rate of the algorithm in state variable operations. Additionally, several hardware features of the GPU, including predicated execution, are investigated to comprehensively optimize the algorithm implementation. The proposed algorithm is tested on an NVIDIA V100 GPU, achieving an average 2.71× speedup over cuSPARSE and a peak effective memory-access bandwidth of 225.2 GB/s. The modified element-wise operation strategy, combined with a series of other optimization measures for GPU hardware, attains a prominent optimization effect by yielding a nearly 115% increase in the effective memory-access bandwidth of the proposed algorithm.
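
      For context, level-scheduling, the baseline the abstract contrasts against, groups the rows of a lower-triangular system into dependency levels so that all rows in one level can be solved in parallel. Below is a minimal Python sketch of that baseline on a CSR-stored matrix; the function and variable names are illustrative and not from the paper, and the paper's structured-matrix optimizations are not reproduced here.

        def build_levels(indptr, indices, n):
            level = [0] * n
            for row in range(n):
                # A row sits one level past its deepest dependency.
                for k in range(indptr[row], indptr[row + 1]):
                    col = indices[k]
                    if col < row:
                        level[row] = max(level[row], level[col] + 1)
            buckets = {}
            for row, lv in enumerate(level):
                buckets.setdefault(lv, []).append(row)
            return [buckets[lv] for lv in sorted(buckets)]

        def sptrsv(indptr, indices, data, b):
            x = list(b)
            for rows in build_levels(indptr, indices, len(b)):
                for row in rows:  # rows in one level are independent; a GPU
                    s = x[row]    # kernel would process them in parallel
                    diag = None
                    for k in range(indptr[row], indptr[row + 1]):
                        col = indices[k]
                        if col < row:
                            s -= data[k] * x[col]
                        elif col == row:
                            diag = data[k]
                    x[row] = s / diag
            return x

        # L = [[2, 0], [1, 4]], b = [2, 6]  ->  x = [1.0, 1.25]
        print(sptrsv([0, 1, 3], [0, 0, 1], [2.0, 1.0, 4.0], [2.0, 6.0]))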

    • Effective Implementation of Matrix Inversion Based on Batched LU Decomposition on GPU

      2023, 34(11):4952-4972. DOI: 10.13328/j.cnki.jos.006727

      Abstract:This study presents existing and optimized implementations of batched lower-upper (LU) matrix decomposition and batched inversion algorithms on the graphics processing unit (GPU). For batched LU decomposition, the study analyzes the number of reads and writes to the global memory when the Left-looking, Right-looking, and other commonly used blocked LU decomposition algorithms are implemented on the GPU. The blocked Left-looking algorithm, which accesses less memory data, is selected due to the characteristics of the GPU architecture. In the pivoting process during LU decomposition, a parallel binary tree search algorithm suitable for the GPU architecture is adopted. In addition, to reduce the performance impact of the row interchanges caused by pivoting, this study proposes two optimization techniques, namely, Warp-based packet row interchange and row interchange delay. For batched inversion after LU decomposition, this study investigates the correction method employed in the matrix inversion process. When batched inversion is implemented on the GPU, a blocked matrix inversion algorithm with delayed correction is adopted to reduce access to the global memory during correction. Furthermore, to speed up data reading and writing, the study uses more registers and shared memory and performs column interchanges to reduce the amount of memory access data. In addition, a method of dynamic GPU resource allocation at runtime is proposed to avoid idle threads and the waste of shared memory and other GPU resources. Compared with the static one-time resource allocation method, the dynamic allocation method improves the performance of the algorithm significantly. Finally, 10000 random matrices with sizes between 33 and 190 are tested on the TITAN V GPU, with four tested data types: single-precision complex, double-precision complex, single-precision real, and double-precision real. The floating-point arithmetic performance of the batched LU decomposition algorithm implemented in this study reaches about 2 TFLOPS, 1.2 TFLOPS, 1 TFLOPS, and 0.67 TFLOPS, respectively. This algorithm achieves the highest speedup of about 9×, 8×, 12×, and 13×, respectively, compared with the implementation in cuBLAS, and about 1.2×–2.5×, 1.2×–3.2×, 1.1×–3×, and 1.1×–2.7×, respectively, compared with the implementation in MAGMA. The floating-point arithmetic performance of the proposed batched inversion algorithm reaches about 4 TFLOPS, 2 TFLOPS, 2.2 TFLOPS, and 1.2 TFLOPS, respectively. This algorithm achieves the highest speedup of about 5×, 4×, 7×, and 7×, respectively, compared with the implementation in cuBLAS, and about 2×–3×, 2×–3×, 2.8×–3.4×, and 1.6×–2×, respectively, compared with the implementation in MAGMA.
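
      As background for the pivoting and row-interchange optimizations described above, the following Python/NumPy sketch shows a textbook unblocked LU decomposition with partial pivoting; the comments mark the two steps the paper replaces with a parallel binary-tree pivot search and warp-packed, delayed row interchanges. This is only a baseline illustration, not the paper's blocked Left-looking GPU implementation.

        import numpy as np

        def lu_partial_pivot(A):
            # Textbook unblocked LU with partial pivoting: PA = LU, with L and U
            # packed into A and the permutation recorded in piv.
            A = A.astype(float).copy()
            n = A.shape[0]
            piv = np.arange(n)
            for k in range(n):
                p = k + int(np.argmax(np.abs(A[k:, k])))  # pivot search (paper:
                                                          # parallel binary-tree
                                                          # reduction on the GPU)
                if p != k:
                    A[[k, p]] = A[[p, k]]                 # row interchange (paper:
                    piv[[k, p]] = piv[[p, k]]             # warp-packed and delayed)
                A[k + 1:, k] /= A[k, k]
                A[k + 1:, k + 1:] -= np.outer(A[k + 1:, k], A[k, k + 1:])
            return A, piv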

    • Unsupervised Cross-modal Hash Retrieval Fusing Multiple Instance Relations

      2023, 34(11):4973-4988. DOI: 10.13328/j.cnki.jos.006742

      Abstract:Most cross-modal hash retrieval methods only use cosine similarity for feature matching, employ a single calculation method, and do not take into account the impact of instance relations on performance. For this reason, the study proposes a novel method based on reasoning over multiple instance relation graphs. Global and local instance relation graphs are generated by constructing similarity matrices to fully explore the fine-grained relations among the instances. Similarity reasoning is then conducted on the basis of the multiple instance relation graphs. For this purpose, reasoning is first performed within the relation graphs of the image and text modalities, respectively. Then, the relations within each modality are mapped to the instance graphs for reasoning. Finally, reasoning within the instance graphs is performed. Furthermore, the neural network is trained by a step-by-step training strategy to adapt to the features of the image and text modalities. Experiments on the MIRFlickr and NUS-WIDE datasets demonstrate that the proposed method has distinct advantages in the mean average precision (mAP) metric and obtains a favorable Top-k-precision curve. This also indicates that the proposed method deeply explores instance relations and thereby significantly improves the retrieval performance.

    • Review Articles
    • Formal Verification of Consensus Protocols: Survey and Perspective

      2023, 34(11):4989-5007. DOI: 10.13328/j.cnki.jos.006684

      Abstract:Distributed systems play an important role in computing environments. Consensus protocols are employed to guarantee consistency among nodes. Design errors in consensus protocols might cause failures in system operation and bring catastrophic consequences to humans and the environment. Therefore, it is important to prove the correctness of consensus protocols. Formal verification can strictly prove the correctness of target properties in designed models, which makes it suitable for verifying consensus protocols. However, the expanding scale of distributed systems brings more complicated issues and challenges to the formal verification of consensus protocols. How to formally verify the designs of consensus protocols and how to scale up the verification are significant research issues in this field. This study investigates the current research on employing formal methods to verify consensus protocols, summarizes the key modeling methods and verification technologies, and proposes future research directions in this field.

    • State-of-the-art Survey on Fuzz Testing for Deep Learning System

      2023, 34(11):5008-5028. DOI: 10.13328/j.cnki.jos.006679

      Abstract:Deep learning (DL) systems have powerful learning and reasoning capabilities and are widely employed in many fields, including unmanned vehicles, speech recognition, and intelligent robotics. Due to dataset limitations and the dependence on manually labeled data, DL systems are prone to unexpected behaviors. Accordingly, the quality of DL systems has received widespread attention in recent years, especially in safety-critical fields. Fuzz testing, with its strong fault-detecting ability, has been utilized to test DL systems and has become a research hotspot. This study summarizes existing fuzz testing for DL systems in the aspects of test case generation (including seed queue construction, seed selection, and seed mutation), test result determination, and coverage analysis. Additionally, commonly used datasets and metrics are introduced. Finally, the study looks ahead to the future development of this field.
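
      The survey's decomposition of fuzz testing into seed queue construction, seed selection, seed mutation, test result determination, and coverage analysis can be summarized in a skeletal loop. The Python sketch below is a generic illustration under assumed placeholder hooks (mutate, coverage, is_failure); it is not the API of any surveyed tool.

        import random

        def fuzz_dl(model, seeds, mutate, coverage, is_failure, budget=1000):
            # Skeletal coverage-guided fuzzing loop; mutate/coverage/is_failure
            # are placeholder hooks, and coverage returns a set of covered
            # units (e.g., activated neurons).
            queue = list(seeds)                    # seed queue construction
            seen, failures = set(), []
            for _ in range(budget):
                seed = random.choice(queue)        # seed selection (uniform here)
                candidate = mutate(seed)           # seed mutation
                output = model(candidate)
                if is_failure(candidate, output):  # test result determination
                    failures.append(candidate)
                cov = coverage(model, candidate)   # coverage analysis
                if cov - seen:                     # keep inputs that reach new
                    seen |= cov                    # behavior in the queue
                    queue.append(candidate)
            return failures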

    • GoGCN for Interaction Prediction Between Classes in Software System

      2023, 34(11):5029-5041. DOI: 10.13328/j.cnki.jos.006678

      Abstract:As a software system is a complex artifact, the interaction between classes exerts a potential impact on software quality, with the cascading propagation effect of software defects as a typical case. How to accurately predict the reasonable relationships between classes in a software system and optimize the design structure is still an open problem in software quality assurance. From the perspective of software networks, this study comprehensively considers the interactions between classes in a software system (class external graph, CEG) and those between the internal methods of each class (class internal graph, CIG). The software system is abstracted into a software network with a graph-of-graphs structure. On this basis, a class interaction prediction method based on the graph-of-graphs convolutional network is proposed. Firstly, the initial characteristics of class nodes are obtained through the convolution of each CIG. Then the representation vectors of class nodes are updated through the convolution of the CEG, and finally, the evaluation values between class nodes are calculated for interaction prediction. The experimental results on six Java open source projects show that the graph-of-graphs structure is helpful to improve the representation of software system structure. The average growth rates of the area under the curve (AUC) and average precision (AP) of the proposed method are more than 5.5% compared with those of conventional network embedding methods. In addition, the average growth rates of AUC and AP are more than 9.36% and 5.22%, respectively, compared with those of the two peer methods.

    • BETASCO: Consortium Blockchain System Based on Smart Contract-oriented Sharding

      2023, 34(11):5042-5057. DOI: 10.13328/j.cnki.jos.006741

      Abstract:Blockchain-based decentralized applications have been providing robust, reliable, and durable services in multiple fields, such as encrypted digital currency, cloud storage, and the Internet of Things. However, the throughput capacity of blockchains can no longer meet the increasing performance requirements of decentralized applications. Sharding is the current mainstream performance optimization technology for blockchains. Nevertheless, the existing blockchain sharding approaches, which mainly focus on transactions among users, are not always applicable to decentralized applications dominated by smart-contract invocation transactions. To solve the above problem, this study designs and implements the consortium blockchain system BETASCO based on smart contract-oriented sharding. BETASCO provides a shard that serves as an independent execution environment for each smart contract, routes transactions to the shards holding the target smart contracts through a contract location service based on the distributed hash table (DHT), and supports communication and collaboration needs across smart contracts through an asynchronous invocation mechanism among smart contracts. By virtualizing the nodes, BETASCO allows each node to join multiple shards to support parallel executions of multiple smart contracts on the same set of nodes. Experimental results show that the overall throughput capacity of BETASCO increases linearly as the number of smart contracts grows, and its throughput capacity for the execution of a single smart contract is comparable to that of Hyperledger Fabric.
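
      To illustrate the idea of a DHT-based contract location service, the following Python sketch maps a contract address onto a hash ring whose arcs are owned by shards, in the style of consistent hashing. The class and method names are hypothetical; BETASCO's actual routing scheme is not reproduced here.

        import hashlib
        from bisect import bisect_right

        def h(key: str) -> int:
            return int(hashlib.sha256(key.encode()).hexdigest(), 16)

        class ContractLocator:
            # Toy DHT-style location service: each shard owns an arc of a hash
            # ring, and a contract maps to the first shard at or after its hash.
            def __init__(self, shard_ids):
                self.ring = sorted((h(s), s) for s in shard_ids)

            def shard_for(self, contract_addr: str) -> str:
                keys = [k for k, _ in self.ring]
                i = bisect_right(keys, h(contract_addr)) % len(self.ring)
                return self.ring[i][1]

        locator = ContractLocator(["shard-0", "shard-1", "shard-2"])
        print(locator.shard_for("0xdeadbeef"))  # shard to route this tx to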

    • Review Articles
    • Survey on Learning Communities in Online Education Environment

      2023, 34(11):5058-5083. DOI: 10.13328/j.cnki.jos.006735

      Abstract:With the in-depth integration of information technology and education, booming online education has become the new normal of the education informatization process and generated massive amounts of education data. However, online education also faces high dropout rates, low course completion rates, insufficient supervision, and other problems. How to mine and analyze the massive education data is the key to solving these problems. A learning community is a learning organization with learners as its core element, and it emphasizes the interactive communication, resource sharing, and collaborative learning among the learners in the learning process so that common learning tasks or goals can be completed. This study reviews, analyzes, and discusses the prospect of the research on learning communities in the online education environment. Firstly, the background and importance of learning communities in the online education environment are outlined. Secondly, the definitions of a learning community in different disciplines are presented. Thirdly, the construction methods for three types of learning communities, namely, homogeneous, heterogeneous, and hybrid learning communities, are summarized. Fourthly, the management mechanism for learning communities is discussed from the three aspects of sharing, collaboration, and incentive. Last but not least, directions for future research on learning communities are suggested.

    • Influence Maximization for Signed Networks Based on Evolutionary Deep Reinforcement Learning

      2023, 34(11):5084-5112. DOI: 10.13328/j.cnki.jos.006728

      Abstract:In recent years, the research on influence maximization (IM) for social networks has attracted extensive attention from the scientific community due to the emergence of major application issues, such as information dissemination on the Internet and the blocking of COVID-19's transmission chain. IM aims to identify a set of optimal influence seed nodes that maximize the influence of information dissemination under the propagation model for a specific application issue. The existing IM algorithms mainly focus on unidirectional-link influence propagation models and model IM as the problem of optimizing the selection of discrete influence seed node combinations. However, they have a high computational time complexity and cannot be applied to IM problems for signed networks with large-scale conflicting relationships. To solve the above problems, this study starts by building a positive-negative influence propagation model and an IM optimization model readily applicable to signed networks. Then, the problem of selecting discrete seed node combinations is transformed into one of continuous network weight optimization for easier optimization by introducing a deep Q network composed of neural networks to select seed node sets. Finally, this study devises an IM algorithm based on evolutionary deep reinforcement learning for signed networks (SEDRL-IM). SEDRL-IM views the individuals in the evolutionary algorithm as policies and combines the gradient-free global search of the evolutionary algorithm with the local search characteristics of reinforcement learning. In this way, it achieves an effective search for the optimal solution to the weight optimization problem of the deep Q network and further obtains the set of optimal influence seed nodes. Experiments are conducted on benchmark signed networks and real-world social network datasets. The extensive results show that the proposed SEDRL-IM algorithm is superior to the classical benchmark algorithms in both the influence propagation range and the solution efficiency.

    • Low-resource Neural Machine Translation with Multi-strategy Prototype Generation

      2023, 34(11):5113-5125. DOI: 10.13328/j.cnki.jos.006689

      Abstract:In rich-resource scenarios, using similarity translation as the target prototype sequence can improve the performance of neural machine translation. However, in low-resource scenarios, due to the lack of parallel corpus resources, the prototype sequence cannot be matched, or the sequence quality is poor. To address this problem, this study proposes a low-resource neural machine translation approach with multi-strategy prototype generation, and the approach includes two phases. (1) Keyword matching and distributed representation matching are combined to retrieve prototype sequences, and the pseudo prototype generation approach is leveraged to generate available prototype sequences during retrieval failures. (2) The conventional encoder-decoder framework is improved for the effective employment of prototype sequences. The encoder side utilizes additional encoders to receive prototype sequences. The decoder side, while employing a gating mechanism to control information flow, adopts improved loss functions to reduce the negative impact of low-quality prototype sequences on the model. The experimental results on multiple datasets show that the proposed method can effectively improve the translation performance compared with the baseline models.

    • Deep Knowledge Tracing Model Based on Embedding of Fused Multiple Concepts

      2023, 34(11):5126-5142. DOI: 10.13328/j.cnki.jos.006724

      Abstract:The task of knowledge tracing is to trace the changes in students' knowledge state and predict their future performance in learning according to their historical learning records. In recent years, knowledge tracing models based on attention mechanisms have proved markedly superior to traditional knowledge tracing models in both flexibility and prediction performance. However, taking into account only exercises involving a single concept, most of the existing deep models cannot directly deal with exercises involving multiple concepts, which are nevertheless vast in intelligent education systems. In addition, how to improve interpretability is one of the key challenges facing deep knowledge tracing models. To solve the above problems, this study proposes a deep knowledge tracing model based on the embedding of fused multiple concepts that considers the relationships among the concepts in exercises involving multiple concepts. Furthermore, the study puts forward two novel embedding methods for multiple concepts and combines educational psychology models with forgetting factors to improve prediction performance and interpretability. Experiments reveal the superiority of the proposed model over existing models in prediction performance on large-scale real datasets and verify the effectiveness of each module of the proposed model.

    • Adversarial Sample Generation Method Based on Chinese Features

      2023, 34(11):5143-5161. DOI: 10.13328/j.cnki.jos.006744

      Abstract:Deep neural networks are vulnerable to attacks from adversarial samples. For instance, in a text classification task, the model can be fooled into changing its classification result by modifying a few characters, words, or punctuation marks in the original text. Currently, studies of Chinese adversarial samples are limited in the field of natural language processing (NLP), and they fail to give due consideration to the language features of Chinese. Focusing on Chinese sentiment classification scenarios and taking into account the pictographic, alphabetic, and other language features of Chinese, this study proposes CWordCheater, a high-quality character-level and word-level method for generating adversarial samples that covers the aspects of pronunciation, glyphs, and punctuation marks. The ConvAE network is adopted to embed Chinese visual vectors for the replacement modes of visually similar characters and further obtain the candidate pool of such characters for replacement. Moreover, a semantic constraint method based on the universal sentence encoder (USE) distance is proposed to avoid semantic drift in the adversarial samples. Finally, the study proposes a set of multi-dimensional evaluation methods to evaluate the quality of adversarial samples from the two aspects of attack effect and attack cost. Experiment results show that CWordCheater can reduce the classification accuracy by at least 27.9% on multiple classification models and multiple datasets and incurs a lower perturbation cost in terms of vision and semantics.

    • Multi-view Microblog Topic Detection Based on Heterogeneous Social Context

      2023, 34(11):5162-5178. DOI: 10.13328/j.cnki.jos.006729

      Abstract:Social media topic detection aims to mine latent topic information from large-scale short posts. It is a challenging task as posts are short in form and informal in expression and user interactions in social media are complex and diverse. Previous studies only consider the textual content of posts or simultaneously model social contexts in homogeneous situations, ignoring the heterogeneity of social networks. However, different types of user interactions, such as forwarding and commenting, could suggest different behavior patterns and interest preferences and reflect different attention to the topic and understanding of the topic. In addition, different users have different influences on the development and evolution of the same topic. Specifically, compared with ordinary users, the leading authoritative users in a community play a more important role in topic inference. For the above reasons, this study proposes a novel multi-view topic model (MVTM) to infer more complete and coherent topics by encoding heterogeneous social contexts in the microblog conversation network. For this purpose, an attributed multiplex heterogeneous conversation network is built according to the interaction relationships among users and decomposed into multiple views with different interaction semantics. Then, the embedded representation of specific views is obtained by leveraging neighbor-level and interaction-level attention mechanisms, with due consideration given to different types of interactions and the importance of different users. Finally, a multi-view neural variational inference method is designed to capture the deep correlations among different views and adaptively balance their consistency and independence, thereby obtaining more coherent topics. Experiments are conducted on a Sina Weibo dataset covering three months, and the results reveal the effectiveness of the proposed method.

    • Multiple-choice Reading Comprehension Approach Based on Multi-view Graph Encoding

      2023, 34(11):5179-5190. DOI: 10.13328/j.cnki.jos.006730

      Abstract:Multiple-choice reading comprehension typically adopts the two-stage pipeline framework of evidence extraction and answer prediction, and the effect of answer prediction highly depends on evidence sentence extraction. Traditional evidence extraction methods mostly rely on phrase matching or supervise evidence extraction with noise labels. The resultant unsatisfactory accuracy significantly reduces the performance of answer prediction. To address the above problem, this study proposes a multiple-choice reading comprehension method based on multi-view graph encoding in a joint learning framework. The correlations among the sentences in the text and those of such sentences with question sentences are fully explored from multiple views to effectively model evidence sentences and their relationships. Moreover, evidence extraction and answer prediction tasks are jointly trained so that the strong correlations of the evidence with the answers can be exploited for joint learning, thereby improving the performance of evidence extraction and answer prediction. Specifically, this method encodes texts, questions, and candidate answers jointly with the multi-view graph encoding module. The relationships among the texts, questions, and candidate answers are captured from the three views of statistical characteristics, relative distance, and deep semantics, thereby obtaining question-answer-aware text encoding features. Then, a joint learning module combining evidence extraction with answer prediction is built to strengthen the relationships of evidence with answers through joint training. The evidence extraction submodule is designed to select evidence sentences and fuse the results with text encoding features selectively. The fusion results are then used by the answer prediction submodule to complete the answer prediction. Experimental results on the multiple-choice reading comprehension datasets ReCO and RACE demonstrate that the proposed method attains a higher ability to select evidence sentences from texts and ultimately achieves higher accuracy of answer prediction. In addition, joint learning combining evidence extraction with answer prediction significantly alleviates the error accumulation problem induced by the traditional pipeline framework.

    • Twin-delayed-based Deep Deterministic Policy Gradient Method Integrating Gravitational Search

      2023, 34(11):5191-5204. DOI: 10.13328/j.cnki.jos.006740

      Abstract:In recent years, deep reinforcement learning has achieved impressive results in complex control tasks. However, its applicability to real-world problems has been seriously weakened by the high sensitivity of hyperparameters and the difficulty in guaranteeing convergence. Metaheuristic algorithms, as a class of black-box optimization methods simulating the objective laws of nature, can effectively avoid hyperparameter sensitivity. Nevertheless, they still face various problems, such as the inability to adapt to a huge scale of parameters to be optimized and low sample efficiency. To address the above problems, this study proposes the twin delayed deep deterministic policy gradient based on a gravitational search algorithm (GSA-TD3). The method combines the advantages of the two types of algorithms. Specifically, it updates the policy by gradient optimization for higher sample efficiency and a faster learning speed. Moreover, it applies the population update method based on the law of gravity to the policy search process to make it more exploratory and stable. GSA-TD3 is further applied to a series of complex control tasks, and experiments show that it significantly outperforms similar state-of-the-art deep reinforcement learning methods.

    • OLAP Optimization Techniques Based on GPU Database

      2023, 34(11):5205-5229. DOI: 10.13328/j.cnki.jos.006739

      Abstract:Graphics processing unit (GPU) databases have attracted a lot of attention from the academic and industrial communities in recent years. Although quite a few prototype systems and commercial systems (including open-source systems) have been developed as next-generation database systems, whether GPU-based online analytical processing (OLAP) engines really outperform central processing unit (CPU)-based systems is still in doubt. If they do, more in-depth research should be conducted on what kind of workload/data/query processing models are more appropriate. GPU-based OLAP engines follow two major technical roadmaps: the GPU in-memory processing mode and the GPU-accelerated mode. The former stores all the datasets in the GPU device memory to take the best advantage of GPU's computing power and high-bandwidth memory. Its drawbacks are that the limited capacity of the GPU device memory restricts the dataset size and that memory-resident data under the sparse access mode reduces the storage efficiency of the GPU device memory. The latter only stores some datasets in the GPU device memory and accelerates computation-intensive workloads by GPU to support large datasets. The key challenges are how to choose the optimal data distribution and workload distribution models for the GPU device memory to minimize peripheral component interconnect express (PCIe) transfer overhead and maximize GPU's computation efficiency. This study focuses on how to integrate these two technical roadmaps into an accelerated OLAP engine and proposes OLAP Accelerator, a customized OLAP framework for hybrid CPU-GPU platforms. In addition, this study designs three calculation models for OLAP, namely, the CPU in-memory calculation model, the GPU in-memory calculation model, and the GPU-accelerated model, and proposes a vectorized query processing technique for the GPU platform to optimize device memory utilization and query performance. Furthermore, the different technical roadmaps of GPU databases and the corresponding performance characteristics are explored. The experimental results show that the vectorized query processing model based on GPU in-memory processing achieves the best performance and memory efficiency, running 3.1 and 4.2 times faster than the systems OmniSciDB and Hyper, respectively. The partition-based GPU-accelerated mode only accelerates the join workloads to balance the workloads between the CPU and GPU ends and can support larger datasets than the GPU in-memory mode can.
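
      The vectorized query processing mentioned above operates on column batches instead of row-at-a-time tuples. Below is a tiny NumPy illustration of that style, assuming a two-column batch and a grouped SUM aggregate; it is an illustration of the concept, not OLAP Accelerator's implementation.

        import numpy as np

        # One column batch: SELECT group, SUM(price * qty) WHERE price > 15
        price = np.array([10.0, 20.0, 30.0, 40.0])
        qty = np.array([1, 2, 3, 4])
        group = np.array([0, 1, 0, 1])

        mask = price > 15.0                 # vectorized predicate
        rev = price[mask] * qty[mask]       # vectorized projection
        out = np.zeros(2)
        np.add.at(out, group[mask], rev)    # grouped aggregation
        print(out)                          # [ 90. 200.]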

    • Higher-order Hierarchical Embedding Learning and Recommendation Prediction in HIN

      2023, 34(11):5230-5248. DOI: 10.13328/j.cnki.jos.006681

      Abstract:A heterogeneous information network is a representation of heterogeneous data. How to integrate the complex semantic information of heterogeneous data is one of the challenges faced by recommendation systems. This study constructs a higher-order embedding learning framework for heterogeneous information networks, UI-HEHo, based on weak ties featured by semantic information and information transmission abilities. The framework includes three modules: initial information embedding, high-order information embedding aggregation, and recommendation prediction. The initial information embedding module first adopts the best trust path selection algorithm to avoid the information loss caused by sampling a fixed number of neighbors in a full-relational heterogeneous information network. Then, newly defined importance measure factors of multi-task shared characteristics based on multi-head attention are adopted to filter out the semantic information of each node, and combined with the interactive structure, the network nodes are effectively characterized. The high-order information embedding aggregation module realizes high-order information expression by integrating weak ties and the good knowledge representation ability of network embedding. The hierarchical propagation mechanism of heterogeneous information networks is utilized to aggregate the characteristics of sampled nodes into the nodes to be predicted. The recommendation prediction module employs the influence recommendation method of high-order information to complete the recommendation. The framework is characterized by rich embedded nodes, fusion of shared attributes, and implicit interactive information. Finally, experiments have verified that UI-HEHo can effectively improve the accuracy of rating prediction, as well as the pertinence, novelty, and diversity of recommendation generation. Especially in application scenarios with sparse data, UI-HEHo yields good recommendation effects.

    • CCP-aware Event Planning on Event-based Social Networks

      2023, 34(11):5249-5266. DOI: 10.13328/j.cnki.jos.006731

      Abstract:Event planning on event-based social networks (EBSNs) has been attracting research efforts for decades. The key insight of the event planning problem is to assign a group of users to a set of events such that a pre-defined objective function is maximized, subject to a set of constraints. In real applications, important factors such as event conflicts, event capacities, user capacities, social preferences between users, and event preferences are necessary to consider in event planning. This work summarizes these important factors as conflict, capacity, and preference, denoted by CCP for short. Existing works do not consider CCP when computing event plans. Hence, this study proposes the CCP-based event planning problem on event-based social networks so that more reasonable event plans can be obtained. Since the CCP-based event planning problem is proposed for the first time, none of the existing methods can be directly applied to solve it. Compared with previous event planning problems that only consider part of the CCP factors, the challenges of solving the CCP-based event planning problem include the complexity of the new problem and the larger number of constraints involved. Hence, this study proposes several algorithms to solve the CCP-based event planning problem in efficient and effective ways. Extensive experiments are conducted, and the results prove the effectiveness and efficiency of the proposed algorithms.
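
      To make the problem shape concrete, the following Python sketch shows a naive greedy baseline that assigns users to events by descending preference while respecting event capacities, user capacities, and event conflicts. It is a hypothetical illustration of the CCP constraints, not one of the paper's algorithms.

        def greedy_plan(users, events, pref, event_cap, user_cap, conflicts):
            # pref[u][e]: preference score; conflicts: set of frozenset event
            # pairs that cannot be assigned to the same user.
            pairs = sorted(((pref[u][e], u, e) for u in users for e in events),
                           reverse=True)
            load = {e: 0 for e in events}
            plan = {u: set() for u in users}
            for score, u, e in pairs:
                if load[e] >= event_cap[e] or len(plan[u]) >= user_cap[u]:
                    continue                      # capacity constraints
                if any(frozenset((e, x)) in conflicts for x in plan[u]):
                    continue                      # conflict constraint
                plan[u].add(e)
                load[e] += 1
            return plan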

    • Improvement of Time Series Segmentation Technology Based on Matrix Profile

      2023, 34(11):5267-5281. DOI: 10.13328/j.cnki.jos.006734

      Abstract:Time series segmentation is an important research direction in the field of data mining. At present, the time series segmentation technique based on the matrix profile (MP) has received increasing attention from researchers and has achieved great research results. However, this technique and its derivative algorithms also have their own shortcomings. For one thing, when the fast low-cost semantic segmentation algorithm based on MP is employed for the time series segmentation of a given activity state and the nearest neighbors are connected by arcs, similar subsequences may be mismatched when arcs cross non-target activity states. For another, the existing segmentation point extraction algorithm uses a window of a given length when extracting segmentation points. In this case, the segmentation points obtained are highly likely to exhibit large deviations from the real values, which reduces the accuracy. To address the above problems, this study proposes a time series segmentation algorithm that limits arc crossing, namely limit arc curve cross-FLOSS (LAC-FLOSS). This algorithm adds weights to arcs to obtain weighted arcs and solves the subsequence mismatch problem caused by arcs crossing states by setting a matching distance threshold. In addition, an improved segmentation point extraction algorithm, namely the improved extract regimes (IER) algorithm, is proposed. This algorithm extracts the extremes from the troughs according to the shape properties of the sequence of corrected arc crossings (CAC), thereby avoiding the problem that segmentation points are obtained at non-inflection points when windows are used directly. Comparative experiments are conducted on the public datasets datasets_seg and MobiAct, and the results verify the feasibility and effectiveness of the two solutions.
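
      For background, the corrected arc crossings (CAC) sequence mentioned above counts, at each candidate split point, how many nearest-neighbor arcs cross it and normalizes the count by what an idealized random arc layout would produce; segmentation points are then sought at the valleys. Below is a minimal Python sketch of the CAC computation, without LAC-FLOSS's arc weighting or the IER extraction step.

        def corrected_arc_crossings(nn_index):
            # nn_index[i]: position of subsequence i's nearest neighbor (one
            # arc per subsequence). Returns the CAC: observed crossings per
            # split point, normalized by the expectation under random arcs,
            # clipped to [0, 1].
            n = len(nn_index)
            cross = [0] * n
            for i, j in enumerate(nn_index):
                lo, hi = (i, j) if i < j else (j, i)
                for k in range(lo, hi):   # arc (i, j) crosses splits lo..hi-1
                    cross[k] += 1
            cac = []
            for k in range(n):
                expected = 2.0 * (k + 1) * (n - k - 1) / n
                cac.append(min(cross[k] / expected, 1.0) if expected else 1.0)
            return cac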

    • Measurement Method for Complexity of Software Library Dependency Graph and Its Potential Applications

      2023, 34(11):5282-5311. DOI: 10.13328/j.cnki.jos.006746

      Abstract:In the process of software development, software libraries are widely used as they can reduce development time and costs. Consequently, modern software projects contain code from different sources, which makes the systems highly complex and diversified. In addition, various risks come along with the usage of software libraries, such as low quality or security vulnerabilities, seriously affecting the quality of software projects. By analyzing the intensity of the coupling with software libraries, this study quantifies the complexity and diversity introduced into the client code by its dependence on software libraries. For this purpose, a software boundary graph (SBG) model is constructed according to the method invocation relationships of the client code with the software libraries to distinguish their code boundaries. Then, a metric suite RMS for the complexity of the software library dependency graph is proposed on the basis of the SBG model to quantify the intensity of the coupling with software from different sources. In the experiment, this study mines the data on all the historical versions of 10 popular software projects in the Apache open-source community and finally collects 7857 dependency defects among real-world projects. With the above-mentioned real-world data, an empirical investigation based on hypothesis testing is conducted according to the proposed complexity metric suite RMS to discuss the following issues: H1: whether boundary nodes with higher risk factors are more likely to introduce more inter-project dependency defects; H2: whether boundary nodes with higher risk factors are more likely to introduce serious inter-project dependency defects; H3: to what extent the value of the metric suite RMS affects the number and severity of the introduced inter-project dependency defects. Experimental results show that according to the evaluation with RMS, the boundary nodes exhibiting higher coupling degrees with the software libraries are more likely to introduce more inter-project dependency defects of higher severity. Moreover, compared with traditional complexity metrics, RMS greatly influences the number and severity of the introduced inter-project dependency defects.

    • SMT: Efficient Authenticated Data Structure for Streaming Data on Blockchain

      2023, 34(11):5312-5329. DOI: 10.13328/j.cnki.jos.006748

      Abstract:The authenticated data structure (ADS) solves the problem of untrusted servers in outsourced data storage scenarios as users can verify the correctness and integrity of the query results returned by untrusted servers through the ADS. Nevertheless, the security of data owners is difficult to guarantee, and attackers can tamper with the ADS stored by data owners to impede the integrity and correctness verification of query results. Data owners can store the ADS on the blockchain to solve the above problem by leveraging the immutable nature of the blockchain. However, the existing ADS implementation schemes have high maintenance costs on the blockchain and most of them only support the verifiable query of static data. At present, an efficient ADS tailored to the blockchain is still to be designed. By analyzing the gas consumption mechanism of smart contracts and the gas consumption of the ADS based on the traditional Merkle hash tree (MHT), this study proposes SMT, a new ADS, which achieves efficient and verifiable query of streaming data and has a lower gas consumption on the blockchain. Finally, the study verifies the efficiency of SMT both theoretically and experimentally and proves the security of SMT through security analysis.
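
      For reference, the gas-cost analysis above concerns maintaining a Merkle hash tree on-chain, where updating or appending one item forces O(log n) node rehashes. Below is a minimal Python sketch of the classic MHT root computation that the paper's analysis starts from; it is an illustration only, and SMT's structure is not reproduced here.

        import hashlib

        def H(x: bytes) -> bytes:
            return hashlib.sha256(x).digest()

        def merkle_root(leaves):
            # Classic MHT root; the last node is duplicated on odd-sized
            # levels. Each append to a stream rehashes a root-to-leaf path,
            # which is the on-chain cost being analyzed.
            level = [H(x) for x in leaves]
            while len(level) > 1:
                if len(level) % 2:
                    level.append(level[-1])
                level = [H(level[i] + level[i + 1])
                         for i in range(0, len(level), 2)]
            return level[0]

        print(merkle_root([b"tx1", b"tx2", b"tx3"]).hex())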

    • Review Articles
    • Survey on FPGA-based High-performance Programmable Data Plane

      2023, 34(11):5330-5354. DOI: 10.13328/j.cnki.jos.006669

      Abstract:The programmable data plane (PDP), allowing the offloading and acceleration of network applications, creates revolutionary development opportunities for such applications. It also promotes the innovation and evolution of the network by supporting the rapid implementation and deployment of new protocols and services, and it has thus been a research hotspot in the field of networking in recent years. With its general computing architecture, rich on-chip resources, and extended interfaces, the field-programmable gate array (FPGA) provides a variety of implementations of the PDP for a wide range of application scenarios. It also offers the possibility to explore more general PDP abstractions. Therefore, the FPGA-based PDP (F-PDP) has been widely studied by the academic and industrial communities. In this study, F-PDP abstractions are described by category. Then, the research progress of key technologies for building network applications with the F-PDP is outlined, and programmable network devices based on the F-PDP are presented. After that, the application research based on the F-PDP in recent years is reviewed in detail from three aspects: improving network performance, building network measurement frameworks, and deploying network security applications. Finally, possible future research trends of the F-PDP are discussed.

    • Performance Prediction for WiFi CSI Localization System Based on Phased Array

      2023, 34(11):5355-5375. DOI: 10.13328/j.cnki.jos.006733

      Abstract:WiFi is one of the most important communication modes at present, and indoor localization systems based on WiFi signals are among the most promising for widespread deployment and application in daily life. The latest research shows that such a system can achieve submeter-level localization accuracy when it utilizes the channel state information (CSI) obtained during WiFi communication for target localization. However, the accuracy of localization in experimental scenarios depends on many factors, such as the locations of the test points, the layout of the WiFi devices, and that of the antennas. Moreover, deployed WiFi localization systems often fail to provide the desired accuracy since performance prediction methods for WiFi CSI localization are still unavailable. For the above reasons, this study develops a performance prediction model for WiFi CSI localization that applies to diverse scenarios. Specifically, the study defines the error infinitesimal function between a pair of antennas on the basis of the basic physical CSI localization model. The error infinitesimal matrix and the corresponding heat map of localization performance are generated by analyzing the localization space. Then, multi-antenna fusion and multi-device fusion methods are adopted to extend the antenna pairs, thereby constructing a general performance prediction model for CSI localization. Finally, the study proposes integrating the above-mentioned heat map with scenario maps to give due consideration to actual scenario maps and ultimately provide a customized performance prediction solution for a given scenario. In addition to the theoretical analysis, this study verifies the effectiveness of the proposed performance prediction model with experimental data in two scenarios. The experimental results show that the actual localization accuracy is consistent with the proposed theoretical model in variation trend, and the model improves the localization accuracy by 32%–37%.

    • Measurement-device-independent Quantum Voting Scheme with Identity Authentication

      2023, 34(11):5376-5391. DOI: 10.13328/j.cnki.jos.006738

      Abstract:This study proposes a measurement-device-independent (MDI) quantum secure direct communication (QSDC) protocol with an identity authentication server to solve the problems concerning identity authentication and protocol feasibility in quantum communication and further puts forward a quantum voting scheme on the basis of the proposed MDI-QSDC protocol. This scheme takes advantage of various technologies, such as MDI quantum key distribution, perfect quantum encryption, and the classical one-time pad. In this way, it not only ensures its unconditional security in theory but also avoids attacks by outside attackers exploiting the vulnerabilities of the measurement equipment in practice. Furthermore, this scheme takes weak coherent pulses in BB84 states as quantum resources and only performs single-particle operations and Bell-state measurements. As a result, the scheme is highly feasible with present technologies. In addition, it extends the identity authentication function and enables the scrutineer to verify the integrity and correctness of voting information by adopting bit commitment. Simulation results and analysis show that the proposed scheme is correct and has unconditional security in theory, i.e., information-theoretic security. Compared with the existing quantum voting schemes, the proposed scheme is more feasible.

    • Verifiable Dynamic Searchable Symmetric Encryption Based on Blockchain

      2023, 34(11):5392-5407. DOI: 10.13328/j.cnki.jos.006685

      Abstract:Symmetric searchable encryption (SSE) can retrieve encrypted data without disclosing user privacy and has been widely studied and applied in cloud storage. However, in SSE schemes, semi-honest or dishonest servers may tamper with the data in files and return untrusted files to users, so it is necessary to verify these files. Most existing verifiable SSE schemes are verified by the users locally, and malicious users may forge verification results, which cannot ensure the fairness of verification. To this end, this study proposes a verifiable dynamic symmetric searchable encryption scheme based on blockchain (VDSSE). VDSSE employs symmetric encryption to achieve forward security during dynamic updates; on this basis, the blockchain is utilized to verify the search results. During the verification, a new verification tag, Vtag, is proposed. The accumulation of Vtags is leveraged to compress the verification information, reduce the storage cost of verification information on the blockchain, and effectively support the dynamic verification of SSE schemes. Finally, experimental evaluation and security analysis are conducted on VDSSE to verify the feasibility and security of the scheme.

    • Secure Computation of Range and Sum of Extremums on Distributed Datasets

      2023, 34(11):5408-5423. DOI: 10.13328/j.cnki.jos.006737

      Abstract:Due to continuous breakthroughs and developments in information and communication technologies, information access has become convenient on the one hand; on the other hand, private information is now easier to leak than before. Combining intelligent applications with secure multiparty computation (SMC) technology is expected to solve privacy protection problems. Although SMC has solved many different privacy protection problems so far, numerous problems remain to be settled. Research results on the SMC of the range and the sum of extremums are currently seldom reported. As common statistical tools, the range and the sum of extremums have been widely used in practice. Therefore, the secure computation of the range and the sum of extremums is of great research significance. This study proposes a new encoding method and solves two types of SMC problems with it: one is the secure computation of the range, and the other is that of the sum of extremums. The new encoding method is combined with the lifted ElGamal threshold cryptosystem to design a secure range computation protocol for distributed private datasets in the scenario in which multiple parties participate and each party holds one data item. Then, the new encoding method is slightly modified for the secure computation of the sum of extremums in the same scenario. On this basis, the study further modifies the new encoding method and combines it with the Paillier cryptosystem to design a protocol for the secure computation of the range and the sum of extremums on distributed private datasets in the scenario in which two parties participate and each party holds more than one data item. Furthermore, this study proves that the proposed protocols are secure in the semi-honest model with the simulation paradigm. Finally, the complexities of these protocols are tested by simulation experiments. The results of the efficiency analysis and experiments show that the simple and efficient proposed protocols can be widely used in practical applications and are important tools for solving many other SMC problems.

    • Multi-party Threshold Private Set Intersection Protocol Based on Robust Secret Sharing

      2023, 34(11):5424-5441. DOI: 10.13328/j.cnki.jos.006743

      Abstract:A (t, n) threshold private set intersection protocol allows the N participants, each holding a private set of size n, to learn the intersection of their sets only if the number of the elements in their intersection is larger than or equal to the threshold t, without revealing their private information. It has a wide range of applications, such as fingerprint recognition, online carpooling, and blind date websites. However, most of the existing research on threshold protocols for private set intersection focuses on two parties, and the research on threshold protocols for multi-party private set intersection still faces many challenges. For example, the existing threshold protocols for multi-party private set intersection use public key algorithms with large overheads, such as the fully homomorphic encryption algorithm, and such algorithms have not yet been effectively implemented. To solve the above problems, this study combines robust secret sharing and Bloom filters to propose two effective multi-party threshold private set intersection protocols and implements the protocols by simulation for the first time. Specifically, this study designs a new construction method for Bloom filters. The shares generated by robust secret sharing correspond to the elements in the sets of the participants. Whether the number of the elements in the intersection of the parties reaches the threshold is determined by whether correct secrets can be reconstructed from the secret sub-shares obtained by querying the Bloom filter. In this way, the protocols effectively prevent the leakage of the intersection cardinality. The first protocol designed in this study avoids public key algorithms with large overheads. When the security parameter λ is set to 128, the set size n is set to 2^14, and the threshold is set to 0.8n, the time cost of the online phase of the protocol in the three-party scenario is 191 s. In addition, to resist the collusion of at most N – 1 adversaries in the semi-honest model, this study combines oblivious transfer with the first protocol to design a variant of the first protocol. Under the same condition, the time cost of the online phase is 194 s. Finally, the security proof conducted proves that the proposed protocols are secure in the semi-honest model.
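
      The construction described above can be pictured as storing secret shares at Bloom-filter positions: querying the filter with one's own elements recovers candidate sub-shares, and the secret reconstructs only if the common elements yield at least t consistent shares. The Python toy below sketches this share-in-filter idea with Shamir sharing over a prime field; it ignores hash collisions, omits Lagrange reconstruction and the robustness machinery, and is not the paper's exact construction.

        import hashlib, random

        P = 2**61 - 1                          # prime field for the shares

        def positions(elem, m, k=3):           # the k Bloom-filter positions
            return [int(hashlib.sha256(f"{i}|{elem}".encode()).hexdigest(), 16) % m
                    for i in range(k)]

        def shamir_shares(secret, t, n):       # n shares, any t reconstruct
            coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
            return [(x, sum(c * pow(x, e, P) for e, c in enumerate(coeffs)) % P)
                    for x in range(1, n + 1)]

        def build_table(elems, secret, t, m=1024):
            # One share per set element, written at that element's positions
            # (hash collisions are ignored in this toy).
            table = {}
            for elem, share in zip(elems, shamir_shares(secret, t, len(elems))):
                for pos in positions(elem, m):
                    table[pos] = share
            return table

        def query(table, elem, m=1024):
            # An element yields a share only if all its positions agree.
            got = [table.get(p) for p in positions(elem, m)]
            return got[0] if None not in got and len(set(got)) == 1 else None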

    • Two Cloud-assisted Over-threshold Multi-party Private Set Intersection Calculation Protocol

      2023, 34(11):5442-5456. DOI: 10.13328/j.cnki.jos.006747

      Abstract:The over-threshold multi-party private set intersection (OT-MP-PSI) protocol is a variant of the conventional PSI protocol. This protocol allows m participants to jointly compute the OT intersection for which at least t (t ≤ m) participants have the common element and ensures that only the participant with the OT element knows whether the element belongs to the OT intersection and nothing else. The OT-MP-PSI protocol extends the practical application scenarios of the PSI protocol. As the existing schemes are all constructed on the basis of expensive public-key cryptography, their heavy computational burden results in long runtime. This study designs a novel cryptographic component, the oblivious programmable pseudo-random secret-sharing (OPPR-SS) component, based on symmetric cryptography. Furthermore, a two-cloud-assisted OT-MP-PSI protocol is designed on the basis of the OPPR-SS component, and it assigns the tasks of secret sharing and reconstruction to untrusted cloud servers, respectively, so that they can assist in completing those tasks. As a result, participants with weak computation capability can also complete the OT-MP-PSI protocol. Furthermore, the study proves that the proposed protocol is secure in the semi-honest model. Compared with the existing OT-MP-PSI protocols, the proposed protocol achieves the optimal runtime and communication overhead at both the secret sharing stage and the secret reconstruction stage. The communication complexities of the participants, the secret-sharing cloud, and the reconstruction cloud are no longer related to the threshold t. The number of communication rounds for the participants is constant, and the communication complexity is merely O(n). The computational complexities of the secret-sharing cloud and the secret reconstruction cloud are only related to the number of symmetric cryptographic operations.
