CHEN Wei-Yan, ZHANG Fu-Sang, LIU Jun-Jie, BAO Peng, ZHANG Da-Qing
2023, 34(12):5457-5476. DOI: 10.13328/j.cnki.jos.006760
Abstract: In recent years, the localization and tracking of moving targets have been widely used in scenarios such as indoor navigation, smart homes, security monitoring, and smart medical services. Radio frequency (RF)-based contactless localization and tracking have attracted extensive attention from researchers. Among these technologies, commercial IR-UWB-based solutions can achieve target localization and tracking at low cost and power consumption and have strong development potential. However, most existing studies suffer from the following problems: 1) limited tracking scenes, as modeling and processing methods target only outdoor or relatively empty indoor scenes under ideal conditions; 2) limited movement states of targets and unduly idealized modeling; 3) low tracking accuracy caused by fake moving targets. To solve these problems, this study proposes a moving target tracking method using IR-UWB on the basis of understanding the composition of the received signal spectrum in multipath scenes. First, the dynamic components of the originally received signal spectrum are extracted. Then, a Gaussian blur-based multipath elimination and distance extraction algorithm is employed to eliminate multipath interference; it retains only the primary reflection information directly related to the moving target and therefore accurately obtains the distance variation curve of the target. Subsequently, a multi-view fusion algorithm is proposed to fuse the distance information observed by devices from different views, achieving accurate localization and tracking of a single freely moving target. In addition, a real-time moving target tracking system based on low-cost commercial IR-UWB radar is built. Experimental results in a real indoor home scene show that the error between the center position of the human body estimated by the system and the real motion trajectory is always within 20 cm. Moreover, the system remains robust even when influencing factors such as the experimental environment, experimenter, activity speed, and equipment height are altered.
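The clutter-removal and distance-extraction steps described above can be sketched in a few lines. This is an illustrative NumPy toy, not the authors' algorithm: static clutter is removed by subtracting the slow-time mean, a small 1-D Gaussian kernel stands in for the Gaussian blur step, and the strongest remaining range bin per frame is taken as the target distance. All names and parameters are assumptions.

```python
import numpy as np

def extract_distance_curve(frames, range_bins, kernel_sigma=1.5):
    """Toy range extraction for IR-UWB slow-time frames.

    frames: (n_frames, n_bins) magnitude spectrum.  Static clutter is
    removed by subtracting the slow-time mean, the residual is smoothed
    with a small Gaussian kernel (a stand-in for the Gaussian blur step),
    and the strongest remaining bin per frame is taken as the distance.
    """
    dynamic = frames - frames.mean(axis=0, keepdims=True)  # clutter removal
    # build a normalized 1-D Gaussian kernel
    radius = int(3 * kernel_sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * kernel_sigma**2))
    kernel /= kernel.sum()
    smoothed = np.array([np.convolve(np.abs(f), kernel, mode="same")
                         for f in dynamic])
    return range_bins[np.argmax(smoothed, axis=1)]
```

On synthetic frames with a static reflector and one moving peak, the recovered curve follows the peak's range bin frame by frame.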
BI Hong-Liang, CHEN Yan-Jiao, YI Xin-Jing, WANG Xu
2023, 34(12):5477-5500. DOI: 10.13328/j.cnki.jos.006798
Abstract: In recent years, with the rapid development of blockchain, the types of cryptocurrencies and anonymous transactions have become increasingly diversified. How to make optimal decisions on transaction types in the cryptocurrency market is a major concern of users. The users' decision-making goal is to minimize transaction costs and maximize privacy while ensuring that transactions are packaged. The cryptocurrency trading market is complex, and cryptocurrency technologies differ greatly from each other. Existing studies focus on the Bitcoin market, and few of them discuss other anonymous currency markets such as Zcash or users' anonymity demands. Therefore, this study proposes a game-based general cryptocurrency trading market model and explores the trading market and users' decisions on transaction types and costs by combining the anonymity needs of users with game theory. Taking Zcash, the most representative cryptocurrency with optional privacy, as an example, it analyzes the trading market in combination with CoinJoin transactions, simulates the trading process in which users and miners seek their optimal strategies, and discusses the impact of block size, discount factors, and the number of users on the trading market and user behavior. Additionally, the model is simulated in a variety of market types for an in-depth discussion of the experimental results. Taking a three-type trading market as an example, in the context of vicious fee competition, when plnum = 75, θ = 0.4, st = 100, and sz = 400, all users tend to choose CoinJoin in the early transaction stage (first 500 rounds). In the middle and late stages of the market (rounds 1500–2000), 97% of users with a privacy sensitivity below 0.7 tend to choose CoinJoin, while 73% of users with a privacy sensitivity above 0.7 tend to choose shielded transactions. CoinJoin transactions and block sizes above 400 can alleviate the vicious competition of transaction fees to some extent. The proposed model can help researchers understand the games in different cryptocurrency trading markets, analyze user trading behavior, and reveal market operation rules.
LIU Zhong-Xin, TANG Zhi-Jie, XIA Xin, LI Shan-Ping
2023, 34(12):5501-5526. DOI: 10.13328/j.cnki.jos.006749
Abstract: Code change is a key behavior in software evolution, and its quality has a large impact on software quality. Modeling and representing code changes is the basis of many software engineering tasks, such as just-in-time defect prediction and recovery of software product traceability. Representation learning technologies for code changes have attracted extensive attention and have been applied to diverse applications in recent years. This type of technology aims to learn to represent the semantic information in code changes as low-dimensional dense real-valued vectors, namely, to learn the distributed representation of code changes. Compared with conventional methods of manually designing code change features, such technologies offer the advantages of automatic learning, end-to-end training, and accurate representation. However, this field still faces some challenges, such as great difficulties in utilizing structural information and the absence of benchmark datasets. This study surveys and summarizes the recent progress of studies and applications of representation learning technologies for code changes, and it mainly consists of the following four parts. (1) It presents the general framework of representation learning of code changes and its applications. (2) Subsequently, it reviews the currently available representation learning technologies for code changes and summarizes their respective advantages and disadvantages. (3) Then, the downstream applications of such technologies are summarized and classified. (4) Finally, this study discusses the challenges and potential opportunities ahead of representation learning technologies for code changes and suggests directions for the future development of this type of technology.
WU Yi-Wen, ZHANG Yang, WANG Tao, WANG Huai-Min
2023, 34(12):5527-5551. DOI: 10.13328/j.cnki.jos.006765
Abstract: In recent years, software construction, operation, and evolution have encountered many new requirements, such as efficient switching or configuration in development and testing environments, application isolation, resource consumption reduction, and higher efficiency of testing and deployment. These requirements pose great challenges to developers in developing and maintaining software. Container technology has the potential to release developers from the heavy workload of development and maintenance. Of particular note, Docker, as the de facto industrial standard for containers, has recently become a popular research area in the academic community. To help researchers understand the status and trends of research on Docker containers, this study conducts a systematic literature review of 75 high-quality papers in this field. First, quantitative methods are used to investigate the basic status of research on Docker containers, including research quantity, research quality, research topics, and research methods. Second, the first classification framework for research on Docker containers is presented, and the current studies are systematically classified and reviewed from the dimensions of core, platform, and support. Finally, the development trends of Docker container technology are discussed, and seven future research directions are summarized.
CHEN Xiao, XIAO Fu, SHA Le-Tian, WANG Zhong, DI Wei-He
2023, 34(12):5552-5577. DOI: 10.13328/j.cnki.jos.006763
Abstract: The emergence of the dynamic link library (DLL) provides great convenience for developers and improves the interaction between the operating system (OS) and applications. However, the potential security problems of DLLs cannot be ignored. Determining how to mine DLL-hijacking vulnerabilities during the running of Windows installers is important for ensuring the security of Windows OS. In this paper, the attribute features of numerous installers are collected and extracted, and a double-layer bi-directional long short-term memory (BiLSTM) neural network is applied for machine learning from the perspectives of the installer, the mode in which the installer invokes the DLL, and the DLL file itself. The multi-dimensional features of the vulnerability dataset are extracted, and unknown DLL-hijacking vulnerabilities are mined. In experiments, DLL-hijacking vulnerabilities can be effectively detected from Windows installers, and 10 unknown vulnerabilities are discovered and assigned CNVD identifiers. In addition, the effectiveness and integrity of this method are further verified by comparison with other vulnerability analyzers.
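The paper's detector is a learned BiLSTM model, but the underlying vulnerability class is easy to illustrate. The sketch below is a hypothetical, simplified model of the Windows DLL search order (real checks are platform-specific and more involved): a load is flagged when the DLL resolves from an attacker-writable directory, or when it is missing entirely (a "phantom" DLL that an attacker can plant).

```python
# Simplified Windows DLL search order (SafeDllSearchMode enabled):
# application dir -> system dir -> windows dir -> current dir -> PATH
SEARCH_ORDER = ["app_dir", "system_dir", "windows_dir", "cwd", "path_dirs"]

def find_dll(dll_name, locations):
    """Return the first location (by search order) containing dll_name.
    locations maps a location key to the set of DLL names present there."""
    for loc in SEARCH_ORDER:
        if dll_name in locations.get(loc, set()):
            return loc
    return None

def hijackable(dll_name, locations, writable=frozenset({"app_dir", "cwd"})):
    """Flag a load as potentially hijackable when the DLL resolves from an
    attacker-writable directory, or is missing entirely (a phantom DLL)."""
    loc = find_dll(dll_name, locations)
    return loc is None or loc in writable
```

For example, a DLL present both in the installer's directory and in the system directory resolves from the installer's directory first, which is the classic hijacking window.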
GAO Yu, WANG Dong, DAI Qian-Wang, DOU Wen-Sheng, WEI Jun
2023, 34(12):5578-5596. DOI: 10.13328/j.cnki.jos.006755
Abstract: The reliability and availability of distributed systems are threatened by crash recovery bugs caused by incorrect crash recovery mechanisms and their implementations. Detecting crash recovery bugs can be extremely challenging, since these bugs manifest themselves only when a node crashes under special timing conditions. This study presents a novel approach, Deminer, to automatically detect crash recovery bugs in distributed systems. Observations in large-scale distributed systems show that node crashes that interrupt the execution of related I/O write operations, which store a piece of data (i.e., common data) in different places, e.g., different storage paths or nodes, are more likely to trigger crash recovery bugs. Therefore, Deminer detects crash recovery bugs by automatically identifying and injecting such error-prone node crashes under the guidance of common data usage. Deminer first tracks the usage of critical data in a correct run. Then, it identifies I/O write operation pairs that use the common data and predicts error-prone injection points of a node crash on the basis of the execution trace. Finally, Deminer tests the predicted injection points of the node crash and checks failure symptoms to expose and confirm crash recovery bugs. A prototype of Deminer is implemented and evaluated on the latest versions of four widely used distributed systems, i.e., ZooKeeper, HBase, YARN, and HDFS. The experimental results show that Deminer is effective in finding crash recovery bugs and has detected six such bugs.
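The core heuristic, identifying pairs of write operations that persist the same (common) data to different places, can be sketched as a simple trace analysis. This is an illustrative reimplementation of the idea, not Deminer's actual instrumentation; the trace format is assumed.

```python
from collections import defaultdict

def find_common_data_pairs(trace):
    """trace: list of (op_id, data, target) write events.

    Returns pairs of operations that persist the same data to different
    targets (storage paths or nodes).  A crash injected between such a
    pair leaves the data only partially persisted, which is the
    error-prone window this heuristic targets."""
    by_data = defaultdict(list)
    for op_id, data, target in trace:
        by_data[data].append((op_id, target))
    pairs = []
    for data, writes in by_data.items():
        for i in range(len(writes)):
            for j in range(i + 1, len(writes)):
                if writes[i][1] != writes[j][1]:  # different places only
                    pairs.append((writes[i][0], writes[j][0], data))
    return pairs
```

Writes of the same data to the same target are skipped, since crashing between them does not create divergent persisted state.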
TIAN Qing, CHU Yi, SUN He-Yang, WU Yi-Xin, CHEN Song-Can
2023, 34(12):5597-5613. DOI: 10.13328/j.cnki.jos.006806
Abstract: By transferring the knowledge of a source domain to a target domain with similar tasks, domain adaptation aims to help the latter learn better. When the label set of the target domain is a subset of the source domain labels, the domain adaptation in this type of scenario is called partial domain adaptation (PDA). Although PDA is more general than standard domain adaptation, it is more challenging and has been studied relatively little, with a particular lack of systematic reviews. To fill this gap, this study conducts a comprehensive review, analysis, and summary of existing PDA methods and provides an overview of and reference for the relevant research community. First, the background, concepts, and application fields of PDA are summarized. Second, according to their modeling characteristics, existing PDA methods are divided into two categories, namely promoting positive transfer and alleviating negative transfer, which are reviewed and analyzed respectively. Then, the commonly used experimental benchmark datasets are categorized and summarized. Finally, the problems in existing PDA studies are analyzed to point out possible future research directions.
CHEN Yue-He, JIA Yong-Hui, TAN Chuan-Yuan, CHEN Wen-Liang, ZHANG Min
2023, 34(12):5614-5628. DOI: 10.13328/j.cnki.jos.006799
Abstract: Several methods have been proposed to address complex questions in knowledge base question answering (KBQA). However, the complex semantic composition of such questions and the possible absence of inference paths lead to poor reasoning performance on them. To this end, this study proposes the CGL-KBQA method based on the global and local features of knowledge graphs. The method employs knowledge embedding techniques to extract the topological structure and semantic features of the knowledge graph as the global features of candidate entity nodes and models complex questions as a composite triple classification task based on the entity representations and question composition. At the same time, the core inference paths generated by the graph during the search process are utilized as local features, which are then combined with the semantic similarity of questions to construct different dimensional features of the candidate entities and finally form a hybrid feature scorer. Since the final inference paths may be missing, this study also designs a cluster module with unsupervised multi-clustering methods to select final answer clusters directly according to the feature representations of candidate entities, thereby making reasoning under incomplete KGs possible. Experimental results show that the proposed method performs well on two common KBQA datasets, especially when the KG is incomplete.
ZHANG Qing-Hua, ZHOU Jing-Peng, DAI Yong-Yang, WANG Guo-Yin
2023, 34(12):5629-5648. DOI: 10.13328/j.cnki.jos.006756
Abstract: Density peaks clustering (DPC) is a density-based clustering algorithm that can intuitively determine the number of clusters, identify clusters of any shape, and automatically detect and exclude abnormal points. However, DPC still has some shortcomings: it considers only the global distribution, so its clustering performance is poor on datasets with large density differences among clusters; in addition, its point allocation strategy is likely to cause a domino effect. Hence, this study proposes a DPC algorithm based on representative points and K-nearest neighbors (KNN), namely RKNN-DPC. First, the KNN density is constructed, and representative points are introduced to describe the global distribution of samples and to propose a new local density. Then, the KNN information of samples is used to propose a weighted KNN allocation strategy to relieve the domino effect. Finally, a comparative experiment is conducted with five clustering algorithms on artificial and real-world datasets. The experimental results show that RKNN-DPC can identify cluster centers more accurately and obtain better clustering results.
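For readers unfamiliar with the baseline, a minimal density peaks clustering with a KNN-based local density can be sketched as follows. This is standard DPC with a KNN density, not the paper's full RKNN-DPC (it omits representative points and the weighted allocation strategy); the density and delta definitions here are common illustrative choices.

```python
import numpy as np

def dpc_knn(X, k=5, n_clusters=2):
    """Density peaks clustering with a KNN local density.

    rho_i  = exp(-mean squared distance to the k nearest neighbors)
    delta_i = distance to the nearest point of higher density
    Centers are the top-n points by rho*delta; every other point joins
    the cluster of its nearest higher-density neighbor."""
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    knn = np.sort(d, axis=1)[:, 1:k + 1]          # exclude self (distance 0)
    rho = np.exp(-(knn**2).mean(axis=1))
    n = len(X)
    delta = np.zeros(n)
    parent = np.full(n, -1)                       # nearest higher-density point
    order = np.argsort(-rho)                      # decreasing density
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = d[i].max()                 # densest point: max distance
            continue
        higher = order[:rank]
        j = higher[np.argmin(d[i, higher])]
        delta[i], parent[i] = d[i, j], j
    centers = np.argsort(-(rho * delta))[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:                               # assign in decreasing density
        if labels[i] == -1:
            j = parent[i]
            labels[i] = labels[j] if j >= 0 else labels[centers[0]]
    return labels, centers
```

Because points are assigned in decreasing density order, each point's higher-density parent is already labeled when the point is reached; this chain assignment is exactly what makes plain DPC prone to the domino effect the paper targets.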
JIANG Xiao-Bo, HE Kun, YAN Guang-Yu
2023, 34(12):5649-5669. DOI: 10.13328/j.cnki.jos.006750
Abstract: Entity recognition is a key task of information extraction. With the development of information extraction technology, researchers have shifted their focus from the recognition of simple entities to that of complex ones. Complex entities usually have no explicit features and are more complicated in syntactic construction and part of speech, which makes their recognition a great challenge. In addition, existing models widely use span-based methods to identify nested entities; as a result, they often suffer from ambiguity in the detection of entity boundaries, which affects recognition performance. In response to these challenges, this study proposes an entity recognition model, GIA-2DPE, based on prior semantic knowledge and type embedding. The model uses keyword sequences of entity categories as prior semantic knowledge to improve the cognition of entities, utilizes type embedding to capture the potential features of different entity types, and then combines prior knowledge with entity-type features through a gated interactive attention mechanism to assist in the recognition of complex entities. Moreover, the model uses 2D probability encoding to predict entity boundaries and combines boundary features and contextual features to enhance boundary detection, thereby improving the performance of nested entity recognition. This study conducts extensive experiments on seven English datasets and two Chinese datasets. The results show that GIA-2DPE outperforms state-of-the-art models and achieves a 10.4% F1 boost over the baseline in entity recognition on the ScienceIE dataset.
DU Xiao-Yu, CHEN Zheng, XIANG Xin-Guang
2023, 34(12):5670-5685. DOI: 10.13328/j.cnki.jos.006754
Abstract: Tag-aware recommendation algorithms use tagged data to enhance recommendation models' understanding of user preferences and item attributes, and they have attracted extensive attention in the field. Most existing methods, however, neglect the diversity of user concerns, item attributes, and tag semantics, which interferes with the correlation inference among the three and thus affects the recommendation results. Therefore, this study introduces the disentangled graph neural network (DGNN) method into the tag-aware recommendation task and proposes a DGNN-based explainable tag-aware recommendation (DETRec) method. It disentangles the perspectives of users, items, and tags to provide explainable recommendation references. Specifically, DETRec utilizes a correlation graph construction module to model user-item-tag correlations. Then, it employs a neighborhood routing mechanism and a message propagation mechanism to disentangle the nodes into attribute sub-graphs and thereby describe the nodal correlations under different attributes. Finally, it generates recommendation references on the basis of these attribute sub-graphs. This study implements two instantiations of DETRec: 1) DETRec based on a single graph (DETRec-S), which describes all correlations of user, item, and tag nodes in a single graph, and 2) DETRec based on multiple graphs (DETRec-M), which utilizes three bipartite graphs to describe the user-item, item-tag, and user-tag correlations separately. Extensive experiments on three public datasets demonstrate that both instantiations are significantly superior to the baseline models and generate references corresponding to the recommendation results. Hence, DETRec is an effective explainable tag-aware recommendation algorithm.
LI Fang-Fang, SU Pu-Zhen, DUAN Jun-Wen, ZHANG Shi-Chao, MAO Xing-Liang
2023, 34(12):5686-5703. DOI: 10.13328/j.cnki.jos.006802
Abstract: Multi-label text classification methods based on deep learning lack multi-granularity learning of text information and the utilization of constraint relations between labels. To solve these problems, this study proposes a multi-label text classification method that enhances multi-granularity information relations. First, the method embeds text and labels in the same space by joint embedding and employs the BERT pre-trained model to obtain implicit vector representations of text and labels. Then, three modules for enhancing multi-granularity information relations are constructed: a document-level information shallow label attention (DISLA) classification module, a word-level information deep label attention (WIDLA) classification module, and a label constraint relation matching auxiliary module. The first two modules carry out multi-granularity learning from shared feature representations: shallow interactive learning between document-level text information and label information, and deep interactive learning between word-level text information and label information. The auxiliary module improves classification performance by learning the relations between labels. Finally, comparison with current mainstream multi-label text classification algorithms on three representative datasets shows that the proposed method achieves the best performance on the main indicators of Micro-F1, Macro-F1, nDCG@k, and P@k.
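The label attention idea shared by the DISLA and WIDLA modules can be sketched in NumPy: each label embedding attends over token representations, pools a label-specific document vector, and scores its own label. This is a generic illustration of label attention, not the paper's exact modules; the shapes and the per-label sigmoid scoring are assumptions.

```python
import numpy as np

def label_attention(H, L):
    """H: (seq_len, d) word representations; L: (n_labels, d) label embeddings.

    Each label attends over the words, pools a label-specific document
    vector, and is scored by the dot product with its own embedding."""
    scores = L @ H.T                          # (n_labels, seq_len) relevance
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    docs = attn @ H                           # (n_labels, d) pooled vectors
    logits = (docs * L).sum(axis=1)           # one score per label
    return 1 / (1 + np.exp(-logits))          # independent sigmoid per label
```

Independent sigmoids (rather than one softmax) are what make this multi-label: several labels can fire at once.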
SONG Guang-Hui, GUO Shao-Zhong, ZHAO Jie, TAO Xiao-Han, LI Fei, XU Jin-Chen
2023, 34(12):5704-5723. DOI: 10.13328/j.cnki.jos.006757
Abstract: Mixed precision has made many advances in deep learning and in precision tuning and optimization. Extensive research shows that mixed precision optimization for stencil computation is challenging. Moreover, the achievements of the polyhedral model in the field of automatic parallelization indicate that the model provides a good mathematical abstraction for loop nests, on the basis of which loop transformations can be performed. This study designs and implements an automatic mixed precision optimizer for stencil computation on the basis of polyhedral compilation technology. By performing iteration-domain partitioning, data flow analysis, and schedule tree transformation on the intermediate representation, it implements, for the first time, source-to-source automatic generation of mixed precision code for stencil computation. The experiments demonstrate that the code after automatic mixed precision optimization can give full play to its parallelism potential and improve program performance by reducing precision redundancy. With high-precision computing as the baseline, the maximum speedup is 1.76 and the geometric mean speedup is 1.15 on the x86 architecture; on the new-generation Sunway architecture, the maximum speedup is 1.64 and the geometric mean speedup is 1.20.
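The effect of partitioning a stencil's iteration domain by precision can be illustrated with a toy 1-D 3-point Jacobi kernel: one part of the domain is computed in float32 and the rest in float64, with halo values exchanged each step. This hand-written NumPy sketch only illustrates the idea; the paper's optimizer derives such partitions automatically from the polyhedral representation.

```python
import numpy as np

def jacobi_mixed(u, steps, split):
    """3-point Jacobi smoothing: u[i] <- (u[i-1] + 2*u[i] + u[i+1]) / 4.

    Cells [0, split) are kept in float32 (a precision-tolerant region),
    cells [split, n) in float64; each step the two partitions are joined
    so the stencil sees consistent halo values at the boundary."""
    lo = u[:split].astype(np.float32)
    hi = u[split:].astype(np.float64)
    for _ in range(steps):
        full = np.concatenate([lo.astype(np.float64), hi])
        new = full.copy()                      # endpoints stay fixed
        new[1:-1] = (full[:-2] + 2 * full[1:-1] + full[2:]) / 4
        lo = new[:split].astype(np.float32)    # round the low partition back
        hi = new[split:]
    return np.concatenate([lo.astype(np.float64), hi])
```

Comparing against an all-float64 run of the same kernel shows the float32 partition introduces only rounding-level error, which is the precision redundancy a mixed precision optimizer exploits.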
MAO Xing-Liang, CHEN Xiao-Hong, NING Ken, LI Fang-Fang, ZHANG Shi-Chao
2023, 34(12):5724-5736. DOI: 10.13328/j.cnki.jos.006903
Abstract: One of the main challenges in judicial artificial intelligence is the identification of key case elements. Existing methods simply treat case element identification as a named entity recognition task, and thus much of the recognized information is irrelevant. In addition, because global and local information in texts is not used effectively, the performance of element boundary recognition is poor. To solve these problems, this study proposes a method for recognizing key case elements by integrating global and local information. Specifically, the BERT model is used as the shared input layer for judicial texts to extract text features. Then, three sub-task networks of judicial case element recognition, judicial text classification (global information), and judicial Chinese word segmentation (local information) are built on the shared layer for joint learning. Finally, the effectiveness of the method is tested on two public datasets. The results show that the F1 value of the proposed method exceeds that of the existing optimal method; the method improves the classification accuracy of element entities and reduces boundary recognition errors.
ZHANG Fei-Fei, GE Ji-Dong, LI Zhong-Jin, HUANG Zi-Feng, ZHANG Sheng, CHEN Xing-Guo, LUO Bin
2023, 34(12):5737-5756. DOI: 10.13328/j.cnki.jos.006797
Abstract: In edge computing scenarios, some tasks are offloaded to edge servers, which can reduce the load of mobile devices, enhance the performance of mobile applications, and lower the cost of mobile devices. For delay-sensitive tasks, it is critical to ensure that they are completed within their deadlines. However, the limited resources of edge servers mean that when data transmission and task processing requests from multiple devices arrive at the same time, some tasks have to wait in a queue before they are scheduled. The long waiting time may then cause timeout failures, which also make it impossible to balance the performance goals of multiple devices. Therefore, this study optimizes the task scheduling order on the edge server on the basis of computation offloading. First, task scheduling is modeled as a long-term optimization problem, and an online learning method based on a combinatorial multi-armed bandit is employed to dynamically adjust the scheduling order of the server. Second, the dynamically changing order of task execution leads to different levels of performance enhancement for task offloading, which influences the validity of offloading decisions. Deep Q-learning with perturbed rewards is adopted to determine the execution sites of tasks and improve the robustness of offloading strategies. Simulation results show that the proposed strategy can balance multiple user objectives and lower the system cost at the same time.
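The online learning of a scheduling order can be illustrated with a plain UCB1 bandit over a small set of candidate orders (the paper uses a combinatorial multi-armed bandit; this single-choice version is a simplified stand-in). The reward would be, e.g., the fraction of tasks that met their deadlines under the chosen order; the arm names and reward model below are hypothetical.

```python
import math
import random

def ucb1(arms, rounds, reward_fn, seed=0):
    """UCB1 over candidate scheduling orders.  Each round picks the arm
    maximizing (mean reward + sqrt(2*ln t / n)) and observes a stochastic
    reward for it; returns how often each arm was played."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    sums = [0.0] * len(arms)
    for t in range(1, rounds + 1):
        if t <= len(arms):
            i = t - 1                     # play every arm once first
        else:
            i = max(range(len(arms)),
                    key=lambda a: sums[a] / counts[a]
                    + math.sqrt(2 * math.log(t) / counts[a]))
        counts[i] += 1
        sums[i] += reward_fn(arms[i], rng)
    return counts
```

With one clearly best order, the pull counts concentrate on it as rounds grow, while the exploration bonus keeps occasionally re-checking the others.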
WANG Ying, CHEN Ke, YU Peng, LI Wen-Jing, QIU Xue-Song, MENG Luo-Ming
2023, 34(12):5757-5772. DOI: 10.13328/j.cnki.jos.006780
Abstract: Core network slicing achieves flexible networking by combining virtualized network functions (VNFs). However, the failure of any VNF due to software or hardware faults will interrupt the slice service. Since network slices share resources, a specific isolation mechanism is required to meet slice robustness demands. Most existing availability guarantee mechanisms focus on random VNF failures, and those involving external attacks rarely consider the special isolation requirements of network slices. To guarantee slice availability under isolation mechanisms, this study proposes a method based on multi-level isolation. First, an availability guarantee model with core network resource awareness is built to meet isolation requirements while consuming the fewest backup resources. Then, an isolation level assessment model is proposed to evaluate the isolation levels of VNFs. Finally, a multi-level isolated backup algorithm (MLIBA) is proposed to solve the availability guarantee problem. In addition, a calculation method based on equivalent backup instances is put forward to address the #P-complete problem of availability calculation under shared backup. Simulation results show that the proposed availability calculation method has high accuracy and that the introduction of multi-level isolation can double the robustness of slices. Comparison with existing studies shows that under the same isolation constraints and availability targets, the proposed method can reduce resource consumption by 20%–70% and increase the proportion of effective resources by 5%–30%.
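The basic reliability identity behind backup provisioning is that a VNF with availability a and k independent (fully isolated) backups fails only when all k+1 replicas fail, and a slice is available only when every VNF in it is. A greedy sketch of choosing the fewest backups to reach a target follows; it ignores the paper's isolation levels and shared backups, which are exactly what make the real problem hard.

```python
def slice_availability(avails, backups):
    """avails[i]: availability of VNF i; backups[i]: number of independent
    backup instances.  A VNF survives unless all its replicas fail, and the
    slice needs every VNF: A = prod_i (1 - (1 - a_i)^(k_i + 1))."""
    A = 1.0
    for a, k in zip(avails, backups):
        A *= 1 - (1 - a) ** (k + 1)
    return A

def min_backups(avails, target):
    """Greedy: repeatedly add one backup to whichever VNF raises slice
    availability the most, until the target is met (illustrative only)."""
    backups = [0] * len(avails)
    while slice_availability(avails, backups) < target:
        gains = []
        for i in range(len(avails)):
            trial = backups.copy()
            trial[i] += 1
            gains.append(slice_availability(avails, trial))
        backups[gains.index(max(gains))] += 1
    return backups
```

The greedy naturally reinforces the weakest VNF first, since the least-available function dominates the product.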
QIN Chuan, DONG Teng-Lin, YAO Heng
2023, 34(12):5773-5786. DOI: 10.13328/j.cnki.jos.006752
Abstract: Most traditional information hiding methods embed secret data by modifying cover data, which inevitably leaves traces of modification, and hence it is difficult for them to resist detection by existing steganalysis algorithms. Consequently, the technique of coverless information hiding has emerged, which hides secret data without modifying the cover data. To improve the hiding capacity and robustness of coverless information hiding, this study proposes a constructive data hiding method based on texture synthesis and recognition with image style transfer. First, natural images and texture images of different categories are used to construct a content image database and a textural style image database, respectively. A mapping dictionary of binary codes is established according to the categories of natural images in the content image database. Second, a labeled texture image database is constructed and input into a convolutional neural network as the training dataset, and a texture image recognition model is obtained by iterative training; in this way, the secret data can be extracted from stego images at the receiving end. During data hiding, natural images are selected from the content image database according to the secret data fragments to be embedded and are synthesized into a stego mosaic image. Then, a texture image is randomly selected from the textural style image database, and the stego texture image is generated from the selected texture image and the stego mosaic image through style transfer, thus achieving secret data hiding. During data extraction, the trained texture image recognition model can accurately identify the original natural-image categories corresponding to stego texture images, and the secret data can finally be extracted by reference to the mapping dictionary. The experimental results demonstrate that the proposed method achieves stego texture images with a satisfactory visual effect and a high hiding capacity and exhibits strong robustness against attacks such as JPEG compression and Gaussian noise.
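The mapping-dictionary step can be sketched on its own: with 2^b image categories, each selected content image carries b secret bits. The category names below are placeholders; texture synthesis, style transfer, and the CNN recognizer are omitted.

```python
def build_dictionary(categories):
    """Assign each image category a fixed-length binary code; with 2^b
    categories, each selected content image carries b secret bits."""
    b = (len(categories) - 1).bit_length()
    assert len(categories) == 1 << b, "need a power-of-two category count"
    return {format(i, f"0{b}b"): c for i, c in enumerate(categories)}

def hide(bits, mapping):
    """Split the secret bit string into b-bit fragments and pick the
    content-image category mapped to each fragment."""
    b = len(next(iter(mapping)))
    assert len(bits) % b == 0
    return [mapping[bits[i:i + b]] for i in range(0, len(bits), b)]

def extract(categories_seen, mapping):
    """Invert the dictionary: recognized categories back to secret bits."""
    inverse = {c: code for code, c in mapping.items()}
    return "".join(inverse[c] for c in categories_seen)
```

In the full scheme the receiver never sees the category labels directly; the CNN recognizer recovers them from the stego texture image before this inverse lookup.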
CHEN Jie, XU Chun-Xiang, ZHANG Yuan, JIANG Chang-Song, HAN Yun-Xia, CAO Chen-Chen
2023, 34(12):5787-5806. DOI: 10.13328/j.cnki.jos.006759
Abstract: Graphical passwords mitigate the burden of memorizing traditional textual passwords and simplify the process of entering passwords, and they have been widely applied to user authentication on mobile devices in recent years. However, existing graphical password authentication schemes face critical threats. First, graphical passwords are vulnerable to shoulder-surfing attacks: users' graphical passwords may be leaked if attackers capture their login information through eyes or cameras. More seriously, these schemes are subject to credential leakage attacks: since the server stores authentication credentials related to users' graphical passwords to verify their identities, attackers who obtain these credentials can perform offline password guessing attacks to retrieve users' graphical passwords. To solve the above problems, this study proposes a secure graphical password authentication scheme, dubbed GADL. GADL embeds random challenge values into users' graphical passwords to resist shoulder-surfing attacks, so attackers cannot obtain users' passwords even if they capture the login information. To address leakage of the server's credential database, GADL utilizes a deterministic threshold blind signature technique to protect users' graphical passwords, in which multiple key servers assist users in credential generation, ensuring that attackers cannot perform offline guessing attacks to obtain any knowledge of users' passwords even if the credentials are leaked. The security analysis given in this study proves that GADL is resistant to the aforementioned attacks. In addition, a comprehensive performance evaluation demonstrates that GADL incurs low computation, storage, and communication overhead and can be easily deployed on mobile devices.
SONG Chan, ZHANG Lei, WU Wen-Ling
2023, 34(12):5807-5821. DOI: 10.13328/j.cnki.jos.006761
Abstract: The SPN structure is currently the most widely used overall structure of block ciphers and is adopted by ciphers such as AES and ARIA. The security analysis of SPN ciphers is a research hotspot in cryptanalysis. Applying subspace trail cryptanalysis to typical two-dimensional and three-dimensional SPN ciphers yields the corresponding subspace trails and, based on them, general properties. These properties are independent of the secret key and of the detailed definitions of the S-box and the MixColumns matrix. They can be described as follows. For a typical two-dimensional SPN cipher whose state can be formalized as an n×m two-dimensional array, among the ciphertexts obtained by five rounds of encryption of all plaintexts belonging to the same coset of a quasi-diagonal subspace, the number of different ciphertext pairs belonging to the same coset of a mixed subspace must be a multiple of 2^(n-1). For a typical three-dimensional SPN cipher whose state can be formalized as an l×n×m three-dimensional array, the number of different ciphertext pairs belonging to the same coset of a mixed subspace among the ciphertexts obtained by seven rounds of encryption of all plaintexts belonging to the same coset of a quasi-diagonal subspace must be a multiple of 2^(nl-1). This study not only proves these properties but also verifies them experimentally on the internal permutations of PHOTON and on small-scale variants of the Rijndael, 3D, and Saturnin algorithms. The experimental results are completely consistent with the properties.
CAO Jian-Jun , NIE Zi-Bo , ZHENG Qi-Bin , Lü Guo-Jun , ZENG Zhi-Xian
2023, 34(12):5822-5847. DOI: 10.13328/j.cnki.jos.006764
Abstract:Entity resolution is widely involved in data tasks such as data quality control, information retrieval, and data integration. Traditional entity resolution methods mainly focus on relational data, but with the development of big data technology, the proliferation of data in different modalities, such as texts and images, has created application demands for cross-modal data. Cross-modal entity resolution has thus become a fundamental problem in big data processing and analysis. This study reviews the research progress of cross-modal entity resolution and introduces its definition and evaluation metrics. Then, taking the construction of inter-modal relationships and the maintenance of intra-modal relationships as the main thread, existing research results are surveyed. In addition, widely used methods are tested on different open datasets, and their differences and the reasons behind them are analyzed. Finally, the open problems in present research are summarized, and future research trends are outlined.
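One common family of evaluation metrics for resolution output is pairwise precision, recall, and F1 over predicted matching pairs. The helper below is a generic illustration of that style of metric, not a function taken from the survey.

```python
def resolution_metrics(predicted_pairs, true_pairs):
    """Pairwise precision/recall/F1 for entity resolution output."""
    predicted, truth = set(predicted_pairs), set(true_pairs)
    tp = len(predicted & truth)  # correctly matched pairs
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

For cross-modal data, each pair would link records from different modalities, e.g., an image ID matched to a text ID.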
2023, 34(12):5848-5861. DOI: 10.13328/j.cnki.jos.006803
Abstract:As a technology combining reversible data hiding and fragile watermarking, image reversible authentication (RA) can not only realize fragile authentication of images but also recover the original carrier image without distortion while extracting the authentication code. It is thus of great significance for authenticating the originality and integrity of images. Existing reversible authentication methods have low authentication accuracy and cannot effectively protect images with complex textures, or regions of complex texture within an image. To this end, this study proposes a new reversible authentication method. First, the image to be authenticated is divided into blocks, and the obtained sub-blocks are classified as differential blocks (DB) or shifting blocks (SB) according to their embedding capacity, with different reversible embedding methods employed to embed the authentication codes into the two types of blocks. A hierarchical embedding strategy is also adopted to increase embedding capacity and improve the authentication effect of each sub-block. On the authentication side, tamper detection and localization can be realized with the authentication code extracted from each sub-block. In addition, the method can be combined with morphological dilation and erosion to refine tamper detection marks and further improve detection accuracy. Experimental results show that the proposed method can protect both smooth-textured and complex-textured images at the same authentication accuracy and can realize independent authentication and restoration of almost all sub-blocks, so it is widely applicable.
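Block-level tamper localization can be sketched in a simplified form: a short hash over each block serves as its authentication code, and comparing stored codes against recomputed ones localizes tampering at block granularity. The paper’s reversible embedding machinery (DB/SB classification, hierarchical embedding, morphological refinement) is omitted; here the codes live in a side table purely for illustration, and the block size is an arbitrary choice.

```python
import hashlib

BLOCK = 4  # hypothetical block size

def block_codes(image):
    """image: 2D list of pixel values (0-255); returns {(row, col): code}."""
    h, w = len(image), len(image[0])
    codes = {}
    for r in range(0, h, BLOCK):
        for c in range(0, w, BLOCK):
            block = bytes(image[i][j]
                          for i in range(r, min(r + BLOCK, h))
                          for j in range(c, min(c + BLOCK, w)))
            codes[(r, c)] = hashlib.sha256(block).digest()[:4]
    return codes

def tampered_blocks(image, stored_codes):
    """Return positions of blocks whose authentication code mismatches."""
    fresh = block_codes(image)
    return sorted(pos for pos, code in stored_codes.items()
                  if fresh[pos] != code)
```

Changing even one pixel flips only that block’s code, which is what makes block-wise fragile authentication localize tampering.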
GAO He-Ran , WU Heng , XU Yuan-Jia , LI Xiu-He , WANG Tao , ZHANG Wen-Bo
2023, 34(12):5862-5886. DOI: 10.13328/j.cnki.jos.006800
Abstract:With the rapid growth and wider application of deep learning (DL), the scale of DL training continues to expand, and insufficient memory has become one of the major bottlenecks threatening DL availability. The memory swapping mechanism is the key mechanism for alleviating the memory problem of DL training. It leverages the time-varying memory requirements of DL training and moves data between the memory of a specific computing accelerator and external storage on demand, ensuring that DL training tasks can run by replacing an accumulated memory requirement with an instantaneous one. This study surveys the memory swapping mechanism for DL training from the perspective of time-varying memory requirements. Key studies on operator feature-based swapping-out mechanisms, data dependency-based swapping-in mechanisms, and efficiency-driven joint swapping-in and swapping-out decisions are summarized. Finally, future directions for this technology are pointed out.
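The surveyed idea can be sketched with a toy planner: tensor sizes and the operator access schedule are known ahead of time, so when device memory runs short the tensor whose next use is farthest in the future is swapped out (a Belady-style greedy choice), and a previously evicted tensor is swapped back in at the step that needs it again. The names and the policy are illustrative, not taken from any single surveyed system.

```python
def plan_swaps(access_seq, sizes, budget):
    """Return a list of ("out"/"in", tensor, step) swap events."""
    resident, used, swaps = set(), 0, []
    for step, t in enumerate(access_seq):
        if t not in resident:
            used += sizes[t]
            resident.add(t)
            # Re-materializing a previously evicted tensor is a swap-in.
            if step and t in {s[1] for s in swaps if s[0] == "out"}:
                swaps.append(("in", t, step))
        while used > budget:
            future = access_seq[step + 1:]
            # Evict the resident tensor (other than the current one)
            # whose next use lies farthest ahead.
            victim = max(
                (v for v in sorted(resident) if v != t),
                key=lambda v: future.index(v) if v in future else len(future) + 1,
            )
            resident.remove(victim)
            used -= sizes[victim]
            swaps.append(("out", victim, step))
    return swaps
```

Real systems additionally overlap swap traffic with computation and account for PCIe bandwidth, which this sketch ignores.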
WEN Ying-Ying , CHENG Guan-Jie , DENG Shui-Guang , YIN Jian-Wei
2023, 34(12):5887-5904. DOI: 10.13328/j.cnki.jos.006779
Abstract:With the development of cloud computing and service architectures such as software as a service (SaaS) and function as a service (FaaS), data centers, as service providers, constantly face resource management challenges: the quality of service (QoS) should be guaranteed while the resource cost is controlled. Accurately measuring computing power consumption has therefore become a key research issue for improving resource utilization and keeping load pressure within an acceptable range. Owing to mature virtualization technologies and evolving parallel technologies, the traditional estimation metric, CPU utilization, fails to account for interference caused by resource competition and thus loses accuracy. Since hyper-threading (HT) processors are now the mainstream in data centers, estimating the computing power of HT processors is an urgent problem. To address this estimation challenge, this study proposes the APU method, which estimates computing power consumption for HT processors on the basis of an understanding of the HT running mechanism and of thread behavior modeling. Considering that users with different authorities can access different system levels, two implementation schemes are put forward: one based on hardware support and the other based on the operating system (OS). The proposed method takes only CPU utilization as input, with no demand for other dimensions; it avoids the development and deployment costs of new monitoring tools and requires no special hardware architecture, which makes it universal and easy to apply. SPEC benchmarks further prove the effectiveness of the method: the estimation errors on three benchmarks are reduced from 20%, 50%, and 20% to less than 5%. To further prove its applicability, the APU method is applied to ByteDance clusters, and case studies show its effects.
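A toy model shows why plain CPU utilization overestimates delivered compute on hyper-threaded cores and how an HT-aware estimate corrects it. The speedup constant (two sibling threads together delivering about 1.3x one core) and the assumption that busy intervals of the two siblings overlap as much as possible are illustrative choices, not figures or formulas from the APU paper.

```python
HT_SPEEDUP = 1.3  # assumed: two co-running siblings ~1.3x one core

def naive_power(u1, u2):
    """Plain CPU utilization: treats each logical CPU as a full core."""
    return u1 + u2

def ht_aware_power(u1, u2):
    """Assume the siblings' busy intervals overlap maximally; during the
    overlap the pair delivers HT_SPEEDUP cores instead of 2."""
    overlap = min(u1, u2)
    exclusive = abs(u1 - u2)  # time when only one sibling is busy
    return overlap * HT_SPEEDUP + exclusive
```

When both logical CPUs report 100% utilization, the naive metric claims two cores of compute while the HT-aware estimate claims only 1.3, which is the kind of gap that causes the large estimation errors the paper reports.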
LI Guan-Bin , ZHANG Rui-Fei , LIU Meng-Meng , LIU Jing , LIN Liang
2023, 34(12):5905-5920. DOI: 10.13328/j.cnki.jos.006736
Abstract:Video description technology aims to automatically generate textual descriptions with rich content for videos, and it has attracted extensive research interest in recent years. An accurate and elaborate method for video description generation should not only achieve a global understanding of the video but also depend heavily on the local spatial and time-series features of specific salient objects. How to model a better video feature representation has always been an important but difficult part of video description tasks. In addition, most existing work regards a sentence as a chain structure and views a video description task as a process of generating a sequence of words, ignoring the semantic structure of the sentence. Consequently, the currently available algorithms cannot handle or optimize complex sentence descriptions or avoid the logical errors common in long generated sentences. To tackle these problems, this study proposes a novel generation method for interpretable video descriptions guided by language structure. Due consideration is given to both local object information and the semantic structure of the sentence by designing an attention-based structured tubelet localization mechanism. When incorporated with the parse tree constructed from the sentence, the proposed method can adaptively attend to corresponding spatial-temporal features with textual contents and further improve the performance of video description generation. Experimental results on mainstream benchmark datasets of video description tasks, i.e., Microsoft research video description corpus (MSVD) and Microsoft research video to text (MSR-VTT), show that the proposed approach achieves state-of-the-art performance on most evaluation metrics.
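The attention idea at the heart of such mechanisms can be sketched minimally: a word-derived query scores each spatio-temporal region feature by dot product, and the softmax weights pick out the regions most relevant to the word currently being generated. The vectors and scoring function below are illustrative; this is not the paper’s structured tubelet localization model.

```python
import math

def attend(query, region_features):
    """Return (softmax weights over regions, attended context vector)."""
    scores = [sum(q * r for q, r in zip(query, feat))
              for feat in region_features]
    m = max(scores)                      # stabilize the softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of region features gives the attended context vector.
    context = [sum(w * feat[d] for w, feat in zip(weights, region_features))
               for d in range(len(query))]
    return weights, context
```

In a tubelet setting, each `region_features` entry would describe one object tubelet across frames rather than a single static region.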
LI Hao , GU Jin-Yu , XIA Yu-Bin , ZANG Bin-Yu , CHEN Hai-Bo
2023, 34(12):5921-5939. DOI: 10.13328/j.cnki.jos.006762
Abstract:The extended Berkeley packet filter (eBPF) mechanism in the Linux kernel can safely load user-provided untrusted programs into the kernel. In the eBPF mechanism, a verifier checks these programs and ensures that they will not crash the kernel or access the kernel address space maliciously. In recent years, the eBPF mechanism has developed rapidly, and the verifier has become more complex as more and more new features are added. This study observes two problems with the complex eBPF verifier. One is the “false negative” problem: the complex security check logic of the verifier contains many bugs, and attackers can exploit these bugs to design malicious eBPF programs that pass the verifier and attack the kernel. The other is the “false positive” problem: since the verifier adopts a static checking method and lacks runtime information, it can only perform conservative checks, which may cause originally safe programs to fail verification and restricts programs to limited semantics, complicating eBPF program development. Further analysis shows that the static simulated-execution checking mechanism in the eBPF verifier involves a massive code base, high complexity, and conservative analysis, which are the main causes of the security vulnerabilities and false positives. Therefore, this study proposes to replace the static simulated-execution checking mechanism of the eBPF verifier with a lightweight dynamic checking method. The bugs and conservative checks that stem from simulated execution then no longer exist, so both the “false negative” and “false positive” problems are eliminated. Specifically, the eBPF program is run in a kernel sandbox, which dynamically checks the program’s memory accesses at runtime to prevent it from illegally accessing kernel memory.
For the efficient implementation of a lightweight kernel sandbox, Intel protection keys for supervisor (PKS), a new hardware feature, is used to perform zero-overhead memory access checks, and an efficient interaction method between the kernel and the eBPF program in the sandbox is presented. The evaluation results show that the proposed approach eliminates the memory security vulnerabilities of the eBPF verifier (this type of vulnerability has accounted for more than 60% of all eBPF verifier vulnerabilities since 2020). Moreover, in the scenario of high-throughput network packet processing, the performance overhead brought by the lightweight kernel sandbox is lower than 3%.
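The contrast with static verification can be illustrated with a small simulation: instead of proving every access safe ahead of time, each load or store is checked at runtime against the regions the program may touch. In the real system this check is performed at near-zero cost by the PKS hardware, not by software as in this sketch; the class and region layout here are invented for illustration.

```python
class SandboxViolation(Exception):
    pass

class Sandbox:
    def __init__(self, allowed_regions):
        # allowed_regions: list of (start, length) the program may access
        self.allowed = allowed_regions

    def check(self, addr, size):
        """Dynamic check of one memory access against allowed regions."""
        for start, length in self.allowed:
            if start <= addr and addr + size <= start + length:
                return True
        raise SandboxViolation(f"illegal access at {addr:#x}")

    def run(self, accesses):
        """accesses: iterable of (addr, size); each checked at runtime."""
        for addr, size in accesses:
            self.check(addr, size)
```

A program whose accesses all fall inside its regions runs to completion; a single out-of-bounds access is trapped the moment it happens rather than rejected conservatively in advance.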
CHEN Yu , ZHANG Sheng , JIN Yi-Bo , QIAN Zhu-Zhong , LU Sang-Lu
2023, 34(12):5940-5956. DOI: 10.13328/j.cnki.jos.006745
Abstract:In the past decade or so, artificial intelligence-related services and applications have boomed; they require high computing power, high bandwidth, and low latency. Edge computing is currently regarded as one of the most appropriate solutions for such applications, especially those related to video analysis. This study investigates multi-server multi-user heterogeneous video analysis task offloading, where users select appropriate edge servers and upload their raw video data to the servers for video analysis. It models this offloading problem as a multiplayer game, aiming to effectively manage the competition for and sharing of limited network resources among numerous users and to reach a stable network resource allocation in which no user has an incentive to unilaterally change its task offloading decision. With the optimization goal of minimizing the overall delay, this study successively investigates the non-distributed and distributed video analysis scenarios and accordingly proposes game theory-based algorithms for potentially optimal server selection and video unit allocation. Rigorous mathematical proof shows that the proposed algorithms reach a Nash equilibrium in both cases while guaranteeing a low overall delay. Finally, extensive experiments on real datasets show that, compared with other currently available algorithms, the proposed methods reduce the overall delay by 26.3% on average.
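The game-theoretic core can be sketched as best-response dynamics in a congestion game: each user’s delay on a server grows with how many users share it, and users take turns switching to their currently best server. In such a potential game, best-response dynamics converges to a Nash equilibrium, where no user can lower its delay by unilaterally switching. The linear delay model below is illustrative, not the paper’s.

```python
def delay(base, load):
    """Delay grows with the number of users sharing the server."""
    return base * load

def best_response_offloading(base_delays, n_users):
    choice = [0] * n_users          # initially everyone picks server 0
    changed = True
    while changed:
        changed = False
        for u in range(n_users):
            loads = [choice.count(s) for s in range(len(base_delays))]
            loads[choice[u]] -= 1   # remove u before evaluating its moves
            best = min(range(len(base_delays)),
                       key=lambda s: delay(base_delays[s], loads[s] + 1))
            if best != choice[u]:
                choice[u] = best
                changed = True
    return choice
```

With two identical servers and four users, the dynamics settles into an even 2-2 split, after which no user benefits from switching.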