• Volume 33, Issue 11, 2022 Table of Contents
    • Research and Improvements on Crow Search Algorithm for Feature Selection

      2022, 33(11):3903-3916. DOI: 10.13328/j.cnki.jos.006327

      Abstract (1794) HTML (1850) PDF 1.81 M (2777)

      Abstract: Feature selection is a hot issue in the field of machine learning. Meta-heuristic algorithms are an important class of feature selection methods, and their performance directly affects how well the problem is solved. The crow search algorithm (CSA) is a meta-heuristic algorithm inspired by the intelligent group behavior of crows. Because it is simple and efficient, many scholars have used it to solve the feature selection problem. However, CSA easily falls into local optima and converges slowly, which severely limits its solving ability. To address this problem, this study combines three operators, namely logistic chaotic mapping, opposition-based learning, and differential evolution, with the crow search algorithm and proposes the feature selection algorithm BICSA to select the optimal feature subset. In the experimental phase, the performance of BICSA is demonstrated on 16 data sets from the UCI database. Experimental results show that, compared with other feature selection algorithms, the feature subsets obtained by BICSA achieve higher classification accuracy and stronger dimensionality compression, indicating that BICSA is strongly competitive and well suited to feature selection problems.
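
      As a hedged illustration (not the paper's code), the sketch below shows two of the three operators named above in Python: logistic chaotic initialization of the population and opposition-based reflection. The differential evolution step, the binarization of positions, and all parameter values are omitted or assumed.

```python
import numpy as np

def logistic_init(n_agents, n_features, mu=4.0, seed=1):
    """Population initialization with the logistic chaotic map
    x_{t+1} = mu * x_t * (1 - x_t); chaotic sequences spread the initial
    crows over the search space more evenly than uniform sampling."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.1, 0.9, n_agents)      # avoid the map's fixed points
    pop = np.empty((n_agents, n_features))
    for j in range(n_features):
        x = mu * x * (1.0 - x)               # one chaotic step per dimension
        pop[:, j] = x
    return pop

def opposite(pop, lo=0.0, hi=1.0):
    """Opposition-based learning: reflect every agent to lo + hi - x;
    during search, the fitter of each (x, opposite) pair is kept."""
    return lo + hi - pop
```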

    • Sampling-based Lattice Reduction Algorithm for Subset Sum Problem

      2022, 33(11):3917-3929. DOI: 10.13328/j.cnki.jos.006328

      Abstract (843) HTML (2206) PDF 1.65 M (2380)

      Abstract: The subset sum problem is an important problem in computer science and the basis for constructing many public key cryptosystems. A random sampling method is proposed that decomposes the original problem into multiple smaller subset sum problems, reducing the dimension of the problem and the radius of the constructed lattice, thereby improving the efficiency of solving the shortest vector problem (SVP) and hence of reaching the solution. The theoretical worst-case success rate of the algorithm is given, and a possible method to improve the success rate is presented based on the 0-1 distribution of the solution vector: when the weight of the target solution is low, the solution vector is divided into sectors, imposing restrictive conditions on the problem to improve solving efficiency. Experimental results show that, for high-dimensional subset sum problems, compared with existing lattice reduction methods such as CJLOSS, the proposed algorithm finds the exact solution more efficiently and improves the approximation degree of approximate solutions: the average length of the output approximate solution is 0.55 times that of the CJLOSS algorithm and 0.64 times that of the DR algorithm.
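
      For context, here is a minimal sketch of the standard CJLOSS-style lattice basis that such reduction methods operate on; a {0,1}-solution corresponds to a short vector of the (scaled) lattice. The paper's actual contributions, random sampling into smaller sub-instances and sector-based restrictions, are not reproduced, and the penalty weight N is a placeholder.

```python
import numpy as np

def cjloss_basis(a, s, N=None):
    """Rows are basis vectors of the CJLOSS lattice for sum_i x_i*a[i] = s,
    x in {0,1}. Everything is doubled to keep the all-1/2 row integral:
    the coefficient vector (x_1..x_n, -1) yields a lattice vector with
    entries 2*x_i - 1 in {-1, +1} and a zero last coordinate."""
    n = len(a)
    N = N or int(np.ceil np.sqrt(n)) if False else (N or int(np.ceil(np.sqrt(n))) + 1)
    B = np.zeros((n + 1, n + 1), dtype=object)
    for i in range(n):
        B[i, i] = 2                  # 2 * identity part
        B[i, n] = 2 * N * a[i]       # weighted subset-sum column
    B[n, :n] = 1                     # the all-1/2 row, scaled by 2
    B[n, n] = 2 * N * s
    return B

B = cjloss_basis([3, 5, 8, 11], 14)  # 3 + 11 = 14 is a solution
```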

    • Code Completion Approach Based on Combination of Syntax and Semantics

      2022, 33(11):3930-3943. DOI: 10.13328/j.cnki.jos.006324

      Abstract (1982) HTML (1418) PDF 1.84 M (2234)

      Abstract: In the field of software engineering, code completion is one of the most useful technologies in integrated development environments (IDEs). It improves the efficiency of software development and has become an important technology for accelerating modern software development. By predicting class names, method names, keywords, and so on, code completion improves code conformance to specifications to a certain extent and reduces programmers' workload. In recent years, advances in artificial intelligence have promoted the development of code completion. In general, smart code completion trains a network on source code, learns code characteristics from the corpus, and makes recommendations and predictions based on the contextual code characteristics around the location to be completed. However, most existing code feature representations are based on program syntax and do not reflect the semantic information of the program, and the network structures currently used still cannot solve long-distance dependency problems in long code sequences. Therefore, this study proposes a method that characterizes code by program control dependency and syntax information, and treats code completion as an abstract syntax tree (AST) node prediction problem based on a temporal convolutional network (TCN). This network model learns the syntactic and semantic information of the program better and captures longer-range dependencies. The method is shown to be about 2.8% more accurate than existing methods.
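
      To make the TCN idea concrete, here is a minimal PyTorch sketch of one dilated causal convolution block of the kind TCNs stack; it is not the authors' architecture, and the channel size, kernel size, and depth are assumptions.

```python
import torch
import torch.nn as nn

class CausalBlock(nn.Module):
    """One dilated causal convolution block. Left-padding by
    (kernel_size - 1) * dilation keeps the convolution causal, so the
    prediction for an AST node never sees future nodes."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):                            # x: (batch, channels, seq)
        out = nn.functional.pad(x, (self.pad, 0))    # pad on the left only
        return self.relu(self.conv(out)) + x         # residual connection

# Stacking blocks with dilations 1, 2, 4, 8 doubles the receptive field
# at each layer, which is how TCNs reach long-range context.
tcn = nn.Sequential(*[CausalBlock(64, dilation=2 ** i) for i in range(4)])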

    • Energy Efficient Hybrid Swarm Intelligence Virtual Machine Consolidation Method

      2022, 33(11):3944-3966. DOI: 10.13328/j.cnki.jos.006330

      Abstract (1304) HTML (1296) PDF 3.01 M (3622)

      Abstract: Virtual machine (VM) consolidation for cloud data centers is one of the hottest research topics in cloud computing. It is challenging to minimize energy consumption while ensuring the QoS of hosts in cloud data centers, which is essentially an NP-hard multi-objective optimization problem. This study proposes an energy-efficient hybrid swarm intelligence virtual machine consolidation method (HSI-VMC) for heterogeneous cloud environments to address this issue, which includes a peak-efficiency-based static threshold overloaded host detection strategy (PEBST), a migration-ratio-based reallocated virtual machine selection strategy (MRB), a target host selection strategy, a hybrid discrete heuristic differential evolutionary particle swarm optimization virtual machine placement algorithm (HDH-DEPSO), and a load-average-based underloaded host processing strategy (AVG). Specifically, the combination of PEBST, MRB, and AVG detects the overloaded and underloaded hosts and selects appropriate virtual machines for migration to reduce SLAV and the number of virtual machine migrations. Also, HDH-DEPSO combines the advantages of DE and PSO to search for the best virtual machine placement solution, which effectively reduces the cluster's real-time power consumption. A series of experiments based on real cloud environment datasets (PlanetLab, Mix, and Gan) shows that HSI-VMC sharply reduces energy consumption while accommodating multiple QoS metrics, and outperforms several existing mainstream energy-aware virtual machine consolidation approaches.
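
      The sketch below gives one hedged reading of the overload-detection and VM-selection steps. PEBST and MRB are only named in the abstract, so the threshold value and the selection rule here are assumptions, not the paper's definitions.

```python
def find_overloaded(hosts, threshold=0.8):
    """Static-threshold overload check in the spirit of PEBST. `hosts`
    maps a host id to the CPU utilizations of its VMs; 0.8 is a
    placeholder, not the paper's peak-efficiency value."""
    return [h for h, vms in hosts.items() if sum(vms) > threshold]

def select_vm_to_migrate(vms, threshold=0.8):
    """A migration-ratio-flavored reading of MRB: prefer the smallest VM
    whose removal brings the host back under the threshold."""
    load = sum(vms)
    ok = [v for v in vms if load - v <= threshold]
    return min(ok) if ok else max(vms)

hosts = {"h1": [0.5, 0.3, 0.15], "h2": [0.2, 0.1]}
for h in find_overloaded(hosts):
    print(h, "-> migrate VM with load", select_vm_to_migrate(hosts[h]))
```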

    • Guiding Directed Grey-box Fuzzing by Target-oriented Valid Coverage

      2022, 33(11):3967-3982. DOI: 10.13328/j.cnki.jos.006331

      Abstract (1817) HTML (2370) PDF 1.97 M (3689)

      Abstract: Directed grey-box fuzzing measures the effectiveness of seeds at steering execution towards the target. In addition to the closeness between the triggered execution and the target code lines, the ability to explore diversified execution paths is also important to avoid local optima. Current directed grey-box fuzzing methods measure this capability by counting coverage over the whole program, but only part of the program is responsible for computing the target state. If a new seed brings only target-irrelevant state changes, it cannot enhance the queue for state exploration; worse, it may distract the fuzzer and waste time on exploring target-irrelevant code logic. To solve this problem, this study provides a valid-coverage-guided directed grey-box fuzzing method. Static program slicing is used to locate the code region that can affect the target state and to detect interesting seeds that bring new coverage differences in this region. By enlarging the energy of these seeds and reducing that of others (adjusting the power schedule), the fuzzer is guided to focus on seeds that help explore the different control flows the target depends on, mitigating the interference of redundant seeds. Experiments on the provided benchmark show that this strategy brings significant performance improvement to AFLGo.
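
      A minimal sketch of the power-schedule idea, assuming hypothetical per-seed fields for slice-coverage novelty and target distance; AFLGo's real schedule is considerably more elaborate.

```python
def assign_energy(seed, base=100, boost=4.0, damp=0.25):
    """Boost seeds whose executions produced new coverage inside the
    target-relevant slice; damp the rest. Both fields of `seed` are
    hypothetical stand-ins for what an AFLGo-style fuzzer tracks
    (slice-coverage novelty, distance to the target)."""
    factor = boost if seed["new_slice_cov"] else damp
    return max(1, int(base * factor / (1.0 + seed["distance"])))

print(assign_energy({"new_slice_cov": True,  "distance": 0.5}))   # 266
print(assign_energy({"new_slice_cov": False, "distance": 0.5}))   # 16
```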

    • User Feedback in Firefox Bug Tracking System

      2022, 33(11):3983-4007. DOI: 10.13328/j.cnki.jos.006332

      Abstract (681) HTML (1794) PDF 2.99 M (1873)

      Abstract: Bug tracking systems are a vital part of software project management. They are a necessary means to ensure the smooth development of modern large-scale open source software and to continuously improve software quality. Most open source software ecosystems currently use open bug tracking systems to manage software bugs, allowing users to submit system failures (called defect bugs) and suggestions for system improvement (called enhancement bugs), but the role of this user feedback has not been fully studied. Therefore, this work conducted an empirical study on the bug tracking system used by Firefox, collecting 19 474 and 3 057 bug reports submitted in 2018 and 2019 for Firefox Desktop and Firefox for Android, respectively. On this basis, the differences between bugs submitted by ordinary users and by core developers are compared and analyzed in terms of number, severity, distribution over components, fixing rate, fixing efficiency, and assignees; at the same time, the relationship between the quality of bug reports and the fixing rate and efficiency is investigated. The main findings are as follows. (1) There are a large number of ordinary users, but their participation remains superficial: 86% of ordinary users have submitted only one bug, and no more than 3% of their bugs are of high severity. (2) The bugs submitted by ordinary users are mainly distributed over UI components related to user interaction (e.g., address bar, audio/video, etc.), but 43% of their bugs are difficult to locate accurately due to the lack of sufficient description information. (3) In terms of processing results, due to the simple design of the duplicate-checking and bug-filing systems, a large number of bugs are treated as "useless" ones, and the fixing rate is less than 10%. (4) In the fixing process, because ordinary users find it difficult to describe bugs accurately and fully, and the system does not pay enough attention to them, the handling of bugs submitted by ordinary users is more complicated than that of core developers, and fixing them takes at least 8 more days on average. These results reveal the shortcomings of the current bug tracking system in terms of user participation incentives, automatic duplicate checking, and intelligent assistance in filling out bug reports, and can help system developers and managers improve the system and enhance the contributions of ordinary users to open source software.

    • Fine-grained Bug Location Method Based on Source Code Extension Information

      2022, 33(11):4008-4026. DOI: 10.13328/j.cnki.jos.006339

      Abstract (1711) HTML (1391) PDF 2.32 M (3904)

      Abstract: Bug location based on information retrieval (IR) uses cross-language semantic similarity to construct a retrieval model that locates source code errors from bug reports. However, traditional IR-based bug location treats code as plain text and uses only the lexical semantic information of the source code, which causes low accuracy in fine-grained bug location due to the missing semantics of candidate code, and the usefulness of the results needs to be improved. By analyzing the relationship between code changes and bug generation in the scenario of program evolution, this study proposes a fine-grained bug location method based on source code extension information, in which the explicit semantic information of the code vocabulary and the implicit information of code execution are used to enrich source code semantics. Around the candidate location points, the semantic context is used to enrich the code semantics, and the structural semantics of the code's intermediate-language execution is used to make fine-grained code distinguishable. Meanwhile, natural language semantics guides the generation of the code language representation through an attention mechanism, and the semantic mapping between fine-grained code and natural language is established, implementing the fine-grained bug location method FlowLocator. Experimental results show that, compared with classical IR bug location methods, the location accuracy of this method is significantly improved in terms of Top-N rank, mean average precision (MAP), and mean reciprocal rank (MRR).
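
      For orientation, the classic IR baseline that such methods improve on can be written in a few lines; FlowLocator's enrichment with semantic context and intermediate-language execution semantics is not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_methods(bug_report, method_texts):
    """Rank candidate methods against a bug report by TF-IDF cosine
    similarity: the plain-text IR baseline that treats code as text."""
    vec = TfidfVectorizer(lowercase=True, token_pattern=r"[A-Za-z_]\w*")
    matrix = vec.fit_transform(method_texts + [bug_report])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    return sorted(enumerate(scores), key=lambda p: -p[1])

print(rank_methods("crash when parsing empty config file",
                   ["def parse_config(path): ...",
                    "def render_view(widget): ..."]))
```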

    • Automatic Code Semantic Tag Generation Approach Based on Software Knowledge Graph

      2022, 33(11):4027-4045. DOI: 10.13328/j.cnki.jos.006369

      Abstract (2486) HTML (3522) PDF 2.93 M (4266)

      Abstract: Code snippets in open-source and enterprise software projects and those posted on various software development websites are important software development resources. However, developers' needs in code search often reflect high-level intentions and topics, which are difficult to satisfy with code search techniques based on information retrieval. It is thus highly desirable that code snippets be accompanied by semantic tags reflecting their high-level intentions and topics to facilitate code search and understanding. Existing tag generation technologies are mainly oriented to text content or rely on historical data, and cannot meet the needs of large-scale code semantic annotation and assisted code search and understanding. To address this issue, this study proposes an approach based on a software knowledge graph (called KGCodeTagger) that automatically generates semantic tags for code snippets. KGCodeTagger constructs a software knowledge graph from concepts and relations extracted from API documentation and software development Q&A text, and uses the knowledge graph as the basis of code semantic tag generation. Given a code snippet, KGCodeTagger identifies and extracts API invocations and concept mentions, and then links them to the corresponding concepts in the software knowledge graph. On this basis, the approach identifies other concepts related to the linked concepts as candidates and selects semantic tags from the relevant concepts based on diversity and representativeness. The knowledge graph construction steps of KGCodeTagger and the quality of the generated code tags are evaluated. The results show that KGCodeTagger can produce a high-quality and meaningful software knowledge graph and code semantic tags, which can help developers quickly understand the intention of the code.
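
      The diversity-versus-representativeness selection can be illustrated with a greedy MMR-style sketch; the paper's exact scoring is not given in the abstract, so `relevance`, `sim`, and the weight `lam` are assumed inputs.

```python
def select_tags(relevance, sim, k=3, lam=0.7):
    """Greedily pick tags that are relevant to the code's linked concepts
    (representativeness) while penalizing similarity to tags already
    chosen (diversity)."""
    chosen = []
    while len(chosen) < min(k, len(relevance)):
        def score(t):
            redundancy = max((sim(t, c) for c in chosen), default=0.0)
            return lam * relevance[t] - (1 - lam) * redundancy
        best = max((t for t in relevance if t not in chosen), key=score)
        chosen.append(best)
    return chosen

rel = {"serialization": 0.9, "json": 0.85, "file-io": 0.4}
print(select_tags(rel, sim=lambda a, b: 0.8 if {a, b} == {"serialization", "json"} else 0.1))
```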

    • God Class Detection Approach Based on Graph Model and Isolation Forest

      2022, 33(11):4046-4060. DOI: 10.13328/j.cnki.jos.006373

      Abstract (1710) HTML (2074) PDF 2.07 M (3313)

      Abstract: A God class is a class that carries heavy tasks and responsibilities. Its common features are that it contains a large number of attributes and methods and has multiple dependencies with other classes in the system. The God class is a typical code smell that negatively affects software development and maintenance. In recent years, many studies have been devoted to discovering or refactoring God classes; however, the detection ability of existing methods is not strong, and the detection precision is not high enough. This study proposes a God class detection approach based on a graph model and the isolation forest algorithm, which is divided into two stages: graph structure information analysis and intra-class metric evaluation. In the first stage, inter-class method call graphs and intra-class structure graphs are established, and the isolation forest algorithm is used to narrow the detection range for God classes. In the second stage, the impact of the scale and architecture of the project is taken into account: the project-wide average of the God-class-related metrics is used as the benchmark, an experiment is designed to determine the scale factors, and the product of the average value and the scale factors is used as the detection threshold to obtain the God class detection result. Experimental results on the code smell benchmark data set show that the proposed method improves precision and F1 by 25.8 and 33.39 percentage points, respectively, compared with an existing God class detection method, while maintaining a high recall.
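
      A minimal sketch of the isolation forest stage over per-class metric vectors using scikit-learn; the metric set and contamination rate are illustrative assumptions, and the paper's graph construction step is omitted.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Each row is one class: [num_methods, num_attributes, fan_out, LOC].
metrics = np.array([
    [12,  8,  5,  300],
    [10,  6,  4,  250],
    [95, 60, 40, 4200],   # an outlier-looking class
    [ 8,  5,  3,  180],
])
forest = IsolationForest(contamination=0.1, random_state=0).fit(metrics)
suspects = np.where(forest.predict(metrics) == -1)[0]   # -1 marks anomalies
print("God-class candidates (row indices):", suspects)
```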

    • FaaS Migration Approach for Monolithic Applications Based on Dynamic and Static Analysis

      2022, 33(11):4061-4083. DOI: 10.13328/j.cnki.jos.006377

      Abstract (1394) HTML (2023) PDF 2.95 M (3001)

      Abstract: As a typical form of the serverless architecture, the function as a service (FaaS) architecture abstracts business logic into fine-grained functions and provides automatic operation and maintenance functionality such as auto-scaling, which can greatly reduce operation and maintenance costs. Some highly concurrent, highly available, and highly flexible services (such as payment and red packets) in many online service systems have been migrated to FaaS platforms, but a large number of traditional monolithic applications still find it difficult to take advantage of the FaaS architecture. To solve this problem, a FaaS migration approach for monolithic applications based on dynamic and static analysis is proposed in this study. The approach identifies and strips out the implementation code and dependencies of a specified monolithic application API by combining dynamic and static analysis, and then completes the code refactoring according to a function template. To address the cold-start problem of functions in high-concurrency scenarios, the approach uses a master-slave multithreaded Reactor model based on IO multiplexing to optimize the function template and improve the concurrency processing capability of a single function instance. Based on this approach, Codext, a prototype tool for the Java language, is implemented, and experimental verification is carried out on OpenFaaS, an open source serverless platform, with four open source monolithic applications.
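
      A single-threaded Python sketch of the IO-multiplexing Reactor idea using the standard selectors module; the paper's master-slave multithreaded variant additionally splits accepting and IO handling across threads, which is not shown here.

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def accept(server):
    conn, _ = server.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle)

def handle(conn):
    data = conn.recv(4096)
    if data:
        conn.sendall(data)        # placeholder for invoking the function body
    else:
        sel.unregister(conn)
        conn.close()

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 8080))
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept)

# Event loop: one thread multiplexes all connections.
# while True:
#     for key, _ in sel.select():
#         key.data(key.fileobj)
```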

    • Review Articles
    • Overview on Parallel Execution Models of Smart Contract Transactions in Blockchains

      2022, 33(11):4084-4106. DOI: 10.13328/j.cnki.jos.006528

      Abstract (2524) HTML (4602) PDF 18.15 M (5305)

      Abstract: Blockchains such as Ethereum serially execute the smart contract transactions in a block, which strictly guarantees the consistency of the blockchain state between nodes after execution but has become a serious bottleneck restricting the throughput of these blockchains. Therefore, using parallel methods to optimize the execution of smart contract transactions has gradually become a focus of industry and academia. This study summarizes the research progress on parallel execution methods for smart contracts in blockchains and proposes a research framework. From the perspective of the phases of parallel execution, the framework condenses four parallel execution models of smart contracts, namely the parallel execution model based on static analysis, the parallel execution model based on dynamic analysis, the inter-node parallel execution model, and the divide-and-conquer parallel execution model, and describes the typical parallel execution methods under each model. Finally, this study discusses factors affecting parallel execution, such as the transaction dependency graph and concurrency control strategies, and proposes future research directions.

    • Survey on Generating Database Queries Based on Natural Language

      2022, 33(11):4107-4136. DOI: 10.13328/j.cnki.jos.006539

      Abstract (1655) HTML (4892) PDF 28.22 M (4781)

      Abstract: Databases provide efficient storage of and access to massive data. However, it is nontrivial for non-experts to command a database query language such as SQL, which is essential for querying databases. Hence, querying databases using natural language (i.e., text-to-SQL) has received extensive attention in recent years. This study provides a holistic view of text-to-SQL technologies and elaborates on current advancements. It first introduces the background of the research and describes the research problem. Then the study focuses on current text-to-SQL technologies, including pipeline-based methods, statistical-learning-based methods, as well as techniques developed for the multi-turn text-to-SQL task. The study goes further to discuss the field of semantic parsing, to which text-to-SQL belongs. Afterward, it introduces the benchmarks and evaluation metrics widely used in the research field. Moreover, it compares and analyzes the state-of-the-art models from multiple perspectives. Finally, the study summarizes the potential challenges of the text-to-SQL task and gives some suggestions for future research.

    • Binary Code Similarity Analysis and Its Applications on Embedded Device Firmware Vulnerability Search

      2022, 33(11):4137-4172. DOI: 10.13328/j.cnki.jos.006540

      Abstract (1607) HTML (2672) PDF 30.70 M (3370)

      Abstract: In today's Internet of Things era, embedded systems are becoming important components for accessing the cloud and are frequently used in both security- and privacy-sensitive applications and devices. However, the underlying software (a.k.a. firmware) often suffers from a wide range of security vulnerabilities. The complexity and heterogeneity of the underlying hardware platforms, the differences in hardware and software implementations, the platform specificity and limited documentation, together with the limited running environment, make many excellent dynamic testing tools for desktop systems hard or even impossible to adapt directly to embedded device/firmware environments. In recent years, researchers have made great progress in detecting well-known vulnerabilities in embedded device firmware based on binary code similarity analysis. Focusing on the key technical challenges of binary code similarity analysis, this study systematically surveys existing binary code similarity analysis technologies, and comprehensively analyzes and compares their general process, technical characteristics, and evaluation criteria. Then, the application of these technologies in the field of embedded device firmware vulnerability search is analyzed and summarized. Finally, some technical challenges in this field are presented, and some open future research directions are proposed for related researchers.

    • Review Articles
    • Overview on 2D Human Pose Estimation Based on Deep Learning

      2022, 33(11):4173-4191. DOI: 10.13328/j.cnki.jos.006390

      Abstract (3309) HTML (5851) PDF 2.91 M (12383)

      Abstract: Human pose estimation is a basic and challenging task in the field of computer vision. It is the basis for many computer vision tasks, such as action recognition and action detection. With the development of deep learning methods, deep learning-based human pose estimation algorithms have shown excellent results. This study divides pose estimation methods into three categories: single-person pose estimation, top-down multi-person pose estimation, and bottom-up multi-person pose estimation. The development of 2D human pose estimation algorithms in recent years is introduced, the current challenges of 2D human pose estimation are discussed, and finally an outlook on the future development of human pose estimation is given.

    • Survey on Joint Modeling Algorithms for Spoken Language Understanding Based on Deep Learning

      2022, 33(11):4192-4216. DOI: 10.13328/j.cnki.jos.006385

      Abstract (1738) HTML (4337) PDF 3.19 M (4618)

      Abstract: Spoken language understanding is one of the hot research topics in the field of natural language processing. It is applied in many fields, such as personal assistants, intelligent customer service, human-computer dialogue, and medical treatment. Spoken language understanding technology converts the natural language input by the user into a semantic representation, and mainly includes two sub-tasks: intent recognition and slot filling. At this stage, deep joint modeling of the intent recognition and slot filling tasks has become the mainstream of spoken language understanding and has achieved sound results. Summarizing and analyzing deep learning-based joint modeling algorithms for spoken language understanding is therefore of great significance. This study first introduces related work on applying deep learning technology to spoken language understanding, then analyzes existing research from the perspective of the relationship between intent recognition and slot filling, and compares and summarizes the experimental results of different models. Finally, the challenges that future research may face are discussed.

    • Survey on Deep Reinforcement Learning Methods Based on Sample Efficiency Optimization

      2022, 33(11):4217-4238. DOI: 10.13328/j.cnki.jos.006391

      Abstract (2283) HTML (4462) PDF 2.56 M (5906)

      Abstract: Deep reinforcement learning combines the representation ability of deep learning with the decision-making ability of reinforcement learning, and has attracted great research interest due to its remarkable effect in complex control tasks. This study classifies model-free deep reinforcement learning methods into Q-value function methods and policy gradient methods according to whether the Bellman equation is used, and introduces the two kinds of methods in terms of model structure, optimization process, and evaluation. Regarding the low sample efficiency problem in deep reinforcement learning, this study shows that, from the perspective of model structure, the overestimation problem in Q-value function methods and the unbiased sampling constraint in policy gradient methods are the main factors affecting sample efficiency. Then, from the perspectives of enhancing exploration efficiency and improving sample exploitation, this study summarizes various feasible optimization methods according to recent research hotspots and trends, analyzes the advantages and remaining problems of related methods, and compares them by scope of application and optimization effect. Finally, this study proposes enhancing the generality of optimization methods, exploring the migration of optimization mechanisms between the two kinds of methods, and improving theoretical completeness as future research directions.

    • Multi-turn Dialogue Generation Model with Dialogue Structure

      2022, 33(11):4239-4250. DOI: 10.13328/j.cnki.jos.006340

      Abstract (1653) HTML (2014) PDF 1.92 M (4385)

      Abstract: Recent research on multi-turn dialogue generation has focused on RNN- or Transformer-based encoder-decoder architectures. However, most of these models ignore the influence of dialogue structure on dialogue generation. To solve this problem, this study proposes to use a graph neural network to model dialogue structure information, thus effectively describing the complex logic within a dialogue. A text-similarity-based relation structure, a turn-switching-based relation structure, and a speaker-based relation structure are proposed for dialogue generation, and the graph neural network realizes information transmission and iteration over the dialogue context. Extensive experiments on the DailyDialog dataset show that the proposed model consistently outperforms other baseline models on many metrics, indicating that the proposed model with a graph neural network can effectively describe various correlation structures in dialogue, thus contributing to high-quality dialogue response generation.
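
      A minimal sketch of how two of the three proposed relation structures can be derived from per-utterance speaker ids; the text-similarity relation and the graph neural network itself are omitted.

```python
def dialogue_edges(speakers):
    """Build 'same-speaker' edges between all utterances by one speaker
    and 'turn-switching' edges between consecutive utterances by
    different speakers."""
    same = [(i, j) for i in range(len(speakers))
                   for j in range(i + 1, len(speakers))
                   if speakers[i] == speakers[j]]
    switch = [(i, i + 1) for i in range(len(speakers) - 1)
              if speakers[i] != speakers[i + 1]]
    return same, switch

print(dialogue_edges(["A", "B", "A", "B"]))
# ([(0, 2), (1, 3)], [(0, 1), (1, 2), (2, 3)])
```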

    • Interval-valued Intuitionistic Fuzzy Knowledge Measure with Applications Based on Hamming-Hausdorff Distance

      2022, 33(11):4251-4267. DOI: 10.13328/j.cnki.jos.006333

      Abstract (511) HTML (1435) PDF 2.89 M (2141)

      Abstract: A Hamming-Hausdorff distance-based interval-valued intuitionistic fuzzy knowledge measure (IVIFKM) is presented in this paper, upon which a methodology for image thresholding is based so as to achieve better segmentation results. The latest research shows that there are two significant facets of knowledge measurement associated with an intuitionistic fuzzy set (IFS), i.e., the information content and the information clarity. With this understanding, a novel axiomatic system of IVIFKM is proposed. The normalized Hamming-Hausdorff distance is also improved and extended. Combined with the technique for order preference by similarity to ideal solution (TOPSIS), a novel IVIFKM is then established, fully complying with the requirements of the developed axiomatic system. The proposed measure is subsequently applied to image thresholding. Given the structural features of an interval-valued IFS (IVIFS), a more effective classification rule for pixels and a more efficient algorithm for the interval-valued intuitionistic fuzzification of an image are suggested. The developed measure is finally used to calculate the amount of knowledge associated with the image to determine the best threshold for segmentation. Experimental results show that the developed knowledge-driven methodology, characterized by high stability and reliability, produces much more satisfactory binary images with excellent performance metrics, routinely outperforming other thresholding methods. This work introduces the latest IVIFKM theory into the field of image processing, providing a concrete instance of the potential applications of this theory in other related areas.
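
      For orientation, one common form of the normalized Hamming-Hausdorff distance between two IVIFSs A and B over X = {x_1, ..., x_n}, with membership intervals [μ^L, μ^U] and non-membership intervals [ν^L, ν^U], is shown below; the paper improves and extends this distance, so the exact variant used there may differ.

```latex
d_{HH}(A,B) = \frac{1}{2n}\sum_{i=1}^{n}\Big[
  \max\!\big(\lvert \mu_A^{L}(x_i)-\mu_B^{L}(x_i)\rvert,\,
             \lvert \nu_A^{L}(x_i)-\nu_B^{L}(x_i)\rvert\big)
+ \max\!\big(\lvert \mu_A^{U}(x_i)-\mu_B^{U}(x_i)\rvert,\,
             \lvert \nu_A^{U}(x_i)-\nu_B^{U}(x_i)\rvert\big)\Big]
```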

    • Feature Generation Approach with Indirect Domain Adaptation for Transductive Zero-shot Learning

      2022, 33(11):4268-4284. DOI: 10.13328/j.cnki.jos.006336

      Abstract (756) HTML (1596) PDF 3.92 M (2212)

      Abstract: In recent years, zero-shot learning has attracted extensive attention in machine learning and computer vision. Conventional inductive zero-shot learning attempts to establish mappings between semantic and visual features for transferring knowledge between classes. However, such approaches suffer from the projection domain shift between the seen and unseen classes. Transductive zero-shot learning was proposed to alleviate this issue by leveraging unlabeled unseen-class data for domain adaptation in the training stage. Unfortunately, empirical study finds that transductive zero-shot learning approaches that optimize the semantic mapping and domain adaptation in visual feature space simultaneously are prone to being trapped in "mutual restriction", which limits the potential of both steps. To address this issue, a novel transductive zero-shot learning approach named feature generation with indirect domain adaptation (FG-IDA) is proposed, which conducts the semantic mapping and domain adaptation in order and optimizes the two steps in different spaces separately, so as to release their performance potential and further improve zero-shot recognition accuracy. FG-IDA is evaluated on four benchmarks, namely CUB, AWA1, AWA2, and SUN. The experimental results demonstrate the superiority of the proposed method over other transductive zero-shot learning approaches and show that FG-IDA achieves state-of-the-art performance on the CUB, AWA1, and AWA2 datasets. Moreover, a detailed ablation analysis empirically proves the existence of the "mutual restriction" effect in direct domain adaptation-based transductive zero-shot learning approaches and the effectiveness of the indirect domain adaptation idea.

    • Verifiable Encrypted Medical Data Aggregation and Statistical Analysis Scheme

      2022, 33(11):4285-4304. DOI: 10.13328/j.cnki.jos.006343

      Abstract (700) HTML (1581) PDF 2.53 M (2110)

      Abstract: Due to the fast development of mobile communication networks, more and more wearable devices access the network through mobile terminals and produce massive data. The aggregated medical data have significant value for statistical analysis and decision making. Nevertheless, security and privacy issues (e.g., transmission interruption, information leakage, and data tampering) emerge in the medical data transmission and aggregation process. To address these security issues and ensure accurate medical data aggregation and analysis, an efficient verifiable fault-tolerant medical data aggregation scheme based on mobile edge service computing is proposed. The scheme exploits a modified BGN homomorphic encryption algorithm and integrates the Shamir secret sharing mechanism to simultaneously ensure the confidentiality of medical data and the fault tolerance of encrypted aggregated data. The concept of mobile edge-assisted service computing in wireless body area networks is introduced in the scheme: combining the advantages of mobile edge computing and cloud computing, real-time processing and statistical analysis of massive medical big data can be conducted. Through edge-level and cloud-level aggregation, aggregation efficiency is improved and communication overhead is reduced. Besides, the scheme designs an aggregate signature algorithm to batch-verify encrypted medical data and to guarantee integrity during transmission and storage. The comprehensive performance evaluation demonstrates that the proposed scheme has outstanding advantages in terms of computational cost and communication overhead.
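
      The fault-tolerance building block named above, Shamir secret sharing, can be sketched in a few lines of Python; the (modified) BGN encryption, aggregation, and signature layers are omitted, and the prime field and parameters are illustrative.

```python
import random

P = 2**61 - 1  # a Mersenne prime; field parameters are illustrative

def share(secret, k, n):
    """Split `secret` into n shares; any k of them reconstruct it, so up
    to n - k parties may fail without losing the aggregate."""
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

assert reconstruct(share(1234, k=3, n=5)[:3]) == 1234
```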

    • Traceable Universal Designated Verifier Signature Proof Scheme

      2022, 33(11):4305-4315. DOI: 10.13328/j.cnki.jos.006317

      Abstract (520) HTML (1482) PDF 1.67 M (1965)

      Abstract: To solve the unfairness to the verifier in traditional universal designated verifier signature proof schemes caused by their strong privacy-preserving property, the notion of traceable universal designated verifier signature proof (TUDVSP) is proposed. In this new kind of conditional privacy-preserving authentication scheme, a tracing center is introduced that can recover the transformed signature to the original one, thus preventing the signer from colluding with the delegator to cheat the verifier. Based on considerations of real-world applications, a security model comprising unforgeability, security against impersonation attacks, and traceability is proposed for TUDVSP schemes. Using bilinear maps, a concrete TUDVSP scheme is constructed, and its unforgeability, security against impersonation attacks, and traceability are proved. The experimental results indicate that the scheme takes only about 21 ms of computation cost and 120 bytes of communication overhead.

    • Secure Sorting Protocols and Their Applications

      2022, 33(11):4316-4333. DOI: 10.13328/j.cnki.jos.006326

      Abstract (586) HTML (1599) PDF 2.07 M (1842)

      Abstract: Secure multi-party computation (SMC) has been a focus of the international cryptographic community in recent years. Sorting is a basic data operation and a fundamental problem of algorithm design and analysis. Secure multiparty sorting is a generalization of the millionaires' problem and a basic problem of SMC. It can be extensively used in scientific decision-making, e-commerce recommendation, electronic auction and bidding, anonymous voting, privacy-preserving data mining, etc. Most existing solutions to the sorting problem are applicable only when the range of the private data is known and small; if the data range is unknown they do not work, and if it is very large they are very inefficient. In practice, however, many application scenarios fall into these categories. To privately sort data when the data range is unknown or very large, two protocols are first proposed for the case where the data range is small or known: one in which equal data receive the same order, and one in which equal data receive different orders. These protocols are then used as building blocks to design schemes that solve the sorting problem when the data range is unknown or very large. The proposed secure sorting protocols can serve as building blocks for many practical problems that inherently need sorting; based on them, a secure and efficient Vickrey auction protocol is designed. An encoding technique and the threshold-decryption ElGamal cryptosystem are flexibly used in designing these protocols. Using the simulation paradigm, the protocols are proved secure in the semi-honest model. Finally, the efficiency of the protocols is tested, and the experimental results show that they are efficient.

    • Review Articles
    • State-of-the-art Survey on Deterministic Transmission Technologies in Time-sensitive Networking

      2022, 33(11):4334-4355. DOI: 10.13328/j.cnki.jos.006524

      Abstract (3469) HTML (4615) PDF 19.42 M (5762)

      Abstract: Time-sensitive networking (TSN) is an important research area for upgrading the infrastructure of the industrial Internet of Things. Deterministic transmission in TSN is the key technology, mainly including time-triggered scheduling in the control plane, mixed-criticality transmission, and deterministic delay analysis, to support the deterministic real-time transmission requirements of industrial control. This study surveys related work on deterministic transmission technologies of TSN in recent years and systematically organizes and summarizes it. First, this study presents the transmission models of the different kinds of flows in TSN. Second, based on these models, it presents, on the one hand, the time-triggered scheduling model together with its research status and open challenges in the control plane, and on the other hand, the architecture of TSN switches, the strategies of mixed-criticality transmission with their disadvantages, and the corresponding improvement approaches. Third, this study models the transmission delay of the whole TSN based on network calculus and presents delay analysis methods, their research status, and possible improvement directions. Finally, this study summarizes the challenges and research prospects of deterministic transmission technologies in TSN.

    • Rotation-invariant Deep Hierarchical Cluster Network for Point Cloud Analysis

      2022, 33(11):4356-4378. DOI: 10.13328/j.cnki.jos.006315

      Abstract (944) HTML (1457) PDF 3.40 M (1825)

      Abstract: In recent years, with the permutation invariance of point clouds solved, point cloud based deep learning methods have achieved great breakthroughs. A point cloud is adopted as input data to describe a 3D object, and a neural network is employed to extract features from the point cloud. However, existing methods cannot solve the rotation-invariance problem, so existing models are poorly robust to rotation. Meanwhile, existing methods merely design the hierarchical structure of the neural network by prior knowledge, and none of them has made an effort to explore the geometric structure underlying the point cloud, which tends to limit network capacity. For these reasons, a rotation-invariant point cloud representation and a hierarchical cluster network are proposed, attempting to solve the above two problems in both theoretical and practical ways. Extensive experiments have shown that the proposed method greatly outperforms the state of the art in rotation robustness on rotation-augmented 3D object classification, object part segmentation, and object semantic segmentation benchmarks.
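
      A minimal sketch of the rotation-invariance principle, assuming nothing about the paper's actual representation: features built from pairwise distances are unchanged by any rotation or translation by construction.

```python
import numpy as np

def rigid_invariant_features(points, k=8):
    """For each point, return the distances to its k nearest neighbors
    (requires more than k points). Rigid motions preserve distances, so
    these features are SO(3)-invariant by construction."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, 1:k + 1]       # skip self (distance 0)
    return np.take_along_axis(d, idx, axis=1)     # shape: (n_points, k)

pts = np.random.default_rng(0).normal(size=(32, 3))
feats = rigid_invariant_features(pts)
# Applying any rotation matrix R to pts leaves feats unchanged.
```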

    • Food Image Recognition via Multi-scale Jigsaw and Reconstruction Network

      2022, 33(11):4379-4395. DOI: 10.13328/j.cnki.jos.006325

      Abstract (1041) HTML (1686) PDF 2.37 M (2053)

      Abstract: Recently, food image recognition has received more and more attention for its wide applications in healthy diet management, smart restaurants, and so on. Unlike other object recognition tasks, food images are fine-grained, with high intra-class variability and inter-class similarity. Furthermore, food images have no fixed semantic patterns or specific spatial layout, which makes food recognition more challenging. This study proposes a multi-scale jigsaw and reconstruction network (MJR-Net) for food recognition. MJR-Net is composed of three parts. The jigsaw and reconstruction module uses destruction and reconstruction learning to destroy and reconstruct the original image in order to extract local discriminative details. The feature pyramid module fuses mid-level features of different sizes to capture multi-scale local discriminative features. The channel-wise attention module models the importance of different feature channels to enhance the discriminative visual patterns and weaken the noisy ones. The study also uses A-softmax loss and focal loss to optimize the network by increasing inter-class variability and reweighting samples, respectively. MJR-Net is evaluated on three food datasets (ETH Food-101, Vireo Food-172, and ISIA Food-500), achieving 90.82%, 91.37%, and 64.95% accuracy, respectively. Experimental results show that, compared with other food recognition methods, MJR-Net is highly competitive and achieves state-of-the-art recognition performance on Vireo Food-172 and ISIA Food-500. Comprehensive ablation studies and visual analysis also prove the effectiveness of the proposed method.
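
      A minimal sketch of the "destruction" step of destruction-and-reconstruction learning, assuming image sides divisible by the grid size; the reconstruction branch and multi-scale scheduling are omitted.

```python
import numpy as np

def jigsaw(image, grid=4, rng=None):
    """Split the image into grid x grid patches and shuffle them, forcing
    the network to rely on local discriminative details rather than global
    layout. Multi-scale variants simply vary `grid`."""
    rng = rng or np.random.default_rng(0)
    h, w = image.shape[0] // grid, image.shape[1] // grid
    patches = [image[i * h:(i + 1) * h, j * w:(j + 1) * w]
               for i in range(grid) for j in range(grid)]
    order = rng.permutation(len(patches))
    rows = [np.concatenate([patches[k] for k in order[r * grid:(r + 1) * grid]], axis=1)
            for r in range(grid)]
    return np.concatenate(rows, axis=0)

img = np.arange(64).reshape(8, 8)
print(jigsaw(img, grid=2).shape)   # (8, 8), with patches shuffled
```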

    • Distributed Edge Caching Scheme Using Non-cooperative Game

      2022, 33(11):4396-4409. DOI: 10.13328/j.cnki.jos.006322

      Abstract (681) HTML (1581) PDF 1.86 M (1954)

      Abstract: Due to the rapid growth of multimedia data traffic, the traditional cloud computing model is greatly challenged in satisfying users' demands for low latency and high bandwidth, so edge computing is becoming an emerging computing paradigm. The computing capacity of edge devices such as base stations, and the short distance between users and base stations, enable users to obtain higher service quality. It remains a challenging problem to design an edge caching strategy based on the relationship between the benefits and costs of edge nodes. Using 5G and collaborative edge computing technology, in large-scale short-video application scenarios, this study proposes a collaborative edge caching technology that simultaneously solves the following three problems: (1) improving users' service experience by reducing transmission delay; (2) reducing the data transmission pressure on the backbone network by cutting down transmission latency; (3) reducing the workload of cloud servers through distributed computing. First, a collaborative edge caching model is defined in which edge nodes are equipped with limited storage space, mobile users can access edge nodes, and one node can serve multiple users. Second, a non-cooperative game model is designed to study the cooperative behavior between edge nodes: each edge node is treated as a player and can make cache initialization and cache replacement strategies. Third, the Nash equilibrium of the game is identified, and a distributed algorithm is designed to reach the equilibrium. Finally, simulation results show that the proposed edge caching strategy can reduce user latency by 20% and reduce backbone network traffic by 80%.
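
      A generic best-response iteration for a finite non-cooperative game is sketched below; the strategy sets and the payoff function (for example, cache hit benefit minus storage and migration cost) are placeholders for the paper's cache-placement model.

```python
def best_response_dynamics(nodes, payoff, max_rounds=100):
    """Each node in turn switches to the strategy maximizing its own
    payoff given the others' current strategies; a fixed point of this
    process is a Nash equilibrium when one exists."""
    profile = {n: strategies[0] for n, strategies in nodes.items()}
    for _ in range(max_rounds):
        changed = False
        for n, strategies in nodes.items():
            best = max(strategies, key=lambda s: payoff(n, s, profile))
            if best != profile[n]:
                profile[n], changed = best, True
        if not changed:
            break
    return profile

# Toy anti-coordination example: two base stations prefer caching
# different content sets.
nodes = {"bs1": ["cache_A", "cache_B"], "bs2": ["cache_A", "cache_B"]}
payoff = lambda n, s, prof: 1.0 if s != prof["bs2" if n == "bs1" else "bs1"] else 0.3
print(best_response_dynamics(nodes, payoff))
```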

Contact Information
  • Journal of Software
  • Sponsored by: Institute of Software, CAS, China
  • Postal code: 100190
  • Phone: 010-62562563
  • Email: jos@iscas.ac.cn
  • Website: https://www.jos.org.cn
  • ISSN 1000-9825 / CN 11-2560/TP
  • Domestic price: CNY 70