XIE Cheng-Wang , GUO Hua , WEI Wei , JIANG Lei
2023, 34(4):1523-1542. DOI: 10.13328/j.cnki.jos.006702
Abstract:It is difficult to solve many-objective optimization problems (MaOPs) effectively by using the traditional multi-objective evolutionary algorithms (MOEAs) based on Pareto dominance relation. A dominance relation is proposed firstly by combing double distances of PBI utility function without introducing extra parameter. Secondly, a diversity maintenance method based on double distances is also defined, which not only considers the double distances of the individual, but also adaptively adjusts the weight of diversity according to the objective number of MaOP, so as to better balance the convergence and diversity of the solution set in many-objective space. Finally, the proposed dominance relation and diversity maintenance method are embedded into the framework of NSGA-II, and then a many-objective evolutionary algorithm based on double distances (MaOEA/d2) is designed. The MaOEA/d2 is compared with other five representative many-objective evolutionary algorithms on the DTLZ and WFG benchmark functions with 5-,10-,15-, and 20-objective in terms of IGD and HV indicators. The empirical results show that MaOEA/d2can obtain better convergence and diversity. Therefore, the proposed MaOEA/d2is a promising many-objective evolutionary algorithm.
ZHAI Zhi-Nian , LU Ya-Hui , LIU Guan-Jun , LEI Jing-Sheng , XIANG Jian , WU Ming-Wei
2023, 34(4):1543-1569. DOI: 10.13328/j.cnki.jos.006682
Abstract:Workflow satisfiability problem is an elemental issue in the security planning of business process, and it is facing the performance challenge caused by high resource ratio (the number n of resources is significantly greater than the number k of steps). Under resource-independent constraints, its most efficient approach is incremental pattern backtracking (IPB) in the pattern space. To overcome the performance bottleneck of verifying whether a node is authentic, IPB computes the k-assignment (bipartite) graph of a pattern and its (left complete) matching in an incremental manner, which requires O(kn) and O(k2) time respectively. This study computes a full-assignment graph incrementally with only O(n) time by exploiting the atomic difference between a sub-pattern and its super one, and in particular its actual performance will increase rapidly with the size of a block in pattern. However, the size O(kn) of such a graph will result in the same incremental matching time. Further, this study introduces the concept of complete k core matching and shows that its existence is equivalent to a left complete matching and its incremental computation only costs O(k2) time. Therefore, this study proposes an algorithm of minimum incremental pattern backtracking (MIPB) that is superior to IPB in time complexity. Experiments are conducted on an extended public instance set with constraints of two global value-cardinality types and of the mutual exclusion, and with an authorization ratio of about 1/4. The results show that: when k varies at n/k=10 (n/k=100, respectively), MIPB achieves averagely more than 2 (5, respectively) times and at least 1.5 (2.9, respectively) times advantage of performance compared to IPB; when k=18 (k=36, respectively), and n/k belongs to 2~4096 (2~2048, respectively), MIPB achieves averagely more than 2.6 (3.6, respectively) times advantage. While compared to the champion solver OR-Tools CP-SAT in the 2018~2021 Minizinc Challenges, MIPB achieves at least more than 3 times advantage.
TAO Xiao-Han , ZHU Yu , PANG Jian-Min , ZHAO-Jie , XU Jin-Long
2023, 34(4):1570-1593. DOI: 10.13328/j.cnki.jos.006688
Abstract:Heterogeneous architectures are dominating the realm of high-performance computing. However, these architectures also complicate the programming issue due to its increasingly complex hardware and memory hierarchy compared to homogeneous architectures. One of the most promising solutions to this issue is making use of optimizing compilers which can help programmers develop high-performance code executable on target machines, thereby simplifying the difficulty of programming. The polyhedral model is widely studied due to its ability to generate effective code and portability to various targets, which is realized by first converting a program into its intermediate representation and then combining the compositions of loop transformations and hardware binding strategies. This paper presents a source-to-source parallel code generator targeting the domestic, heterogeneous architecture of the Sunway machine using the polyhedral model. In particular, the computation is deployed automatedly onto the Sunway architecture and memory management, minimizing the amount of data movements between the management processing element and computing processing elements of the target. The experiments are conducted on 13 linear algebra applications extracted from the Polybench Benchmarks. The experimental results show that the proposed approach can generate effective code executable on the Sunway heterogeneous architecture, providing a mean speedup of 539.16× on 64 threads over the sequential implementation executed on a management processing element.
WANG Wen-Xiang , GAO Qing , XU Ke , ZHANG Shi-Kun
2023, 34(4):1594-1612. DOI: 10.13328/j.cnki.jos.006691
Abstract:Software crash is a kind of serious software flaw, which can lead to software crashes. Therefore, testing for software crashes is extremely important in the process of software iteration. In recent years, since a large number of test inputs can be automatically generated to trigger software crashes, fuzzing techniques (such as AFL) are widely used in software testing. Nevertheless, most of root causes of crashes that are generated by this technique are same. In this case, software developers have to classify the test inputs one by one, which brings a lot of redundant work. At present, there are many automated methods for testing input classification, mainly including classification algorithms based on program repair and classification algorithms based on software crash information. The former analyzes the program semantics, and re-runs the test input after replacing the repair templates in the program at runtime, and then classifies the inputs. Since this method requires the preparation of repair templates to be completed artificially, the efficiency of its classification is closely related to the quality of the repair templates. At the same time, the repair efficiency of the software has been greatly affected due to the need to repair the crash and classify the crash. Since certain advantages of the latter, this study proposes a lightweight and efficient test inputs classification algorithm, which uses software crash information. Based on the algorithm of software crash point stack information classification, this study introduces dynamic link library information in analyzing CICELY. By distinguishing system dynamic link library from user dynamic link library and combining with location information of user codes, this study gets the set of functions that are focused by programmers to define the crash based on the user function in the classification. In the end, this study also compares CICELY with some existing classification tools based on program repair and software crash information. The experimental test data sets total 19 projects, and 42 test sets. When comparing with other classification tools, Honggfuzz and CERT BFF, whose main classification algorithms are based on software crash information on the same data set, the numbers of classification results of the two are 2112.89% and 135.05% worse than that of CICELY, proving that the experimental effect of CICELY is greatly improved and has higher accuracy compared with similar algorithms. Compared with the classification algorithm "Semantic Crash Bucketing" based on program repair using the test data set provided in their article, CICELY is worse than it by 4.42%. When using the test set consisting of test inputs corresponding to multiple crashes, CICELY got 3% higher repeatability than it. However, Semantic Crash Bucketing can only classify crashes caused by two kinds of crash inputs, null pointer dereference and buffer overflow, while CICELY is not subject to such restrictions.
WANG Ya-Wen , WANG Jun-Jie , SHI Lin , WANG Qing
2023, 34(4):1613-1629. DOI: 10.13328/j.cnki.jos.006697
Abstract:App reviews are considered as a communication channel between users and developers to perceive user’s satisfaction. Users usually describe buggy features (i.e., user actions) and App abnormal behaviors (i.e., abnormal behaviors) in forms of key phrases (e.g., “send a video” and “crash”), which could be buried with other trivial information (e.g., complaints) in the review texts. A fine-grained view about this information could facilitate the developers’ understanding of feature requests or bug reports from users, and improve the quality of Apps. Existing pattern-based approaches to extract target phrases can only summarize the high-level topics/aspects of reviews, and suffer from low performance due to insufficient semantic understanding of reviews. This study proposes a semantic-aware and fine-grained App review bug mining approach (Arab) to extract user actions and abnormal behaviors, and mine the correlations between them. A novel neural network model is designed for extracting fine-grained target phrases, which combines textual descriptions and review attributes to better represent the semantics of reviews. Arab also clusters the extracted phrases based on their semantic relations and provides a visualization of correlations between User Actions and Abnormal Behaviors. 3,426 reviews from six Apps are used to carry out evaluation test, and the results confirm the effectiveness of Arab in phrase extraction. A case study is further conducted with Arab on 301,415 reviews of 15 popular Apps to explore its potential application and examine its usefulness on large-scale data.
ZHONG Yi , SHI Meng-Yu , FANG Chun-Rong , ZHAO Zhi-Hong , CHEN Zhen-Yu
2023, 34(4):1630-1649. DOI: 10.13328/j.cnki.jos.006701
Abstract:Automated testing tools are the primary means of quality assurance for Android applications. With the increase in Android version diversity, underlying hardware variability (fragmentation), and logical complexity, automated testing faces new challenges. Numerous automated testing tools have been developed in recent years to address the above issues. However, there are vast tools with various testing focuses, making it hard for testers to choose the right one. To help testers select the best tool for testing and achieve a unified evaluation for automated testing tools, a multi-characteristic comprehensive evaluation of the Android automated testing (CEAT) method is proposed and an easy-to-use platform is implemented for testers. CEAT introduces three widely accepted evaluation metrics: code coverage, exception detection rate, fusion multi-version compatibility score, and further introduces mutation kill rate based on the mutation testing concept, and UI control widget coverage from the perspective of the user. The five metrics constitute the whole CEAT system, thus realizing a comprehensive multi-dimensional evaluation of Android automated testing tool. To verify the effectiveness of CEAT, a set of 1,089 mutated applications is generated for testing, the experiments are deployed in a real-world cluster containing six mobile devices, and 5,040 test tasks are executed for the testing tools. The results suggest that: (i) the five indicators evaluate the automated testing tools from different perspectives, reflecting the testing performance of different tools in a more multi-dimensional way and validating the effectiveness of CEAT; (ii) CEAT supports testers to assign different weights to the five metrics and obtain comprehensive evaluation results depending on the practical testing requirements, which has certain flexibility; (iii) CEAT automatically reconstructs the APP to obtain mutant APPs and set a specific platform for testing the tool, making it convenient to operate. CEAT effectively provides a reference for testers to select the best Android automated testing tool according to different testing requirements.
YU Pu , SHU Hui , XIONG Xiao-Bing , KANG Fei
2023, 34(4):1650-1665. DOI: 10.13328/j.cnki.jos.006719
Abstract:At present, in the field of code protection technology research, traditional obfuscation methods have obvious obfuscation characteristics, and analysts can perform customized de-obfuscation processing based on these characteristics. For this reason, this study proposes a code protection technology based on code slice fusion. This technology slices the target code into code fragments according to grammatical rules at the source code level, and inserts the fragments into different positions of another program according to the execution order and grammatical rules. After repairing the function call process and the data relationship, the fusion code that can run the two code functions normally is formed. A comparative experiment for the fusion code is carried out from three perspectives, namely, resource overhead, code complexity impact, and code similarity. The test results demonstrate that the implicit code obfuscation technique based on code slice fusion can effectively obfuscate code semantics, change control flow characteristics, and has no obvious obfuscation characteristics. Therefore, fusion technology has obvious advantages in the ability to fight against multiple similarity comparison algorithms.
WANG Hong-Bing , YAN Jia , ZHANG Dan-Dan , LU Rong-Rong
2023, 34(4):1666-1694. DOI: 10.13328/j.cnki.jos.006721
Abstract:As a novel schema of software development, software crowdsourcing has been widely studied by academia and industry. Compared with traditional software development, software crowdsourcing makes the most use of developers all over the world to complete complex development tasks which can effectively reduce costs and improve efficiency. Nevertheless, because there are a large number of complex tasks in the current crowdsourcing platform and inaccurate task matching will affect the progress and quality of task solutions, it is very important to study the matching problem between developers and tasks. Therefore, this study utilizes the dynamic preferences and competitiveness features of developers and proposes a task recommendation model to recommend appropriate software development tasks for developers. First, the attention mechanism based-long short-term memory network is adopted to predict the current preference of a developer to screen out the top-N tasks that conform to the preference from the candidate tasks. On this basis, according to the developer’s competitiveness, differential evolution algorithm based-extreme gradient boosting is used to predict the developer’s scores of top-N tasks, thus further filtering out the top-K tasks with the highest scores to recommend to the developer. Finally, in order to verify the validity of the proposed model, a series of experiments is carried out to compare the existing methods. The experiment results illustrate that the proposed model has significant advantages in task recommendation in software crowdsourcing.
HU Tian-Xiang , XIE Rui , YE Wei , ZHANG Shi-Kun
2023, 34(4):1695-1710. DOI: 10.13328/j.cnki.jos.006723
Abstract:Code summarization generates brief natural language descriptions of source code pieces, which can assist developers to understand code and reduce documentation workload. Recent research efforts on code summarization mainly adopt deep learning models. Most of these models are trained on large datasets, consisting of independent code-summary pairs. Despite the technical advances, most of these works, referred as general code summarization models, ignore the project-level contextual information of code pieces and summaries, which developers would heavily rely on when writing documentation. This study investigates project-specific code summarization, a scenario that is much more consistent with human behavior and tool implementation of code summarization. Specifically, a novel deep learning approach is proposed that leverages highly relevant code pieces and their corresponding summaries to characterize contextual semantics, and integrates common knowledge learned from large-scale cross-project dataset via transfer learning. The dataset is created and released for project-specific code summarization, consisting of 800k method-summary pairs along with their lifecycle information for re-producing accurate code context. Experimental results on this dataset demonstrate that the proposed technique can not only gain huge improvement over general code summarization model, but also generates more consistent summaries within a project.
CONG Run-Min , ZHANG Chen , XU Mai , LIU Hong-Yu , ZHAO Yao
2023, 34(4):1711-1731. DOI: 10.13328/j.cnki.jos.006700
Abstract:Inspired by the human visual attention mechanism, salient object detection (SOD) aims to detect the most attractive and interesting object or region in a given scene. In recent years, with the development and popularization of depth cameras, depth map has been successfully applied to various computer vision tasks, which also provides new ideas for the salient object detection task at the same time. The introduction of depth map not only enables the computer to simulate the human visual system more comprehensively, but also provides new solutions for the detection of some difficult scenes, such as low contrast and complex backgrounds by utilizing the structure information and location information of the depth map. In view of the rapid development of RGB-D SOD task in the era of deep learning, this studyaims to sort out and summarize the existing related research outputs from the perspective of key scientific problem solutions, and conduct the quantitative analysis and qualitative comparison of different methods on the commonly used RGB-D SOD datasets. Finally, the challenges and prospects are summarized for the future development trends.
OUYANG Xiao , TAO Hong , FAN Rui-Dong , JIAO Yuan-Yuan , HOU Chen-Ping
2023, 34(4):1732-1748. DOI: 10.13328/j.cnki.jos.006703
Abstract:Multi-label learning is a very important machine learning paradigm. Traditional multi-label learning methods are designed in supervised or semi-supervised manner. Generally, they require accurate labeling of all or partial data into multiple categories. In many practical applications, it is difficult to obtain the label information with a large number of labels, which greatly restricts the promotion and application of multi-label learning. In contrast, label correlation, as a common weak supervision information, has lower requirements for labeling information. How to use label correlation for multi-label learning is an important but unstudied problem. This study proposes a method named weakly supervised multi-label learning using prior label correlation information (WSMLLC). This model restates the sample similarity by using label correlation, and can obtain label indicator matrix effectively, constrain the projection matrix of data by using prior information, and modify the indicator matrix by introducing regression terms. Compared with the existing methods, the outstanding advantage of WSMLLC model is that it can realize the label assignment of multi-label samples only by providing label correlation priors. Experimental results show that WSMLLC has obvious advantages over current advanced multi-label learning methods in the case of complete loss of label matrix.
YU Chao , DONG Yin-Zhao , GUO Xian , FENG Yang-He , ZHUO Han-Kui , ZHANG Qiang
2023, 34(4):1749-1764. DOI: 10.13328/j.cnki.jos.006708
Abstract:This study proposes structure-motivated interactive deep reinforcement learning (SMILE) method to solve the problems of low training efficiency and inexplicable strategy of deep reinforcement learning (DRL) in high-dimensional robot behavior control. First, the high-dimensional single robot control problem is transformed into a low-dimensional multi-controllers control problem according to some structural decomposition schemes, so as to solve the curse of dimensionality in continuous control. In addition, SMILE dynamically outputs the dependency among the controllers through two coordination graph (CG) models, ATTENTION and PODT, in order to realize the information exchange and coordinated learning among the internal joints of the robot. In order to balance the computational complexity and information redundancy of the above two CG models, two different models, APODT and PATTENTION, are then proposed to update the CG, which can realize the dynamic adaptation between the short-term dependency and long-term dependency among the controllers. The experimental results show that this kind of structurally decomposed learning can improve the learning efficiency substantially, and more explicit interpretations of the final learned policy can be achieved through the relational inference and coordinated learning among the components of a robot.
SUN Hai-Feng , MU Zheng-Yang , QI Qi , WANG Jing-Yu , LIU Cong , LIAO Jian-Xin
2023, 34(4):1765-1778. DOI: 10.13328/j.cnki.jos.006399
Abstract:Dense depth map is essential in areas such as autonomous driving and robotics, but today’s depth sensors can only produce sparse depth measurements. Therefore, it is necessary to complete it. In all auxiliary modalities, RGB images are commonly used and easily obtained. Many current methods use RGB and sparse depth information in depth completion. However, most of them simply use channel concatenation or element-wise addition to fuse the information of the two modalities, without considering the confidence of each modalities in different scenarios. This study proposes a dynamic gated fusion module, which is guided by the sparse distribution of input sparse depth and information of both RGB and sparse depth feature, thus fusing two modal features more efficiently by generating dynamic weights. And designed an efficient feature extraction structure according to the data characteristics of different modalities. Comprehensive experiments show the effectiveness of each model. And the network proposed in this paper uses lightweight model to achieve advanced results on two challenging public data sets KITTI depth completion and NYU depth v2. Which shows our method has a good balance of performance and speed.
WANG Ji-Kui , YANG Zheng-Guo , LIU Xue-Wen , YI Ji-Hai , LI Bing , NIE Fei-Ping
2023, 34(4):1779-1795. DOI: 10.13328/j.cnki.jos.006400
Abstract:High-dimensional data is widely adopted in the real world. However, there is usually plenty of redundant and noisy information existing in high-dimensional data, which accounts for the poor performance of many traditional clustering algorithms when clustering high-dimensional data. In practice, it is found that the cluster structure of high-dimensional data is often embedded in the lower dimensional subspace. Therefore, dimension reduction becomes the key technology of mining high-dimensional data. Among many dimension reduction methods, graph-based method becomes a research hotspot. However, most graph-based dimension reduction algorithms suffer from the following two problems: (1) most of the graph-based dimension reduction algorithms need to calculate or learn adjacency graphs, which have high computational complexity; (2) the purpose of dimension reduction is not considered in the process of dimension reduction. To address the problem, a fast unsupervised dimension reduction algorithm is proposed based on the maximum entropy-MEDR, which combines linear projection and the maximum entropy clustering model to find the potential optimal cluster structure of high-dimensional data embedded in low-dimensional subspace through an effective iterative optimization algorithm. The MEDR algorithm does not need the adjacency graph as an input in advance, and has linear time complexity of input data scale. A large number of experimental results on real datasets show that the MEDR algorithm can find a better projection matrix to project high-dimensional data into low-dimensional subspace compared with the traditional dimensionality reduction method, so that the projected data is conducive to clustering analysis.
HUANG Yu-Xin , YU Zheng-Tao , GUO Jun-Jun , YU Zhi-Qiang , GAO Fan-Ya
2023, 34(4):1796-1810. DOI: 10.13328/j.cnki.jos.006406
Abstract:Generating coherent topic descriptions from the user comments of case-related topics plays a significant role in quickly understanding the case-related news, which can be regarded as a multi-document summarization task based on user comments. However, these comments contain lots of noise, the crucial information for generating summaries is scattered in different comments, the sequence-to-sequence model tends to generate irrelevant and incorrect summaries. Based on these observations, this study presents a case-related topic summarization method based on the topic interaction graph, which reconstructs the user comments into a topic interaction graph. The motivation is that the graph can express the correlation between different user comments, which is useful to filter the key information in user comments. Specifically, the case elements are first extracted from the user comments, and then the topic interaction graph is constructed, which takes the case elements as the nodes and uses the sentences including these case elements as the node’s contents; then the graph transformer network is introduced to produce the representation of the graph. Finally, the summary is generated by using a standard transformer-based decoder. The experimental results on the collected case-related topic summarization corpus show that the proposed method effectively selects useful content and can generate coherent and factual topic summaries.
XIANG Yan , YU Zheng-Tao , GUO Jun-Jun , HUANG Yu-Xin , XIAN Yan-Tuan
2023, 34(4):1811-1823. DOI: 10.13328/j.cnki.jos.006408
Abstract:The identification of opinion targets in microblog is the basis of analyzing network public opinion involved in cases. At present, the identification method of opinion targets based on topic representation needs to preset a fixed number of topics, and the final results rely on artificial inference. In order to solve these problems, this study proposes a weak supervision method, which only uses a small number of labelled comments to automatically identify the opinion targets in microblog. The specific implementation is as follows. Firstly, the comments are encoded and reconstructed twice based on the variational dual topic representation network to obtain rich topic features. Secondly, a small number of labelled comments are used to guide the topic representation network to automatically identify the opinion targets. Finally, the reconstruction loss of double topic representation and the classification loss of opinion targets identification are optimized together by the joint training strategy, to classify comments of opinion targets automatically and mine target terms. Experiments are carried out on two data sets of microblogs involved in cases. The results show that the proposed model outperforms several baseline models in the classification of opinion targets, topic coherence, and diversity of target terms.
LIU Quan , YU Zheng-Tao , HE Shi-Zhu , LIU Kang , GAO Sheng-Xiang
2023, 34(4):1824-1836. DOI: 10.13328/j.cnki.jos.006412
Abstract:Question matching is an important task of question answering systems. Current methods usually use neural networks to model the semantic matching degree of two sentences. However, in the field of law, questions often have some problems, such as sparse textual representation, professional legal words, and insufficient legal knowledge contained in sentences. Therefore, the general domain deep learning text matching model is not effective in the legal question matching task. In order to make the model better understand the meaning of legal questions and model the knowledge of the legal field, this study firstly constructs a knowledge base of the legal field, and then proposes a question matching model integrating the knowledge of the legal field (such as legal words and statutes). Specifically, a legal dictionary under five categories of legal disputes has been constructed, including contract dispute, divorce, traffic accident, labor injury, debt and creditor’s right, and relevant legal articles have been collected to build a knowledge base in the legal field. In question matching, the legal knowledge base is first searched for the legal words and statutes corresponding to the question pair, and then the relationship among the question, legal words, and statutes is modeled simultaneously through the cross attention model. Finally, to achieve more accurate question matching, experiments under multiple legal categories were carried out, and the results show that the proposed method in this study can effectively improve the performance of question matching.
2023, 34(4):1837-1849. DOI: 10.13328/j.cnki.jos.006413
Abstract:Speech translation aims to translate the speech in one language into the speech or text in another language. Compared with the pipeline system, the end-to-end speech translation model has the advantages of low latency, less error propagation, and small storage, so it has attracted much attention. However, the end-to-end model not only requires to process the long speech sequence and extract the acoustic information, but also needs to learn the alignment relationship between the source speech and the target text, leading to modeling difficulty with poor performance. This study proposes an end-to-end speech translation model with cross-modal information fusion, which deeply combines text-based machine translation model with speech translation model. For the length inconsistency between the speech and the text, a redundancy filter is proposed to remove the redundant acoustic information, making the length of filtered acoustic representation consistent with the corresponding text. For learning the alignment relationship, the parameter sharing method is applied to embed the whole machine translation model into the speech translation model with multi-task training. Experimental results on public speech translation data sets show that the proposed method can significantly improve the model performance.
TAO Xin-Min , GUO Wen-Jie , LI Xiang-Ke , CHEN Wei , WU Yong-Kang
2023, 34(4):1850-1869. DOI: 10.13328/j.cnki.jos.006432
Abstract:In order to solve the dilemma that particle swarm optimization (PSO) cannot well balance the exploration and exploitation, a density peak based multi subpopulation particle swarm optimization algorithm is proposed with dimensionally reset strategy (DPMPSO). In the proposed DPMPSO, the idea of relative distance originated from density peak clustering is firstly adopted and then it is combined with the fitness value of particles to divide the whole swarm into two subpopulations: the top subpopulation and the bottom subpopulation. Secondly, the learning strategy is designed, focusing on local search for the top subpopulation and the learning strategy paying more attention to global search for the bottom subpopulation, which can well balance the exploration and exploitation. Finally, particles that fall into local optima will be reset by crossover with the global optima dimensionally, which can not only effectively avoid premature but also significantly reduce invalid iteration. The experiment results on 10 benchmark problems and CEC2017 optimization problems demonstrate that DPMPSO performs better than some representative PSOs and other optimization algorithms with significant difference.
GENG Chuan-Xing , TAN Zheng-Hao , CHEN Song-Can
2023, 34(4):1870-1878. DOI: 10.13328/j.cnki.jos.006433
Abstract:With the free supervised signals/labels created by pretext tasks, self-supervised learning (SSL) can learn effective representation from unlabeled data, which has been verified in various downstream tasks. Existing pretext tasks usually first perform explicit linear or nonlinear transformations on the original view data, thus forming multiple augmented view data, then learn the representation by predicting the corresponding transformations or maximizing the consistency among the above views. It is found that such self-supervised augmentations (i.e., the augmentations of the data itself and self-supervised labels) benefit the learning of not only the unsupervised pretext tasks but also the supervised classification task. Nevertheless, few work focus on this at present, while existing works either take the pretexts as the auxiliary of downstream classification task and adopt the multi-task learning or jointly model the downstream task labels and self-supervised labels in a multi-label learning way. Actually, there are inherent differences between downstream and pretext tasks (e.g., semantic, task difficulty, etc.), which inevitably result in the competitions between them and bring risks to the learning of downstream tasks. To challenge this issue, this study proposes a simple yet effective SSL multi-view learning framework (SSL-MV), which avoids the learning interference of self-supervised labels on downstream labels through performing the same learning as downstream tasks on the augmented data views. More interestingly, with the multi-view learning, the proposed framework naturally owns the integration inference ability, which significantly improves the performance of downstream supervised classification tasks. Extensive experiments on benchmark datasets demonstrate the effectiveness of SSL-MV.
GAO Ying , LI Han-Yu , WANG Wei , LIU Xiang , CHEN Jie
2023, 34(4):1879-1906. DOI: 10.13328/j.cnki.jos.006692
Abstract:With the rapid development of the Internet and the penetration of big data mining and applications into all walks of life, how to share and use massive data securely and efficiently has become a new hot research issue. Secure multi-party computation is one of the key technologies to solve this problem. It allows a group of participants to interact compute a function together, and get the output without revealing private inputs. Oblivious transfer is a privacy-protected two-party communication protocol in which a sender holds two messages to be sent, and a receiver selects one to receive, but after that, the sender knows nothing about which message the receiver gets, and the receiver cannot get any information about the unselected message. Oblivious transfer has become one of the key modules of secure multi-party computation, and its efficiency optimization can effectively promote the application of secure multi-party computation, especially for special two-party secure computation protocols such as private set intersection. This paper summarizes the classification of oblivious transfer and several common variants, and respectively describes the construction and research progress of the oblivious transfer protocol based on public key cryptography and oblivious transfer extension, which leads to the importance of the efficiency optimization research of oblivious transfer. At the same time, this paper comprehensively reviews the research progress of efficiency optimization of oblivious transfer and oblivious transfer extension from the perspectives of semi-honest adversary and malicious adversary. On the other hand, in practical application, this paper systematically summarizes the optimization technologies used in the engineering implementation of oblivious transfer and oblivious transfer extension protocols. Finally, this paper points out the main problems and future works of oblivious transfer and oblivious transfer extension protocols.
WANG Jing-Wei , NING Jian-Ting , XU Sheng-Min , YIN Xin-Chun , CHEN Hai-Xia
2023, 34(4):1907-1925. DOI: 10.13328/j.cnki.jos.006698
Abstract:To improve the performance of user revocation and ciphertext update, a searchable attribute-based encryption scheme for dynamic user groups is proposed. A binary tree is applied to manage the revocation list. The user revocation will be achieved by adding revoked users to the revocation list and informing the cloud server to update the ciphertexts. To relieve the limitation of the number of system users, the nodes of the user binary tree will be re-used by new users if the random values in the nodes could be updated when user revocation occurs. Besides, a ciphertext search function, based on bilinear pairing, is provided and all revoked users are not allowed to perform the search algorithm. The security analysis proves that the proposed scheme is IND-CPA secure under the random oracle model. The performance analysis shows that the proposed scheme outperforms other existing solutions in terms of computational overhead.
TIAN Zhi-Cheng , ZHANG Wei-Zhe , QIAO Yan-Chen , LIU Yang
2023, 34(4):1926-1943. DOI: 10.13328/j.cnki.jos.006722
Abstract:Deep learning has been used in the field of malware detection and achieved great results. However, recent research shows that deep learning models are not safe, and they are vulnerable to adversarial attacks. Attackers can make malware detectors give wrong output by making a few modifications to the malware without changing the original function, resulting in the omission of malware. To defend adversarial examples, the most commonly used method in previous work is adversarial training. Adversarial training requires generating a large number of adversarial examples to retrain the model, which is inefficient. Besides, the defense effect is limited by the adversarial example generation method used in training. As such, a new method is proposed to detect adversarial malware in PE format, aiming at the type of adversarial attacks that add modification to the function independent area of PE file. By using model interpretation techniques, the decision-making basis of the end-to-end malware detection model can be analyzed and the features of adversarial examples are extracted. Anomaly detection techniques are further used to identify adversarial examples. As an add-on module of the malware detection model, the proposed method does not require modifying the original model and does not need to retrain the model. Compared with other defense methods such as adversarial training, this method is more efficient and has better generalization ability which means it can defend against a variety of adversarial attack methods The proposed method is evaluated on a real-world dataset of malware. Promising results show that the method can effectively defend the adversarial attacks against the end-to-end PE format malware detection model.
GUO Jie , PAN Jin-Gui , GUO Yan-Wen
2023, 34(4):1944-1961. DOI: 10.13328/j.cnki.jos.006725
Abstract:Participating media are ubiquitous in nature and are also major elements in many rendering applications such as special effects, digital games, and simulation systems. Physically-based simulation and reproduction of their appearance can significantly boost the realism and immersion of 3D virtual scenes. However, both the underlying structures of participating media and the light propagation in them are very complex. Therefore, rendering with participating media is a difficult task and hot topic in computer graphics so far. In order to facilitate the treatment in rendering and lower the computational cost, classical methods for participating media rendering are always based on two assumptions: independent scattering and local continuity. These two assumptions are also the building blocks of classical radiative transfer equation (RTE). However, most participating media in nature do not satisfy these two assumptions. This results in the noticeable discrepancy between rendered images and real images. In recent years, these two assumptions have been relaxed by incorporating more physically accurate methods to model participating media, thus significantly improving the physical realism of participating media rendering. This survey analyzes and discusses existing non-classical participating media rendering techniques from two aspects: correlated media rendering and discrete media rendering. The differences between classical and non-classical participating media rendering are discussed. The principles, advantages, and limitations behind each method are also described. Finally, some future directions are provided around non-classical participating media rendering that are worth delving into. It is hoped that this survey can inspire researchers to study non-classical participating media rendering by addressing some critical issues. It is also hoped that this survey can be a guidance for engineers from industry to improve their renderers by considering non-classical participating media rendering.
WANG Chang-Shuo , WANG Han , NING Xin , TIAN Sheng-Wei , LI Wei-Jun
2023, 34(4):1962-1976. DOI: 10.13328/j.cnki.jos.006683
Abstract:The ability to describe local geometric shapes is very important for the representation of irregular point cloud. However, the existing network is still difficult to effectively capture accurate local shape information. This study simulates depthwise separable convolution calculation method in the point cloud and proposes a new type of convolution, namely dynamic cover convolution (DC-Conv), to aggregate local features. The core of DC-Conv is the space cover operator (SCOP), which constructs anisotropic spatial geometry in a local area to cover the local feature space to enhance the compactness of local features. DC-Conv achieves the capture of local shapes by dynamically combining multiple SCOPs in the local neighborhood. Among them, the attention coefficients of the SCOPs are adaptively learned from the point position in a data-driven way. Experiments on the 3D point cloud shape recognition benchmark dataset ModelNet40, ModelNet10, and ScanObjectNN show that this method can effectively improve the performance of 3D point cloud shape recognition and robustness to sparse point clouds even in the case of a single scale. Finally, sufficient ablation experiments are also provided to verify the effectiveness of the method. The open-source code is published at https://github.com/changshuowang/DC-CNN.
JIANG Xiao-Bin , XIONG Yi-Xiang , ZHANG Heng , WU Yan-Jun , ZHAO Chen
2023, 34(4):1977-1996. DOI: 10.13328/j.cnki.jos.006732
Abstract:Recently, with the increasing trend of data scale expansion and structure diversification, how to use the heterogeneous multi accelerators in modern link to provide a real-time and reliable parallel runtime environment for large-scale data processing has become a research hotspot in the field of high performance and database. Modern servers equipped with multi accelerators (GPU) has become the preferred high-performance platform for analyzing large-scale and irregular graph data. The overall performance of existing research designing graph computing systems and algorithms based on multi-GPU server architecture (such as breadth first traversal and shortest path algorithm) has been significantly better than that of multi-core CPU computing environment. However, the data transmission performance between multi-GPU of existing graph computing system is limited by PCI-E bandwidth and local delay, leading to being unable to achieve a linear growth trend of performance by increasing the number of GPU devices, and even serious delay jitter which cannot satisfy the high scalability requirements of large-scale graph parallel computing systems. After a series of benchmark experiments, it is found that the existing system has the following two types of defects. (1) The hardware architecture of the data link between modern GPU devices is rapidly updated (such as NVLink-V1 and NVLink-V2), and its link bandwidth and delay have been greatly improved. However, the existing systems are still limited by PCI-E for data communication, and cannot make full use of modern GPU link resources (including link topology, connectivity, and routing); (2) When dealing with irregular graph data, such systems often adopt single data movement strategy between devices, bringing a lot of unnecessary data synchronization overhead between GPU devices via PCI-E bus, resulting in excessive time-wait overhead of local computing. Therefore, it is urgent to make full use of various communication links between modern multi-GPU to design a highly scalable graph computing system. In order to achieve the high scalability of the multi-GPU graph computing system, a fine-grained communication based on hybrid perception is proposed to enhance the scalability of the multi-GPU graph computing system. It pre-awares the architecture link, uses the modular data link and communication strategy for different graph structured data, and finally selects the optimal data exchange method for large-scale graph data (structural data and application data). Based on above optimization strategies, this study proposes and designs a graph oriented parallel computing system via multi-GPU named ChattyGraph. By optimizing data buffer and multi-GPU collaborative computing with OpenMP and NCCL, ChattyGraph can adaptively and efficiently support various graph parallel computing applications and algorithms on multi-GPU HPC platform. Several experiments of various real-world graph data on 8-GPU NVIDIA DGX server show that ChattyGraph significantly improves graph computing efficiency and scalability, and outperforms other advanced competitors. The average computing efficiency is increased by 1.2×-1.5× and the average acceleration ratio is increased by 2×-3×, including WS-VR and Groute.