• Volume 33, Issue 1, 2022 Table of Contents
    • Review Articles
    • Selection of Open Source License: Challenges and Influencing Factors

      2022, 33(1):1-25. DOI: 10.13328/j.cnki.jos.006279

      Abstract: Developers select open source licenses to set the conditions under which their software may be used, in order to protect intellectual property rights and sustain the long-term development of the software. However, because the open source community offers a wide variety of licenses, developers generally find it difficult to understand the differences among them. Existing license selection tools also require developers to understand the license terms and to identify their own business needs, which makes it harder to choose correctly. Although open source licenses have been studied extensively, there is still no systematic analysis of the actual difficulties developers face when choosing one. This study therefore examines the difficulties faced by open source developers in choosing licenses, analyzes the components of open source licenses and the factors influencing license selection, and provides a reference for developers choosing a license. A questionnaire survey was conducted on a random sample of 200 developers who participated in open source projects on GitHub. A thematic synthesis of the 53 responses shows that developers often face difficulties in license selection in terms of the complexity of license terms and unknown considerations. By analyzing the ten open source licenses most widely used in 3,346,168 repositories on GitHub, this study establishes a framework of open source licenses with 10 dimensions. Drawing on the Theory of Planned Behavior, nine factors that affect license selection are put forward under three aspects: behavioral attitude, subjective norm, and perceived behavioral control. The relevance of these factors is verified through the developer survey. Furthermore, the relationship between project characteristics and license selection is verified by fitting an ordinal regression model. The results can deepen developers' understanding of the contents of open source licenses, provide decision support for selecting appropriate licenses based on developers' own needs, and serve as a reference for implementing license selection tools based on those needs.

    • Self-admitted Technical Debt Research: Problem, Progress, and Challenges

      2022, 33(1):26-54. DOI: 10.13328/j.cnki.jos.006292

      Abstract: Technical debt is a metaphor for sacrificing long-term code quality to satisfy short-term goals. In particular, technical debt introduced intentionally by developers is called self-admitted technical debt (SATD), which usually exists in software projects in the form of code comments. SATD poses great challenges to software quality and robustness. To help find and repay SATD as early as possible and thereby assure software quality, great progress has been made in recent years in investigating the characteristics of SATD and in proposing models to identify it. Nevertheless, applying these results in practice remains challenging. This paper offers a systematic survey of recent research achievements in SATD. First, the research problems in this field are introduced. Then, the main current research work is described in detail. After that, related techniques are discussed. Finally, the opportunities and challenges in this field are summarized and future research directions are outlined.

    • Survey on Neural Network-based Automatic Source Code Summarization Technologies

      2022, 33(1):55-77. DOI: 10.13328/j.cnki.jos.006337

      Abstract: Source code summaries help software developers comprehend programs faster and better, and help maintenance developers accomplish their tasks efficiently. Since writing summaries manually is costly and inefficient, researchers have tried to summarize source code automatically. In recent years, neural network-based techniques have become the mainstream approach to automatic source code summarization and a hot research topic in intelligent software engineering. First, this paper describes the concept of source code summarization and the definition of automatic source code summarization, presents its development history, and reviews the methods and metrics for evaluating the quality of generated summaries. Then, it analyzes the general framework and main challenges of neural network-based automatic code summarization algorithms. In addition, it classifies representative algorithms and discusses the design principles, characteristics, and limitations of each category. Finally, it discusses future trends in neural network-based source code summarization techniques.

    • Context-sensitive Pointer Analysis for Object-oriented Programs: A Systematic Literature Review

      2022, 33(1):78-101. DOI: 10.13328/j.cnki.jos.006345

      Abstract: Pointer analysis is the basis of compiler optimization and static analysis, and many applications build on it. Low-precision pointer analysis gives these applications high false positive and false negative rates, and adding context-sensitive information is an important means of improving accuracy. Object-oriented programming has been widely used since the concept was put forward, and mainstream languages and platforms such as Java, C++, .NET, and C# support object-oriented features. Pointer analysis for object-oriented languages is therefore receiving more and more attention. This study investigates context-sensitive pointer analysis for object-oriented languages using the systematic literature review (SLR) method. After the relevant literature is analyzed and categorized, five questions about context-sensitive pointer analysis for object-oriented languages are summarized.

    • Research on Text Representation in Natural Language Processing

      2022, 33(1):102-128. DOI: 10.13328/j.cnki.jos.006304

      Abstract: Natural language processing is a core technology of artificial intelligence. Text representation is a basic and necessary part of natural language processing, and it affects or even determines the quality and performance of natural language processing systems. This study discusses the basic principles of text representation, the formalization of natural language, language models, and the connotation and extension of text representation. The technical classification of text representation is analyzed at a macro level. The mainstream text representation technologies and methods are analyzed and summarized, including the vector space model, topic models, graph-based models, neural network-based models, and representation learning. Event-based, semantic-based, and knowledge-based text representation technologies are also introduced. The development trends and directions of text representation technology are predicted and discussed. Neural network-based deep learning and representation learning on text will play an important role in natural language processing, and the strategy of pre-training followed by fine-tuning will gradually become the mainstream technology. Text representation requires analysis specific to the problem at hand, and the integration of technology and application is the driving force.

    • Recent Advances in Neural Architecture Search: A Survey

      2022, 33(1):129-149. DOI: 10.13328/j.cnki.jos.006306

      Abstract: In recent years, deep neural networks (DNNs) have achieved outstanding performance on many AI tasks, such as computer vision (CV) and natural language processing (NLP). However, network design relies heavily on expert knowledge, which makes it time-consuming and error-prone. As a result, neural architecture search (NAS), one of the important sub-fields of automated machine learning (AutoML), has attracted more and more attention; it aims to automatically design deep neural networks with superior performance. In this study, the development of NAS is reviewed in detail and systematically summarized. First, the overall research framework of NAS is given, and the function of each research component is analyzed. Next, according to the development stages of the NAS field, the existing methods are divided into four aspects, and the characteristics of each stage are introduced in detail. Then, the datasets often used to verify the effect of NAS methods at this stage are introduced, and normalized evaluation criteria for the NAS field are summarized, so as to ensure the fairness of experimental comparison and promote the long-term development of the field. Finally, the challenges of NAS research are proposed and discussed.

    • Survey on Large-scale Graph Neural Network Systems

      2022, 33(1):150-170. DOI: 10.13328/j.cnki.jos.006311

      Abstract: Graph neural networks (GNNs) process graph-structured data using deep learning techniques. They combine graph propagation operations with deep learning algorithms to make full use of graph structure information and vertex features in the learning process. GNNs have been widely used in a range of applications, such as node classification, graph classification, and link prediction, showing promising effectiveness and interpretability. However, existing deep learning frameworks (such as TensorFlow and PyTorch) do not provide efficient storage support and message passing support for GNN training, which limits their use on large-scale graph data. At present, a number of large-scale GNN systems have been designed by considering the data characteristics of graph structure and the computational characteristics of GNNs. This study first briefly reviews GNNs and summarizes the challenges faced in designing GNN systems. Then, the existing work on GNN training systems is reviewed, and these systems are analyzed from multiple aspects such as system architecture, programming model, message passing optimization, graph partitioning strategy, and communication optimization. Finally, several open source GNN systems are chosen for experimental evaluation to compare them in terms of accuracy, efficiency, and scalability.

    • Survey on Graph Classification

      2022, 33(1):171-192. DOI: 10.13328/j.cnki.jos.006323

      Abstract: Graph data, which exist widely in the real world, naturally represent complex interactions between the elements of composite objects. The classification of graph data is a very important and extremely challenging research topic, with many key applications in bio/chemical informatics, such as molecular attribute classification and drug discovery. However, there is still no comprehensive review of research on graph classification. This survey first formulates the problem of graph classification and describes its main challenges; it then categorizes graph classification methods into similarity-based methods and graph neural network-based methods. Moreover, evaluation metrics for graph classification, benchmark datasets, and comparison results are given. Finally, the application scenarios of graph classification are summarized, and research trends are discussed.

    • Survey on Deep Learning Image Recognition in Dilemma of Small Samples

      2022, 33(1):193-210. DOI: 10.13328/j.cnki.jos.006342

      Abstract: Current machine learning methods have surpassed human performance in image recognition and other tasks. However, recent machine learning methods, especially deep learning methods, rely heavily on large amounts of annotated data that human cognition often does not need. This weakness greatly limits the application of deep learning to practical problems. To address it, learning from only a few examples (few-shot learning) is attracting growing research interest. To better understand the few-shot learning problem, this study extensively discusses several popular few-shot learning methods, including data augmentation methods, transfer learning methods, and meta-learning methods. By discussing the processes and core ingredients of the different algorithms, it makes clear the advantages and disadvantages of existing methods in solving few-shot learning problems. The paper closes by highlighting future research directions in few-shot learning.

    • Survey on Domain Name System Measurement Research

      2022, 33(1):211-232. DOI: 10.13328/j.cnki.jos.006218

      Abstract: Domain name system (DNS) measurement research is an important way to understand DNS. This paper reviews DNS measurement work between 1992 and 2019 on 18 topics under four aspects: components, structure, traffic, and security. First, in the aspect of components, the four resolver-related topics are public resolvers, open resolvers, resolver caching, and resolver selection policy; the four authoritative-server-related topics are performance, anycast deployment, hosting, and misconfigurations. Second, in the aspect of structure, there are three topics: the dependency structure between stub resolvers and resolvers, the dependency structure of resolvers, and the dependency structure of domain name resolution. Third, in the aspect of traffic, there are three topics: query traffic characteristics, abnormal root query traffic, and traffic interception. Fourth, in the aspect of security, there are four topics: DNSSEC cost and risk, DNSSEC deployment, DNS encryption deployment, and malicious domain name detection. Finally, future research topics are discussed.

    • Survey on Offchain Channel Routing Algorithm

      2022, 33(1):233-253. DOI: 10.13328/j.cnki.jos.006219

      Abstract: An offchain channel network (OCN) can effectively improve the performance of a blockchain system. The routing algorithm is the key component that allows an OCN to operate efficiently and stably over the long term. This study presents the OCN architecture and the basic model of offchain channel routing algorithms. Typical routing algorithms are systematically reviewed and discussed from the perspectives of single-path routing and multi-path routing. An evaluation system for offchain channel routing algorithms is also established, covering effectiveness, concurrency, scalability, channel balance, routing centralization, cost-effectiveness, privacy protection, goodput, latency, success rate, and efficiency. Finally, these algorithms are compared, and challenging research issues and technology trends of offchain routing algorithms are discussed.

    • Research Progress of Network Protocol Reverse Engineering Technologies Based on Network Trace

      2022, 33(1):254-273. DOI: 10.13328/j.cnki.jos.006303

      Abstract: Protocol reverse engineering is widely used in intrusion detection systems, deep packet inspection, fuzz testing, C&C malware detection, and other fields. First, the formal definition and basic principle of protocol reverse engineering are given. Then, the existing protocol reverse engineering methods based on network traces are analyzed in detail from two aspects: protocol format extraction and protocol state machine inference. In addition, the basic modules, main principles, and characteristics of these algorithms are explained. Finally, the existing algorithms are compared from several aspects, and the development trend of protocol reverse engineering technology is discussed.

    • State-of-the-art Survey on Network Behavior Emulation

      2022, 33(1):274-296. DOI: 10.13328/j.cnki.jos.006338

      Abstract: Network behavior describes the interaction among different kinds of network elements. It is based on different kinds of network service protocols and applications, is evolving and diverse, and reflects the attributes of network scenarios over the network topology during certain periods. Network behavior emulation comprises a runtime framework, background traffic emulation, and foreground traffic emulation, which together project network behaviors from the production network environment into the test cyber environment and provide an on-demand mirroring capability with flexible design specifications. The application scenarios of network behavior emulation continue to evolve and include performance analysis and evaluation, product and technique evaluation, network intrusion detection, and the research and development of network attack and defense techniques. To summarize existing research results and limitations and to analyze future development trends, this study categorizes the relevant definitions and research frameworks for simulating network behaviors, summarizes state-of-the-art research progress in terms of the framework, background traffic, and foreground traffic, and systematically surveys both commercial and open source software tools. Finally, this study proposes future research topics on network behavior simulation.

    • Survey on RFID-based Battery-less Sensing

      2022, 33(1):297-323. DOI: 10.13328/j.cnki.jos.006344

      Abstract: With the rapid development and deployment of Internet of Things (IoT) technology, the demands of IoT applications have shifted from connecting ubiquitous passive objects to the fusion of "human-computer-objects". As one of the key technologies in IoT, radio frequency identification (RFID) has become an important medium for battery-less sensing, because RFID tags are lightweight, label-like, and easy to deploy. To clarify the research progress and methods, this study focuses on battery-less sensing research based on RFID technology. Following the workflow of sensing research, this paper describes and analyzes the research work in four aspects: signal sources, sensing modes, sensing targets, and application scenarios. It introduces the research progress in RFID-based sensing from these four aspects and discusses the advantages and disadvantages of the different technologies within each. Finally, the existing research is summarized and promising directions for future research are presented.

    • Overview on Typical Security Problems in Public Blockchain Applications

      2022, 33(1):324-355. DOI: 10.13328/j.cnki.jos.006280

      Abstract: Having originated as an Internet financial technology, blockchain is now prevalent in many application scenarios and is attracting attention from both academia and industry. Typical blockchain systems are characterized by decentralization, trustworthiness, openness, autonomy, anonymity, and immutability, which bring trustworthiness to data management and value exchange in distributed computing environments without a centralized trust authority. However, blockchain is still a continuously evolving new technique. Its mechanisms, peripheral facilities, and users' security maturity are yet to be optimized, resulting in various security threats and frequent security incidents. This paper first gives an overview of blockchain technology and its potential security vulnerabilities when used for token transaction and exchange. Then, the most common security problems are enumerated and analyzed, with Bitcoin and Ethereum as two sample systems. The security problems encountered by blockchain peripheral facilities and users are presented, and their root causes are probed. Finally, the surveyed problems are categorized, and possible countermeasures or defenses are proposed to address them. Promising research areas and directions of technology evolution are briefly covered.

    • State-of-the-art Survey on Photorealistic Rendering of 3D Scenes Based on Machine Learning

      2022, 33(1):356-376. DOI: 10.13328/j.cnki.jos.006334

      Abstract: The demand for photorealistic rendering in the movie, anime, game, and other industries is increasing, and highly realistic rendering of 3D scenes usually requires a great deal of computation time and storage to calculate global illumination. How to ensure rendering quality while improving rendering speed remains one of the core and hot issues in the field of graphics. Data-driven machine learning methods have opened up a new approach. In recent years, researchers have mapped a variety of highly realistic rendering methods to machine learning problems, thereby greatly reducing the computational cost. This article summarizes and analyzes recent research progress on highly realistic rendering methods based on machine learning, including global illumination optimization methods based on machine learning, physical material modeling methods based on deep learning, optimization of participating media rendering based on deep learning, and Monte Carlo denoising methods based on machine learning. It discusses in detail how the various rendering methods are mapped to machine learning methods, summarizes the construction of network models and training datasets, and compares the methods in terms of rendering quality, rendering time, network capabilities, and other aspects. Finally, it proposes possible ideas and future prospects for combining machine learning with realistic rendering.

Contact Information
  • Journal of Software
  • Sponsor: Institute of Software, CAS, China
  • Postal code: 100190
  • Phone: 010-62562563
  • Email: jos@iscas.ac.cn
  • Website: https://www.jos.org.cn
  • Serial No.: ISSN 1000-9825
  •           CN 11-2560/TP
  • Domestic price: 70 yuan
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn