SUN Zhi-Xin, ZHANG Xin, XIANG Feng, CHEN Lu
2021, 32(1):1-20. DOI: 10.13328/j.cnki.jos.006111
Abstract: Blockchain is a technology that combines distributed consensus, encryption, timestamps, and other techniques to achieve peer-to-peer transactions, coordination, and collaboration without relying on any centralized third-party organization. In recent years, the rapid development of blockchain technology has aroused great interest from industry and academia. However, the storage scalability problem of blockchain has raised the entry threshold for blockchain devices and has become a bottleneck for blockchain applications. This paper introduces the basic principle and storage model of blockchain and analyzes the storage problems faced by current blockchains. Then, for the storage scalability problem, the principles and ideas of existing solutions are discussed from two perspectives: off-chain storage and on-chain storage. Finally, based on the research progress on blockchain storage scalability and the limitations of existing solutions, directions for future research are provided.
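As a minimal illustration of the storage model mentioned above (a sketch for orientation, not code from the paper), the following Python snippet shows how blocks chain hashes and timestamps, which is also why every full node must store the entire tamper-evident history:

```python
import hashlib
import json
import time

def block_hash(block: dict) -> str:
    """Hash the block header fields deterministically."""
    header = json.dumps(
        {k: block[k] for k in ("index", "prev_hash", "timestamp", "tx_root")},
        sort_keys=True,
    )
    return hashlib.sha256(header.encode()).hexdigest()

def make_block(index: int, prev_hash: str, transactions: list) -> dict:
    # Real chains store a Merkle root of the transactions; hashing the
    # serialized list keeps this sketch short.
    tx_root = hashlib.sha256(json.dumps(transactions).encode()).hexdigest()
    block = {"index": index, "prev_hash": prev_hash, "timestamp": time.time(),
             "tx_root": tx_root, "transactions": transactions}
    block["hash"] = block_hash(block)
    return block

genesis = make_block(0, "0" * 64, [])
nxt = make_block(1, genesis["hash"], [{"from": "A", "to": "B", "amount": 1}])
assert nxt["prev_hash"] == genesis["hash"]  # tamper-evident linkage between blocks
```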
2021, 32(1):21-40. DOI: 10.13328/j.cnki.jos.006121
Abstract: Artificial intelligence (AI) has been widely used in various scenarios due to its powerful learning and generalization ability. However, most existing AI techniques face three major challenges. First, they are hard for ordinary users to use, since they depend on AI experts to select appropriate models, tune reasonable parameters, and write programs, which makes them difficult to apply widely in non-IT fields. Second, the training efficiency of existing AI algorithms is low, which wastes a great deal of computing resources and may even delay decision-making. Third, existing AI techniques depend strongly on high-quality data; if the data quality is low, erroneous decisions will be made. Database technology can effectively address these three problems, and AI-oriented data management has been widely studied. This paper first gives the overall framework of data management for AI. It then presents a detailed overview of AI-oriented declarative language models, AI-oriented optimization, AI-oriented execution engines, and AI-oriented data governance. Finally, future research directions and challenges are discussed.
JI Shou-Ling, DU Tian-Yu, LI Jin-Feng, SHEN Chao, LI Bo
2021, 32(1):41-67. DOI: 10.13328/j.cnki.jos.006131
Abstract: In the era of big data, breakthroughs in the theories and technologies of deep learning, reinforcement learning, and distributed learning have provided strong support for machine learning at the data and algorithm levels, and have promoted the scaling and industrialization of machine learning. However, although machine learning models perform excellently in many real-world applications, they still suffer from many security and privacy threats at the data, model, and application levels, threats that are characterized by diversity, concealment, and dynamic evolution. The security and privacy issues of machine learning have attracted extensive attention from academia and industry. A large number of researchers have conducted in-depth research on the security and privacy of models from the perspectives of attack and defense, and have proposed a series of attack and defense methods. In this survey, the security and privacy issues of machine learning are reviewed, existing research work is systematically summarized, and the advantages and disadvantages of current research are clarified. Finally, the current challenges and future research directions of machine learning model security and privacy are explored, aiming to guide follow-up researchers in further promoting the development and application of this field.
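As a concrete illustration of one widely studied attack class in this literature, the sketch below applies the fast gradient sign method (FGSM) perturbation rule to a toy linear logistic model; it is illustrative only and is not a method proposed in the survey:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps=0.1):
    """Return x' = x + eps * sign(d loss / d x) for binary cross-entropy loss."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w          # gradient of the loss with respect to the input
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
w, b = rng.normal(size=4), 0.0
x, y = rng.normal(size=4), 1.0
x_adv = fgsm_perturb(x, y, w, b, eps=0.2)
print("clean score:", sigmoid(w @ x + b), "perturbed score:", sigmoid(w @ x_adv + b))
```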
GAO Han, TIAN Yu-Long, XU Feng-Yuan, ZHONG Sheng
2021, 32(1):68-92. DOI: 10.13328/j.cnki.jos.006096
Abstract: With the growth of the data available for training and the processing power of new computing platforms, intelligent models based on deep learning can accomplish increasingly complex tasks and have made major breakthroughs in AI fields such as computer vision and natural language processing. However, the large number of parameters of these deep models brings enormous computational overhead and memory requirements, which makes large models face great difficulties and challenges when deployed on platforms with limited computing capability (such as mobile and embedded devices). Therefore, model compression and acceleration without degrading performance has become a research hotspot. This study first analyzes the classical deep learning model compression and acceleration methods proposed by domestic and international scholars and summarizes them in seven aspects: parameter pruning, parameter quantization, compact networks, knowledge distillation, low-rank decomposition, parameter sharing, and hybrid methods. Then, the compression and acceleration performance of several mainstream representative methods is compared on multiple public models. Finally, future research directions in the field of model compression and acceleration are discussed.
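For instance, a minimal magnitude-based parameter-pruning sketch (one of the seven aspects listed above; illustrative, not a specific method from the survey) could look like this:

```python
import numpy as np

def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold        # keep only the large-magnitude weights
    return weights * mask

W = np.random.default_rng(1).normal(size=(256, 256))
W_sparse = prune_by_magnitude(W, sparsity=0.9)
print("fraction kept:", np.count_nonzero(W_sparse) / W.size)  # roughly 0.1
```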
WANG Qiang, JIANG Hao, YI Shu-Wen, YANG Lin-Tao, NAI He, NIE Qi
2021, 32(1):93-117. DOI: 10.13328/j.cnki.jos.006092
Abstract: Complex networks naturally arise in a wide diversity of real-world scenarios. Efficient complex network analysis technology has wide applications, such as community detection and link prediction. However, most complex network analytics methods suffer from high computation and space costs when dealing with large-scale networks. Network representation learning is one of the most efficient ways to address this problem: it converts high-dimensional sparse network information into low-dimensional dense real-valued vectors that can be easily exploited by machine learning algorithms, and it also facilitates efficient computation in subsequent applications. Traditional network representation learning embeds entities in a low-dimensional Euclidean vector space, but recent work has shown that the appropriate space for embedding complex networks with hierarchical or tree-like structure, power-law degree distributions, and high clustering is the negatively curved hyperbolic space. This survey conducts a systematic introduction to and review of the literature on hyperbolic representation learning for complex networks.
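As a concrete illustration (not specific to any one surveyed method), the sketch below computes the distance in the Poincaré ball model, a quantity commonly optimized by hyperbolic embedding approaches:

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))),
    defined for points inside the unit ball (||u||, ||v|| < 1)."""
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_diff / denom))

# Distances grow rapidly as points approach the boundary, which is what lets a
# low-dimensional ball accommodate tree-like hierarchies.
print(poincare_distance(np.array([0.1, 0.2]), np.array([0.7, -0.5])))
print(poincare_distance(np.array([0.1, 0.2]), np.array([0.95, -0.3])))
```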
LI Ke-Xin, WANG Xing-Wei, YI Bo, HUANG Min, LIU Xiao-Jie
2021, 32(1):118-136. DOI: 10.13328/j.cnki.jos.006120
Abstract: In the past few years, artificial intelligence (AI) has attracted the attention of both academia and industry with strong momentum and has been widely utilized in various fields. Computer networks provide the critical computing infrastructure for realizing AI. However, because of the inherently distributed structure of traditional networks, it is difficult to provide AI with computing power in a fast and accurate manner, which hinders practical application and deployment. Software defined networking (SDN) proposes the concept of centralized control, which can adapt computing capability to AI on demand and thereby enable comprehensive deployment. Combining AI and SDN to realize intelligent software defined networking can not only solve the problems of traditional networks but also promote innovation in network applications. Therefore, this paper first introduces the problems that arise when combining AI and SDN, explains the necessity of AI-based SDN, and analyzes the advantages of combining SDN with AI. Second, from the bottom up, the different ways of combining AI and SDN at the data plane, control plane, and application plane are considered. In addition, the challenges and key technologies are introduced from three aspects: routing optimization, network security, and traffic engineering. Furthermore, the advantages and prospects of intelligent software defined networking are analyzed through comparison with other emerging fields, and some future research directions are outlined.
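As one hedged sketch of what AI at the control plane can mean for routing optimization (illustrative only; the link-cost model, its weights, and the topology are hypothetical and not from the paper), a centralized controller might predict per-link costs with a learned model and then compute paths over those costs:

```python
import heapq

def predict_cost(utilization: float, delay_ms: float) -> float:
    # Hypothetical learned link-cost model; a real controller would use a
    # model trained on telemetry collected from the data plane.
    return 0.7 * utilization + 0.3 * (delay_ms / 100.0)

def shortest_path(graph: dict, src: str, dst: str) -> list:
    """Plain Dijkstra over {node: [(neighbor, cost), ...]}."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, c in graph.get(u, []):
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

links = {"A": [("B", predict_cost(0.2, 10)), ("C", predict_cost(0.8, 5))],
         "B": [("D", predict_cost(0.1, 20))],
         "C": [("D", predict_cost(0.9, 5))]}
print(shortest_path(links, "A", "D"))  # ['A', 'B', 'D'] under these predicted costs
```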
DONG Chun-Tao, SHEN Qing-Ni, LUO Wu, WU Peng-Fei, WU Zhong-Hai
2021, 32(1):137-166. DOI: 10.13328/j.cnki.jos.006095
Abstract: Security and trustworthiness are extremely important requirements for cloud computing. How to protect critical application code and data on the cloud platform and prevent cloud service providers and other attackers from stealing users' confidential data is a difficult problem. In 2013, Intel proposed a new processor security technology, SGX, which can provide a trusted execution environment for user space on the computing platform to ensure the confidentiality and integrity of critical user code and data. Since SGX was proposed, it has gradually become an important solution to cloud computing security issues, and how to effectively apply SGX to protect applications has become a research hotspot in recent years. In this paper, the mechanisms and SDK of SGX are introduced, and the obstacles faced by SGX applications, such as security issues, performance bottlenecks, development difficulties, and functional limitations, are summarized. The research progress of SGX application supporting techniques is analyzed and summarized, including SGX application security protection, performance optimization, assisted development, and function extension techniques. Finally, development directions for SGX application supporting techniques are suggested.
LI Zhong, JIN Xiao-Long, ZHUANG Chuan-Zhi, SUN Zhi
2021, 32(1):167-193. DOI: 10.13328/j.cnki.jos.006100
Abstract: In recent years, with the popularization of Web 2.0, graph anomaly detection has received more and more attention. It plays an increasingly vital role in fields such as fraud detection, intrusion detection, fake voting, and zombie-fan analysis. This paper presents a survey of existing approaches to this problem and reviews recently developed techniques for detecting graph anomalies. Graph anomaly detection is divided into two types: anomaly detection on static graphs and anomaly detection on dynamic graphs. Existing work on static graph anomaly detection has identified two types of anomalies: individual anomalies, which refer to the abnormal behaviors of individual entities, and group anomalies, which arise from unusual patterns within groups. Anomalies on dynamic graphs can be divided into three types: isolated individual anomalies, group anomalies, and event anomalies. This paper introduces the current research progress on each kind of anomaly detection method and summarizes the key technologies, common frameworks, application fields, common data sets, and performance evaluation methods of graph anomaly detection. Finally, future research directions on graph anomaly detection are discussed.
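As a toy example of individual-anomaly detection on a static graph (illustrative only; real detectors use richer egonet and structural features than plain degree), one could flag nodes whose degree is a statistical outlier:

```python
from collections import defaultdict
import math

def degree_outliers(edges, z_threshold=2.0):
    """Return nodes whose degree z-score exceeds the threshold."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    values = list(degree.values())
    mean = sum(values) / len(values)
    std = math.sqrt(sum((d - mean) ** 2 for d in values) / len(values)) or 1.0
    return {node: round((d - mean) / std, 2) for node, d in degree.items()
            if abs(d - mean) / std > z_threshold}

edges = [("a", "b"), ("b", "c"), ("c", "a")] + [("hub", f"n{i}") for i in range(20)]
print(degree_outliers(edges))  # only the 'hub' node is flagged
```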
LIU Xue-Hua, DING Li-Ping, ZHENG Tao, WU Jing-Zheng, LI Yan-Feng
2021, 32(1):194-217. DOI: 10.13328/j.cnki.jos.006105
Abstract: Locating the source of a cyber attack and then collecting digital evidence is one of the tasks of network forensics, and cyber attack traceback techniques are used to locate that source. However, current research on cyber attack traceback is mainly conducted from a defensive perspective, aiming to block attacks as soon as possible by locating their source, and rarely considers the acquisition of digital evidence. As a result, the large amount of valuable digital evidence generated during cyber attack traceback cannot be used in prosecutions, and its value in network forensics cannot be fully exploited. Therefore, a set of forensics capability metrics is proposed to assess the forensics capability of cyber attack traceback techniques. The latest cyber attack traceback techniques, including traceback based on software defined networking, are summarized and analyzed, their forensics capability is evaluated, and suggestions for improvement are provided. Finally, a forensics-oriented process model for cyber attack traceback is proposed. This work provides a reference for research on cyber attack traceback technology oriented toward network forensics.
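As background on how traceback can operate at the packet level, the sketch below simulates probabilistic packet marking, a classic IP traceback technique (illustrative only; the router names and marking probability are made up, and the paper's focus is the forensics capability of such techniques rather than this particular scheme):

```python
import random

def forward_with_marking(path: list, packet: dict, p: float = 0.04) -> dict:
    """Each router on the path overwrites the mark field with probability p."""
    for router in path:
        if random.random() < p:
            packet["mark"] = router
    return packet

random.seed(42)
path = ["R1", "R2", "R3", "R4"]  # attacker-side router first, victim-side router last
marks = [forward_with_marking(path, {"mark": None})["mark"] for _ in range(100_000)]
counts = {r: marks.count(r) for r in path}
# Routers closer to the victim overwrite earlier marks and so appear more often;
# the victim uses these frequencies to reconstruct the attack path hop by hop.
print(counts)
```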
ZHAO Hui, WANG Liang-Min, SHEN Tu-Hao, HUANG Lei, NI Xiao-Ling
2021, 32(1):218-245. DOI: 10.13328/j.cnki.jos.006103
Abstract: The desire to protect privacy in cyberspace has promoted the design of anonymous communication systems. Anonymity ensures that users do not expose sensitive information such as their identities and communication relationships when using Internet services. Different anonymous communication systems provide different strengths of anonymity protection, and how to quantify and compare the degree of anonymity has been an important research topic since the field began; it is now receiving increasing attention and becoming a new research focus that calls for further research and application. Anonymity metrics help users understand the level of protection achieved by anonymous communication systems and give developers an objective and scientific basis for designing and improving such systems. A general framework for anonymity metrics research is presented, covering anonymous communication technology, attacks against it, anonymity metrics, and their relationships. This study surveys research in the field of anonymity metrics, tracing its development and characteristics. A variety of anonymity metric theories and methods are reviewed and summarized chronologically. Considering attacks against anonymous communication, the characteristics and mutual relations of typical metric methods are sorted out, analyzed, and compared, and recent progress is introduced together with an outlook on research directions and development trends. The analysis shows that anonymity metrics help determine whether anonymous communication systems can provide the promised anonymity, that the metrics are becoming more diverse, and that metrics based on information theory are the most widely used. With the large-scale deployment of anonymous communication systems such as Tor, evaluations of real practical systems and infrastructures based on statistical data have emerged. New anonymity technologies have developed rapidly in recent years; how to extend the metrics to these emerging technologies and how to combine different metrics to adapt to emerging systems are new research directions with solid application prospects.
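As a small illustration of the information-theoretic metrics mentioned above, the sketch below computes the entropy-based degree of anonymity, i.e. the entropy of the attacker's sender distribution normalized by its maximum (illustrative; the probability values are made up):

```python
import math

def degree_of_anonymity(probabilities: list) -> float:
    """Normalized entropy d = H(X) / log2(N) of the attacker's sender distribution:
    1.0 means the attacker learns nothing, 0.0 means the sender is fully exposed."""
    n = len(probabilities)
    entropy = -sum(p * math.log2(p) for p in probabilities if p > 0)
    return entropy / math.log2(n)

print(degree_of_anonymity([0.25, 0.25, 0.25, 0.25]))  # 1.0: all senders equally likely
print(degree_of_anonymity([0.97, 0.01, 0.01, 0.01]))  # ~0.12: one sender is almost certainly identified
```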