LI Zhou-Jun , SHEN Dong , SU Xiao-Jing , MA Jin-Xin
2017, 28(9):2229-2247. DOI: 10.13328/j.cnki.jos.005185
Abstract:In recent years, as the number of mobile platform users has grown, mobile platform security has become a focal point of information security research. The ARM virtualization extension, which enables virtualization-based security on mobile platforms, is a hot research topic. This paper first introduces the types of virtualization technology and previous related studies. It then presents the concepts of the ARM virtualization extension and compares it with the x86 virtualization extension. Next, the paper surveys the current state of security research based on hardware virtualization extensions, including general system frameworks and security tools against specific attacks. Finally, future research trends in ARM virtualization-based security technology are analyzed.
MIAO Xiao-Chuan , WANG Rui , XU Lei , ZHANG Wei-Feng , XU Bao-Wen
2017, 28(9):2248-2263. DOI: 10.13328/j.cnki.jos.005177
Abstract:Android currently dominates the mobile operating system market. Compared with iOS, Android is more open and has many third-party markets with loose audit mechanisms, so more malware exists on the Android platform. This paper presents an Android security analysis method based on sensitive path identification that combines static analysis with machine learning. First, since malicious behaviors in malware have trigger conditions, a definition of the sensitive path is provided. Second, because Android applications contain many inter-component call relations, a method is proposed to generate the inter-component call graph from APK files. Third, since sensitive paths cannot be used directly as features, a method is designed to abstract them. Finally, 493 APK files are collected from Android markets and existing data sets, such as Google Play, Wandoujia and Drebin, to construct a benchmark. Experiments indicate that the proposed method achieves higher accuracy (97.97%) than the API-feature-based method (90.47%), and that its precision, recall and F-measure are also better. Furthermore, the size of the APK file influences the results, especially the analysis time: for APK files of 0-4MB the average analysis time is 89 seconds, and it increases significantly for larger files.
LI Cheng-Ze , YU Jian-Bo , ZHANG Miao , XU Guo-Ai , KONG Hao-Hao
2017, 28(9):2264-2280. DOI: 10.13328/j.cnki.jos.005181
Abstract:Binary obfuscation plays an essential role in evading malware analysis and hindering reverse engineering. Some widely used code obfuscation techniques focus on evading syntax-based detection; however, semantic analysis techniques have been developed to thwart such evasion. Recently, some binary obfuscation techniques with the potential to evade both statistical and semantic detection have been proposed; they take concealment into account but lack efficiency or security strength. This study proposes a binary obfuscation technique for mobile apps, based on LZW and Huffman encoding, that has the potential to evade both statistical and semantic detection while taking both intensity and concealment into account. The technique constructs the required instruction encoding tables. On the one hand, it scrambles the original instruction sequence with the encoding tables to improve intensity and concealment. On the other hand, it reinforces intensity by separating the encoding tables, encrypted with white-box AES, from the code segment and by concealing the key and the lookup algorithm, in order to resist attacks on the keys. A prototype tool for this technique, called ObfusDroid, is presented and evaluated in terms of intensity, cost, compatibility and concealment to demonstrate its capability of evading statistical analysis.
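To make the role of an encoding table concrete, the following is a minimal sketch of textbook LZW encoding and decoding over a byte sequence. It is not ObfusDroid's algorithm: the function names `lzw_encode` and `lzw_decode` are hypothetical, and the Huffman stage, the instruction-level tables and the white-box AES protection described in the abstract are not modeled.

```python
def lzw_encode(data: bytes) -> list[int]:
    """Textbook LZW: replace byte runs with codes from a table built on the fly."""
    table = {bytes([i]): i for i in range(256)}  # initial single-byte entries
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the known phrase
        else:
            codes.append(table[current])   # emit code for the longest known phrase
            table[candidate] = next_code   # learn the new phrase
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decode(codes: list[int]) -> bytes:
    """Rebuild the same table while decoding, so the table itself is never sent."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Special case: the phrase being defined is used immediately.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)
```

Without the table, the emitted codes do not map directly back to the original byte sequence, which is the property an encoding-table-based obfuscator would build on.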
HAN Shu-Min , SHEN De-Rong , NIE Tie-Zheng , KOU Yue , YU Ge
2017, 28(9):2281-2292. DOI: 10.13328/j.cnki.jos.005187
Abstract:Multi-party privacy-preserving record linkage (PPRL) is the process of identifying records that correspond to the same real-world entities across several databases without revealing any sensitive information about these entities. With the increasing amount of data and real-world data quality issues (such as spelling errors and wrong order), scalability and fault tolerance have become the main challenges for PPRL. At present, most existing multi-party PPRL methods apply exact matching without fault tolerance. A few approximate PPRL methods offer fault tolerance, but because of their low fault tolerance and high time cost, they cannot effectively find the common entities between databases when data quality issues are present. To tackle this issue, this paper proposes an approximate multi-party PPRL approach that combines Bloom filters, secure summation, a dynamic threshold, a check mechanism, and an improved Dice similarity function. First, a Bloom filter is used to convert each record in the databases into an array of 1s and 0s. Then, the ratio of 1 bits is calculated at each corresponding position, and the dynamic threshold and check mechanism are used to determine the matched positions. Finally, the similarity between records is calculated by the improved Dice similarity function to judge whether the records match. Experimental results show that the proposed method has good scalability and higher fault tolerance than the existing approximate multi-party PPRL method, with good precision.
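The encoding and comparison steps can be sketched as follows. This is only an illustration under stated assumptions: character bigrams as the record features, SHA-256-derived hash positions, the standard two-set Dice coefficient rather than the paper's improved variant, and no secure summation, dynamic threshold or check mechanism. All function names and parameters are hypothetical.

```python
import hashlib


def bigrams(value: str) -> set[str]:
    """Character bigrams of a padded, lower-cased string (an assumed feature choice)."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, size: int = 64, num_hashes: int = 2) -> list[int]:
    """Map a record value to a fixed-length array of 0s and 1s."""
    bits = [0] * size
    for gram in bigrams(value):
        for seed in range(num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode()).digest()
            bits[int.from_bytes(digest[:8], "big") % size] = 1
    return bits


def ones_ratio(filters: list[list[int]]) -> list[float]:
    """Per-position ratio of 1 bits across the parties' filters."""
    n = len(filters)
    return [sum(column) / n for column in zip(*filters)]


def dice(a: list[int], b: list[int]) -> float:
    """Standard Dice coefficient of two bit arrays: 2|A and B| / (|A| + |B|)."""
    common = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2 * common / total if total else 0.0


smith = bloom_encode("smith")
smyth = bloom_encode("smyth")   # spelling variant of the same name
similarity = dice(smith, smyth)
```

Because a spelling variant shares most of its bigrams with the original value, most of the set bits coincide, which is what gives Bloom-filter encodings their tolerance to typing errors.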
CUI Yi-Hui , SONG Wei , WANG Zhan-Bing , SHI Cheng-Liang , CHENG Fang-Quan
2017, 28(9):2293-2308. DOI: 10.13328/j.cnki.jos.005183
Abstract:Owing to the various advantages of cloud computing, users tend to outsource data mining tasks to professional cloud service providers. However, users' privacy cannot be guaranteed. Currently, while many scholars are concerned with protecting sensitive data from unauthorized access, few study data analysis. But if potential knowledge cannot be mined, the value of big data cannot be fully utilized. This paper proposes a lattice-based privacy-preserving data mining (PPDM) method that supports homomorphic computation of intermediate points and distances over ciphertext. On this basis, it builds a privacy-preserving clustering method for ciphertext data in the cloud. Users encrypt private data before outsourcing it to cloud service providers, and the providers use homomorphic encryption to run privacy-preserving mining algorithms, including k-means, hierarchical clustering and DBSCAN. Compared with existing PPDM methods, the presented method offers high security based on the hardness of the shortest vector problem (SVP) and the closest vector problem (CVP). Meanwhile, it preserves the accuracy of the distance between two data items, providing mining results with high accuracy and availability. Experiments on privacy-preserving cluster mining (PPCM) are conducted with a cardiac arrhythmia data set from the machine learning domain, and the results show that the lattice-based method ensures not only security but also accuracy and performance.
WU Gen-Qiang , HE Ye-Ping , XIA Xian-Yao
2017, 28(9):2309-2322. DOI: 10.13328/j.cnki.jos.005184
Abstract:The optimal differentially private mechanism problem is to maximize data utility for a fixed level of privacy protection. It is an important topic in differential privacy, closely connected with both the theoretical foundations and the future applications of the differential privacy model. This paper proposes an analysis method for the problem that is not based on the sensitivity method. First, the optimal mechanism problem is formulated as a multi-objective optimization problem, and a new method for constructing differentially private mechanisms is introduced. Then, a near-optimal mechanism is provided for linear queries, which reaches the boundary of the differential privacy inequality. Although this paper focuses on linear queries, most of the analysis method is applicable to non-linear queries. The paper also identifies a drawback of the sensitivity method and uncovers some deeper characteristics of differential privacy.
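For reference, the standard definition of ε-differential privacy is given below; it is presumably the inequality whose boundary the abstract refers to, although the abstract does not state it. The symbols (mechanism M, neighboring data sets D and D', output set S) follow the usual convention and are not taken from the paper.

```latex
% A randomized mechanism \mathcal{M} satisfies \varepsilon-differential privacy if,
% for all neighboring data sets D, D' (differing in one record)
% and every measurable set of outputs S,
\[
  \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S].
\]
```

A mechanism "reaches the boundary" when this inequality holds with equality for some outputs, which means no privacy budget is left unused at those points.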
JIANG Huo-Wen , ZHAN Qing-Hua , LIU Wen-Juan , MA Hai-Ying
2017, 28(9):2323-2333. DOI: 10.13328/j.cnki.jos.005178
Abstract:The huge amount of information in social networks has accumulated into a kind of big graph data. Generally, to prevent privacy leakage, data need to be anonymized before publication. Most existing anonymization schemes cannot prevent attacks that use background knowledge of both the structure and the attribute information of nodes. To address the issue, this paper proposes a clustering-anonymization method for attribute graphs based on the links and attribute values among nodes. First, the data in the social network is represented as an attribute graph. Then, all the nodes of the attribute graph are clustered into super-nodes according to the structural and attribute similarity between nodes, with each super-node containing no fewer than k nodes. Finally, all the super-nodes are anonymized. In this method, structure masking and attribute generalization of each super-node prevent re-identification attacks that use background knowledge of the targets' links and attribute information, respectively. In addition, the method balances the closeness of links and the proximity of attribute values during clustering, and can therefore reduce the total information loss caused by masking and generalization and maintain the availability of the graph data. Experimental results also demonstrate that the approach achieves good algorithmic performance and reduces information loss remarkably.
YANG Teng-Fei , SHEN Pei-Song , TIAN Xue , FENG Rong-Quan
2017, 28(9):2334-2353. DOI: 10.13328/j.cnki.jos.005182
Abstract:With the popularity of cloud computing, the security and manageability of cloud data face new challenges. The object-based storage cluster is a cloud computing architecture that is usually used to store classified and graded unstructured data. Under the premise of an untrusted cloud service, how to achieve a practicable fine-grained access control mechanism for massive classified and graded data while protecting the data from unauthorized access is an urgent issue. The methods proposed in recent years offer no effective way to solve this new problem. By taking full advantage of mandatory access control, attribute-based encryption (ABE) and object storage technology, and by considering the characteristics of classified and graded data, this paper proposes a hierarchical secure label-based access control model for the object cloud. The core algorithm of this model, called CGAC, is provably secure; it provides a method to embed the hierarchical feature of classified and graded attributes into the ABE mechanism and obtains constant-size ciphertext. This algorithm not only has a flexible access policy and a hierarchical authorization structure, but also combines the benefits of the metadata management of object storage. Finally, through theoretical analysis and an experimental system implementation, the paper verifies that the model's computation cost in encryption and decryption is acceptable, indicating that the proposed method has high practical significance.
YOU Jing , SHANG-GUAN Jing-Lun , XU Shou-Kun , LI Qian-Mu , WANG Yin-Hai
2017, 28(9):2354-2369. DOI: 10.13328/j.cnki.jos.005180
Abstract:The distributed dynamic trust model, a new access management mechanism suited to cloud computing environments, has been studied extensively. However, many existing trust models ignore the reliability of trust data and therefore fail when facing malicious recommendations. To solve this problem, this article proposes a new model named DDTM-TR (distributed dynamic trust management model based on trust reliability). First, the reliability of trust data is evaluated in order to reduce the adverse effects of unreliable data on direct trust, recommendation trust and integrated trust. Second, several candidate nodes are selected and their integrated trust values are calculated, and one of them is selected randomly by a stochastic decision algorithm based on their integrated trust values. Finally, node reliability is updated according to the feedback after the interaction. Experiments demonstrate that the DDTM-TR model performs better than the compared models in resisting malicious services and malicious recommendations, and that the failure rate can be reduced further by the feedback algorithm.
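One plausible reading of the stochastic decision step is a draw in which each candidate's selection probability is proportional to its integrated trust value. The sketch below illustrates that reading only; the abstract does not give the actual DDTM-TR algorithm, and the function name `select_provider` and the proportional rule are assumptions.

```python
import random
from typing import Optional


def select_provider(trust_values: dict[str, float],
                    rng: Optional[random.Random] = None) -> str:
    """Pick one candidate with probability proportional to its trust value.

    Assumption: proportional selection; the real stochastic decision
    algorithm of DDTM-TR may differ.
    """
    if not trust_values:
        raise ValueError("no candidate nodes")
    rng = rng or random.Random()
    nodes = list(trust_values)
    weights = [max(trust_values[node], 0.0) for node in nodes]  # clamp negatives
    if sum(weights) == 0:
        return rng.choice(nodes)       # no information: fall back to uniform
    return rng.choices(nodes, weights=weights, k=1)[0]


candidates = {"node-a": 0.9, "node-b": 0.1}   # hypothetical integrated trust values
chosen = select_provider(candidates, random.Random(42))
```

Compared with always choosing the highest-trust node, a randomized choice spreads load and still gives lower-ranked nodes an occasional chance to earn feedback.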
ZHANG Wei-Wei , GONG Jian , LIU Shang-Dong , HU Xiao-Yan
2017, 28(9):2370-2387. DOI: 10.13328/j.cnki.jos.005186
Abstract:Focusing on the ISP backbone, this paper presents a method to detect malicious activities such as botnets, phishing and spam that threaten user security in the domain by monitoring, in real time, the DNS interaction messages that cross the network boundary. The method characterizes DNS behavior patterns by dependency and position attributes. On this basis, the paper proposes DAOS (binary classifier for DNS activity observation system), a DNS activity detection algorithm based on a supervised classifier. The dependency attribute describes the external usage of a domain name from the perspective of DNS customers, while the position attribute describes the resource allocation of records in the zone file. Experimental results show that, with 2 hours of DNS data, the algorithm achieves 90.5% accuracy with 2.9% false positives and 6.6% false negatives without prior knowledge. If the observation is kept up for a week, accuracy rises to 93.9%, and false positives and false negatives fall to 1.3% and 4.8%, respectively.
MA Jin-Xin , LI Zhou-Jun , ZHANG Tao , SHEN Dong , ZHANG Zhang-Kai
2017, 28(9):2388-2401. DOI: 10.13328/j.cnki.jos.005179
Abstract:Taint analysis of binary code plays an important role in reverse engineering, malicious code detection and vulnerability analysis. Currently, most taint analysis methods cannot handle floating-point instructions, and they do not propagate taint accurately and efficiently enough. In this paper, a byte-grained taint analysis method that uses taint tags is implemented based on offline indices of the instruction trace. An algorithm for generating and querying the offline indices is also presented. With the offline indices, instructions unrelated to tainted data are skipped, which improves the efficiency of taint analysis. The taint loss problem caused by real-time translation is described and solved for the first time. Taint tags are used to denote where the tainted data originates. A more complete taint propagation algorithm, which handles floating-point instructions and ensures that taint flows precisely from source operands into destination operands, is also presented. A flexible user-configuration mechanism is implemented to produce taint data on the fly with a blacklist. The proposed method is applied to vulnerability detection and evaluated with 12 vulnerabilities as test cases. The experimental results show that this taint analysis method detects more vulnerabilities than TEMU and is 5 times faster on average.
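Byte-grained propagation with taint tags can be illustrated with a small shadow-state model. This is a generic sketch, not the paper's implementation: the class `ByteTaintTracker`, the location tuples and the tag strings are hypothetical, and offline indices, trace parsing and real instruction semantics are not modeled.

```python
class ByteTaintTracker:
    """Shadow state: each byte location maps to the set of tags it carries."""

    def __init__(self):
        self.shadow = {}   # location -> frozenset of tags (untainted bytes absent)

    def tags(self, location):
        return self.shadow.get(location, frozenset())

    def _set(self, location, tags):
        if tags:
            self.shadow[location] = frozenset(tags)
        else:
            self.shadow.pop(location, None)   # untainted bytes take no space

    def introduce(self, locations, tag):
        """Mark bytes as tainted by a source, e.g. a file or network read."""
        for loc in locations:
            self._set(loc, self.tags(loc) | {tag})

    def copy(self, dst, src):
        """Move-like instruction: byte i of dst inherits the tags of byte i of src."""
        if len(dst) != len(src):
            raise ValueError("operand sizes differ")
        new_tags = [self.tags(s) for s in src]   # read first, in case of overlap
        for d, t in zip(dst, new_tags):
            self._set(d, t)

    def combine(self, dst, *sources):
        """Arithmetic or conversion: every dst byte gets the union of source tags."""
        merged = frozenset().union(*(self.tags(b) for src in sources for b in src))
        for d in dst:
            self._set(d, merged)

    def clear(self, dst):
        """Destination overwritten by a constant: taint is removed."""
        for d in dst:
            self._set(d, frozenset())


tracker = ByteTaintTracker()
buf = [("mem", 0x1000 + i) for i in range(4)]
eax = [("reg", "eax", i) for i in range(4)]
xmm0 = [("reg", "xmm0", i) for i in range(4)]
tracker.introduce(buf[:2], "file:input.bin")   # only two bytes come from the file
tracker.copy(eax, buf)                         # models: mov eax, [buf]
tracker.combine(xmm0, eax)                     # models an int-to-float conversion
```

Tracking tags per byte keeps the untainted upper bytes of `eax` clean after the move, while the conversion conservatively taints all of `xmm0`, since every result byte may depend on every source byte.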
ZHANG Ce , MENG Fan-Chao , KAO Yong-Gui , LÜ Wei-Gong , LIU Hong-Wei , WAN Kun , JIANG Jia-Nan , CUI Gang , LIU Zi-He
2017, 28(9):2402-2430. DOI: 10.13328/j.cnki.jos.005306
Abstract:The SRGM (software reliability growth model), an important mathematical tool for modeling reliability and improving the reliability process, plays a significant role in measuring, predicting and ensuring reliability, managing testing resources and determining the optimal software release. This paper elaborates and analyzes research on SRGMs. First, the main research content and modeling process of SRGMs are analyzed, and the basic functions are sketched. Meanwhile, the research evolution is summarized, the state of the art is illustrated and the characteristics of current research are formulated. Second, the key factors influencing SRGMs are analyzed from three aspects: the total number of faults in software, the FDR (fault detection rate) and the TE (testing effort). Based on the unified framework model proposed in the authors' previous research, the classical numerical models are classified, compared and analyzed. In addition, SRGMs based on finite and infinite queue models are discussed, and simulation techniques emphasizing RDEP (rate-driven event processes) are elaborated. Furthermore, to evaluate the differences among the models, 26 models are compared on 16 published failure data sets. Experimental results reveal that the differences depend on the objectivity of the collected failure data sets and on the subjectivity of researchers in establishing mathematical models under different assumptions. Finally, the challenges, development trends and open problems are pointed out.
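As a concrete instance of how the total number of faults and the FDR enter an SRGM, the classical Goel-Okumoto model is shown below. It is given only as a well-known example; the abstract does not say which of the 26 compared models it belongs to, nor how the authors' unified framework generalizes it.

```latex
% Goel-Okumoto NHPP model:
%   a = expected total number of faults, b = constant fault detection rate,
%   m(t) = expected number of faults detected by testing time t.
\[
  m(t) = a\left(1 - e^{-bt}\right), \qquad
  \lambda(t) = \frac{dm(t)}{dt} = a\,b\,e^{-bt}, \qquad
  R(x \mid t) = \exp\!\bigl(-\left[m(t+x) - m(t)\right]\bigr)
\]
```

Here λ(t) is the failure intensity and R(x | t) is the probability of no failure in (t, t + x]; making b time-dependent or tying it to testing effort yields many of the later model variants.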
ZHAO Jing-Sheng , ZHU Qiao-Ming , ZHOU Guo-Dong , ZHANG Li
2017, 28(9):2431-2449. DOI: 10.13328/j.cnki.jos.005301
Abstract:Automatic keyword extraction extracts topical and important words or phrases from a document or document set. It is a basic and necessary task in text mining applications such as text retrieval and text summarization. This paper discusses the connotations of keyword extraction and automatic keyword extraction. It studies the theoretical basis of automatic keyword extraction in the light of linguistics, cognitive science, complexity science, psychology and social science. From macro, meso and micro perspectives, the development, techniques and methods of automatic keyword extraction are reviewed and analyzed. The paper summarizes the current key technologies and research progress of automatic keyword extraction methods, including statistical methods, topic-based methods and network-based methods. The evaluation approaches of automatic keyword extraction are analyzed, and its challenges and trends are also predicted.
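Among the statistical methods, TF-IDF ranking is probably the most familiar baseline. A minimal sketch follows, assuming whitespace tokenization and the unsmoothed weighting tf × log(N / df); the function name `tfidf_keywords` is hypothetical and the abstract does not single out this method.

```python
import math
from collections import Counter


def tfidf_keywords(documents: list[str], doc_index: int, top_k: int = 3):
    """Rank the terms of one document by TF-IDF against the whole collection."""
    tokenized = [doc.lower().split() for doc in documents]

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    n_docs = len(tokenized)
    tokens = tokenized[doc_index]
    counts = Counter(tokens)
    scores = {
        term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
        for term, count in counts.items()
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


docs = ["the cat sat", "the dog sat", "the cat ran fast"]
ranked = tfidf_keywords(docs, doc_index=1, top_k=3)
```

A term that occurs in every document gets weight log(1) = 0, so common words sink to the bottom and document-specific terms rise to the top.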
JIA Ruo-Yu , ZENG Ang , ZHU Min , LIU Han-Qing , LI Ming-Zhao
2017, 28(9):2450-2467. DOI: 10.13328/j.cnki.jos.005266
Abstract:An online transaction log is a set of commodity trading records generated by an electronic commerce (E-commerce) platform. It incorporates information on consumers, commodities, sellers and transactions that reflects consumer purchasing behavior. Existing visualization methods cannot fully combine the time-series, hierarchical, geospatial and multi-dimensional features of online transaction logs to perform multi-aspect analysis of consumer purchasing behavior. Combining these multiple features, this paper proposes a composite temporal visualization method based on the radial layout and a timeline visualization method that incorporates spatial information. An extreme color mapping method and an identifiable color mapping method are also designed to support the analysis. UPB-VIS is designed and implemented based on these methods to realize comprehensive analysis of consumer purchasing behavior. The usability of the system and the validity of the visualization methods are verified using a JD online transaction log.
WANG Zhong-Qing , LI Shou-Shan , ZHOU Guo-Dong
2017, 28(9):2468-2480. DOI: 10.13328/j.cnki.jos.005267
Abstract:Personal group information on social media is useful for understanding social structures. Existing studies mainly focus on detecting personal groups using explicit social information between users, but few pay attention to implicit social information and textual information. In this paper, a latent factor graph model (LFGM) is proposed to recommend personal groups for each person using both explicit and implicit information from textual content and social context. In particular, while explicit textual and social information can be easily extracted from user-generated content and personal friendship information, a matrix factorization approach is applied to generate the implicit textual and social information. Evaluation on a large-scale data set validates the effectiveness of the proposed approach.
BU Chao , WANG Xing-Wei , HUANG Min
2017, 28(9):2481-2501. DOI: 10.13328/j.cnki.jos.005299
Abstract:Currently, many new types of applications are constantly emerging, and user communication demands for different applications are becoming diversified and personalized. To match users' frequent and changing communication demands, Internet service providers (ISPs) usually keep purchasing and operating new specialized network equipment, which leads to high operating costs and resource waste and is clearly unsustainable for network construction and development. This paper addresses the challenge with a software-based method that reuses diverse routing functions. Suitable routing functions are selected to compose customized routing services on the communication paths of applications in order to satisfy user demands. Based on network function virtualization (NFV) and software-defined networking (SDN), the paper proposes an adaptive routing service composition mechanism. It leverages the software product line (SPL) to establish a routing service product line, which serves as the basis for selecting routing functions and optimizing routing services. In addition, based on machine learning, it establishes a two-phase routing service learning model, with an offline mode and an online mode, using a multilayer feed-forward neural network. The model can constantly adjust and optimize routing function selection and service composition to achieve routing service customization and improve the user service experience. Simulation and performance results show that the proposed model is feasible and efficient.
ZHAO Jing , TANG Yong , LI Sheng , LIU Xue-Hui , WANG Guo-Ping
2017, 28(9):2502-2523. DOI: 10.13328/j.cnki.jos.005305
Abstract:The constitutive model is the most important factor in the simulation of deformable solids. The stress-strain relationships of the existing basic constitutive models have limitations, and the deformation behaviors they produce are relatively simple. In recent years, much research has addressed how to design more complex material models to satisfy designers' requirements. In this paper, material models are divided into three categories: traditional homogeneous materials with the same material parameters throughout the model; heterogeneous materials with composite structures; and models for editing material parameters and structures, together with elastic behavior editing models based on existing traditional constitutive models. Additionally, current material design methods are reviewed, and recent studies are analyzed along with their advantages and limitations. Finally, the current major challenges and future work are discussed.