TAO Jian-Wen , Fu-Lai CHUNG , WANG Shi-Tong , YAO Qi-Fu
2015, 26(5):977-1000. DOI: 10.13328/j.cnki.jos.004575 CSTR:
Abstract:Sparse representation has received an increasing amount of interest in pattern classification due to its robustness. In this paper, a domain adaptation learning (DAL) approach is explored based on a sparsity preserving model, which assumes that each data point can be sparsely reconstructed. The proposed robust DAL algorithm, called sparse label propagation domain adaptation learning (SLPDAL), propagates the labels from labeled points in the source domain to the unlabeled dataset in the target domain using those sparsely reconstructed objects with sufficient smoothness. SLPDAL consists of three steps. First, it finds an optimal kernel space in which all samples from both source and target domains can be embedded by minimizing the mean discrepancy between these two domains. Then, it computes the best kernel sparse reconstructed coefficients for each data point in the kernel space by using l1-norm minimization. Finally, it propagates the labels of source domain to the target domain by preserving the kernel sparse reconstructed coefficients. The paper also derives an easy way to extend SLPDAL to out-of-sample data and multiple kernel learning respectively. Promising experimental results have been obtained for several DAL problems such as face recognition, visual video detection and text classification tasks.
PENG Hong , JIANG Yang , WANG Jun , Mario J. PÉREZ-JIMÉNEZ
2015, 26(5):1001-1012. DOI: 10.13328/j.cnki.jos.004657 CSTR:
Abstract:Membrane computing, known as P systems or membrane systems, is a novel class of distributed and parallel computing models. This paper proposes a membrane clustering algorithm using hybrid evolutionary mechanisms to address data clustering problem. It uses a tissue P system consisting of three cells to find the optimal cluster centers for a data set to be clustered. Its object is used to express candidate cluster centers, and the three cells use three different evolutionary mechanisms: genetic operators, velocity-position model and differential evolution mechanism. Particularly, the velocity-position model and differential evolution mechanism used in the process are the improved versions proposed in this paper according to the special membrane structure and communication mechanism. The hybrid evolutionary mechanisms can enhance the diversity of objects in the system and improve the convergence performance. Under the control of the hybrid evolutionary mechanisms and communication mechanism, the membrane clustering algorithm can determine a good partition for a data set. The proposed membrane clustering algorithm is evaluated on three artificial data sets and five real-life data sets and compared with k-means and several evolutionary clustering algorithms. Statistical significance tests have been performed to establish the superiority of the proposed membrane clustering algorithm.
ZHENG Jin-Hua , SHEN Rui-Min , LI Mi-Qing , ZOU Juan
2015, 26(5):1013-1036. DOI: 10.13328/j.cnki.jos.004676 CSTR:
Abstract:Many-Objective optimization refers to optimizing the multi-objective optimization problems (MOPs) where the number of objectives is more than three. Most classical multi-objective evolutionary algorithms (MOEAs) use the Pareto dominance relation to guide the search and thus are hard to perform well in many-objective optimization problems. In this paper, a multi-objective evolutionary algorithm based on information separation (ISEA) is proposed. ISEA rotates the original coordinate system in the objective space, and makes the first axis parallel to the vector (1,1,…,1)T. The first member of the new coordinate is defined as convergence information, and the remaining members are defined as diversity information. Moreover, a neighborhood penalty mechanism based on layered selection is adopted using the information of the neighborhood shape made of two hyper-cones to maintain the diversity of individuals. The first hyper-cone is used to cover neighbors, and the second one to cover extreme individual whose convergence performs significantly worse than others. Additionally, after an individual is selected into the archive set, its neighbors are punished into an inferior layer. From comparative experiments with other representative MOEAs, including NNIA, e-MOEA, MSOPS, AR+DMO, and IBEA, the proposed algorithm is found to be successful in finding well-converged and well-distributed solution set.
XU Ge-Ni , LI Yong-Ming , GUAN Xue-Chong
2015, 26(5):1037-1047. DOI: 10.13328/j.cnki.jos.004686 CSTR:
Abstract:Firstly, the operations of extension and transition in soft set over the universe U and the parameter E are proposed and some related properties are derived. Secondly, it is proved that the quotient soft set with the operations of combination and focusing is a domain-free information algebra. Moreover, the algebra is also compact if the set of all possible parameters is finite. Finally, an algorithm of decision making in term of the model of domain-free information algebra is presented and a corresponding example is provided to show that this model can be successfully applied to many problems. A comparison between the proposed algorithm and Cagman's uni-int decision making is illustrated in the end.
QIAO Shao-Jie , JIN Kun , HAN Nan , TANG Chang-Jie , Gesangduoji , Louis Alberto GUTIERREZ
2015, 26(5):1048-1063. DOI: 10.13328/j.cnki.jos.004796 CSTR:
Abstract:For intelligent transportation systems, digital military battlefield and driver assistance systems, it is of great practical value to predict the trajectories of moving objects with uncertainty in a real-time, accurate and reliable fashion. Intelligent trajectory prediction can not only provide accurate location-based services, but also monitor and estimate traffic to suggest the best path, and as such becomes an active research direction. Aiming to overcome the drawbacks of the existing methods, a new trajectory prediction model based on Gaussian mixture models called GMTP is proposed. The new model contains the following essential phases: (1) modeling the complex motion patterns based on Gaussian mixture models, (2) calculating the probability distribution of different types of motion patterns by using Gaussian mixture model in order to partition trajectory data into distinct components, and (3) inferring the most possible trajectories of moving objects via Gaussian process regression. The GMTP algorithm is naturally a Gaussian nonlinear statistical probability model and the advantage of the proposed model is that the result is not only a predicted value, but also a whole distribution beyond the future trajectories, therefore making it possible to infer the location in regard to some motion patterns, e.g., uniformly accelerated motion, by using statistical probability distribution. Extensive experiments are conducted on real trajectory data sets and the results show that the prediction accuracy of the GMTP algorithm is improved by 22.2% and 23.8%, and the runtime can be reduced by 92.7% and 95.9% on average, respectively, when compared to the Gaussian process regression model and Kalman filter prediction algorithm with similar parameter setting.
ZHANG Jun-Bo , LI Tian-Rui , PAN Yi , LUO Chuan , TENG Fei
2015, 26(5):1064-1078. DOI: 10.13328/j.cnki.jos.004590 CSTR:
Abstract:The increasing complexity and dynamic change of massive data processing currently receive widespread attention. One of its core content is to study how to use the existing information to achieve rapid updating of knowledge. Granular computing (GrC), a new computing paradigm of information processing, is an emerging research field which is mainly used to describe and deal with uncertain, vague, incomplete and massive data, and provides a solution based on the granularity and the relationship between the granularities. As an important part of GrC, rough set theory is an effective mathematical tool to deal with the uncertainty and imprecise problems. Based on the MapReduce model in cloud computing, this paper first presents a parallel algorithm for computing the equivalence classes, decision classes and the association between them in rough set theory. A parallel algorithm is then designed for computing rough set approximations from large-scale data. To adapt to the dynamic real-time system, the MapReduce model and incremental method are combined to build two parallel incremental algorithms for updating rough set approximations in different incremental strategies. An extensive experimental evaluation on big data sets show that the proposed algorithms are very effective and have better performance with the increasing size of the data.
ZHANG Ji-Fu , LI Yong-Hong , QIN Xiao , XUN Ya-Ling
2015, 26(5):1079-1095. DOI: 10.13328/j.cnki.jos.004659 CSTR:
Abstract:In this paper, a related-subspace-based local outlier detection algorithm is proposed in MapReduce programming model for high-dimensional and massive data set. Firstly, the relevant subspace, which can effectively describe the local distribution of the various data sets, is redefined by using local sparseness of attribute dimensions. Secondly, a local outlier factor calculation formula in the relevant subspace is defined with probability density of local data sets. The formula can not only effectively reflect the outlierness of data object that does not obey the distribution of the local data set in relevant subspace, but also select N data objects with the greatest-outlierness as local outliers. Furthermore, a related-subspace-based local outlier detection algorithm is constructed by using LSH distributed strategy in MapReduce programming model. Finally, experimental results validate the effectiveness, scalability and extensibility of the presented algorithms by using artificial data and stellar spectral data as experimental data sets.
CHAI Xin , JIA Xiao-Fei , WU You-Xi , JIANG He , WU Xin-Dong
2015, 26(5):1096-1112. DOI: 10.13328/j.cnki.jos.004707 CSTR:
Abstract:Pattern matching with gap constraints is one of the key issues of sequential pattern mining. One-off condition which is always used in sequential pattern mining tasks means that the character of each position in the sequence can be used at most once for pattern matching. Recently, most researches focus on pattern matching with non-negative gaps, but the rule of non-negative gaps implies the character's order in the sequence may limit the flexibility of matching. For these reasons, this article proposes a strict pattern matching with general gaps and one-off condition and shows that this problem is NP-hard. To tackle this issue, a heuristic algorithm, named dynamically changing node property (DCNP), is designed based on nettree which dynamically updates the properties of each node such as the numbers of root paths, leaf paths and root-leaf paths, and thus can get a better occurrence. The above process is then iterated. To effectively improve the speed of DCNP and avoid dynamically updating information of nodes on a large scale, a checking mechanism is applied to allow DCNP update information of nodes only when the occurrence may have repetition. The space and time complexities of DCNP are also analyzed. Experimental results show that DCNP has better performance than other competitive algorithms.
YU Yan-Wei , WANG Huan , WANG Qin , ZHAO Jin-Dong
2015, 26(5):1113-1128. DOI: 10.13328/j.cnki.jos.004717 CSTR:
Abstract:This paper proposes a mining algorithm of density-based cluster-structure, named MCluStream, to resolve the problems of input parameter selection and overlapping cluster identification in evolving data stream. First, a tree topology index, named CR-Tree, is designed to map a pair of data points with directly core reachable into relationship of father and child node. The CR-Tree that record relationships among points represents cluster-structure under a series of subEps settings. Second, the online update of cluster-structure on CR-Tree is completed by MCluStream under sliding window environments, which effectively maintains clusters over massive evolving data streams. Third, a fast cluster-structure extraction method is implemented from the CR-Tree. Users can easily select reasonable clustering results according to the visualized cluster-structure. Finally, experimental evaluations on massive-scale real and synthetic data demonstrate the effective mining result and better performance of the proposed algorithm compared against state-of-the-art methods. MCluStream is desirable to be applied to self-adaptive density-based clustering over high-volume data streams.
WANG Yu-Ding , YANG Jia-Hai , XU Cong , LING Xiao , YANG Yang
2015, 26(5):1129-1150. DOI: 10.13328/j.cnki.jos.004820 CSTR:
Abstract:With the intensive and large scale development of cloud computing, security becomes one of the most important problems. As an important part of security domain, access control technique is used to limit users' capability and scope to access data and ensure the information resources not to be used and accessed illegally. This paper focuses on the state-of-the-art research of access control technology in cloud computing environment. First, it briefly introduces access control theory, and discusses the access control framework in cloud computing environment. Then, it thoroughly surveys the access control problems in cloud computing environment from three aspects including cloud access control model, cloud access control based on ABE (attribute-based encryption) cryptosystem, and multi-tenant and virtualization access control in cloud. In addition, it probes the best current practices of access control technologies within the major cloud service providers and open source cloud platforms. Finally, it summarizes the problems in the current research and prospects the development of future research.
ZHANG Yu , LIU Qing-Zhong , LI Tao , Wu Li-Hua , SHI Chun
2015, 26(5):1151-1172. DOI: 10.13328/j.cnki.jos.004821 CSTR:
Abstract:Cyber attacks and cybercrimes that run stealthy in computer memory make the traditional file system-based computer forensics tasks difficult to be carried out effectively, and thus become a serious security threat. As an important branch of computer forensics, memory forensics is an effective way to combat evasive cyber attacks and cybercrimes. It extracts the evidence of cyber attacks and cybercrimes by comprehensively obtaining and analyzing memory data left by attackers. In recent years, the memory forensics which has drawn sustained attention of the security community, and undergone rapid development with wide range of applications, plays an irreplaceable role in the network incident response and cybercrime investigations. This paper first reviews the origin and evolution processes of memory forensics research, followed by introduction of the key mechanism of operating system memory management. It then explores the memory data acquisition and analysis methods, and summarizes the latest memory forensics technology. The paper concludes with a discussion of the existing problems of current memory forensics, and outlook of the trends and further research directions of memory forensics.
CHEN Hu , WEI Shi-Min , ZHU Chang-Jie , YANG Yi
2015, 26(5):1173-1180. DOI: 10.13328/j.cnki.jos.004654 CSTR:
Abstract:Certificateless public key cryptography can solve the key escrow problem without any digital certificates to bind users and their public keys. Meanwhile, aggregate signature can efficiently lower the cost of computations and communications. Hence it is of interest to construct a certificateless aggregate signature scheme by taking advantages of the two methods. Though great progress has been made in this area, certificateless aggregate signature schemes available today cannot simultaneously achieve the objectives of being secure against both types of super adversaries and being efficient in operation. This paper puts forward a construction of certificateless aggregate signature scheme with stronger security by using pairings and introducing state information. The state information is used to hold partial information on a given hard problem in the random oracle model. The results show that the presented scheme, based on the infeasibility of the computational Diffie-Hellman (CDH) problem, is secure against both super adversaries. At the same time, the new scheme needs only four pairings during the processes of individual signature and verification for an aggregate signature by making good use of public information and the properties of bilinear maps. Furthermore, after knowing the same state information, a user in the scheme can perform individual signature operations in a non-interactive manner, which allows any users in the system to join dynamically for generating an aggregate signature. As a result, it can have practical applications in many-to-one communications.
HUANG Ru-Wei , GUI Xiao-Lin , CHEN Ning-Jiang , YAO Jing
2015, 26(5):1181-1195. DOI: 10.13328/j.cnki.jos.004656 CSTR:
Abstract:With the development of cloud computing, privacy has become the key problem of cloud security. While encryption is a well-established technology for protecting sensitive data, it makes effective data utilization a very challenging task. To solve the problem, this paper designs a randomized data structure—random tree (RT), and constructs an encryption scheme OPEART (order-preserving encryption algorithm based on RT). OPEART realizes the encryption of data by randomness, and supports relational calculations (>, <, >=, etc.) on encrypted data. Security analysis and performance evaluation show that OPEART is IND-DNCPA while achieving the goal of relational calculations on encrypted cloud data efficiently.
ZHANG Ming-Wu , WANG Chun-Zhi , YANG Bo , Tsuyoshi TAKAGI
2015, 26(5):1196-1212. DOI: 10.13328/j.cnki.jos.004693 CSTR:
Abstract:In the traditional cryptosystems, secret keys are perfectly hidden for any possible attackers and only the cryptographic algorithms and public parameters are public. However, in practical applications, the attacker can obtain partial information about the matched decryption key from the noise channels or by the side-channel attacks. This study proposes a leakage-resilient hierarchical wildcard pattern encryption in which a user is associated with a wildcard identity pattern. A secret key is derived for a vector of identity strings where entries can be left blank using a wildcard, and this key can then be used to derive keys for any pattern that replaces wildcards with concrete identities. The scheme supports the wildcard pattern key delegation, which is considered as a general extension of leakage-resilient hierarchical IBE (identity-based encryption) and HVE (hidden vector encryption). Moreover, the proposed scheme can tolerate partial key leakage, and the scheme is proven to be leakage-resilient and semantically secure in the standard model under the subgroup decision assumptions.
SONG Chuan-Ming , ZHAO Chang-Wei , LIU Dan , WANG Xiang-Hai
2015, 26(5):1213-1236. DOI: 10.13328/j.cnki.jos.004822 CSTR:
Abstract:Three-Dimensional (3D) multiscale geometrical analysis is the technological fundamental for the processing of digital visual media, such as images, videos, and geometrical models. Its objective is to efficiently represent the point singularity, curve singularity, as well as surface singularity presented in those visual media. This study first reviews the research advances in two-dimensional (2D) multiscale geometrical analysis. It then elaborates on the development of 3D multiscale geometrical analysis for video according to the capability evolution in capturing singularity and nonlinear approximation efficiency improvement of various transforms. State-of-the-Art 3D multiscale geometrical analysis is classified into three categories: the extended multiscale geometrical analysis from 2D basis functions, the multiscale geometrical analysis based on 3D basis function, and the multiscale geometrical analysis based on spatiotemporal non-local correlation. The basic ideas of typical transforms are thoroughly discussed subsequently, and so are their nonlinear approximation efficiency, computational complexity, advantages, and disadvantages. Meanwhile, this study also presents a general review on the development of the 3D multiscale geometrical analysis for geometrical models. Based on the study above, the development trend of the 3D multiscale geometrical analysis is forecast in the near future.
LENG Lu , LI Ming , Andrew Beng Jin TEOH , Cheonshik KIM
2015, 26(5):1237-1250. DOI: 10.13328/j.cnki.jos.004594 CSTR:
Abstract:Security and privacy of templates are critical to the practical applications of palmprint systems. However, some objectives of biometric protection are difficult to meet at the same time due to the conflicts among them. As a cancelable palmprint coding algorithm to resolve the aforementioned conflicts, PalmPhasor is efficient and secure for palmprint authentication. In this paper, a complete analytical framework is proposed to systematically analyze the performance of PalmPhasor. The scenarios are categorized into four cases for the convenience of especial analysis. Some preliminaries, including auxiliary theorems, the distribution character of real and imaginary parts of Gabor filtered palmprint images, are provided to support the corresponding analysis. The theoretical analysis based on statistics and experimental results both confirm that PalmPhasor enhanced by multi-orientation score level fusion satisfactorily meets the four objectives of cancelable biometric at the same time even when users' tokens are stolen.
SUN Xiao-Peng , LI Si-Hui , WANG Lu , HAN Feng , WEI Xiao-Peng
2015, 26(5):1251-1264. DOI: 10.13328/j.cnki.jos.004699 CSTR:
Abstract:Combining the convex and relaxations, and following the solution path of convex-concave problem, the path following algorithm exhibits an excellent accuracy on graph matching approximately. In this paper, the path following algorithm is employed to address the problem of ear matching. Firstly, the PCA method is used to construct the set of salient keypoints of 3D ear point cloud data. Then the neighborhood of each keypoint is fitted to a single-valued quadric surface on a tensor-product parameter domain to define the local shape feature on the surface as the similarity measures. Next, the keypoints are triangulated into 3D topological graph using Delaunay triangulation, and the global structure discrepancy on the graph is obtained. Finally, the overall similarity measure is marked as the linear interpolation combination of the graph structure discrepancy and the local shape feature discrepancy, and the path following algorithm is then used to address the optimal matching between two keypoint graphs. The experiments show that the presented method provides a better matching result in terms of efficiency and accuracy than other similar approaches.