HE Yi-Chao , WANG Xi-Zhao , LIU Kun-Qi , WANG Yan-Qi
Abstract:To analyze the convergence of differential evolution (DE) and enhance its capability and stability, this paper first defines a differential operator (DO) as a random mapping from the solution space to the Cartesian product of solution space, and proves the asymptotic convergence of DE based on the random contraction mapping theorem in random functional analysis theory. Then, inspired by “quasi-physical personification algorithm”, this paper proposes an improved differential evolution with multi-strategy cooperating evolution (MEDE) is addressed based on the fact that each evolution strategy of DE has common peculiarity but different characteristics. Its asymptotic convergence is given with the definition of multi-strategy differential operator (MDO), and the connotative peculiarity of MEDE is analyzed. Compared with the original DE, DEfirDE and DEfirSPX, the simulation results on 5 classical benchmark functions show that MEDE has obvious advantages in the convergence rate, solution-quality and adaptability. It is suitable for solving complex high-dimension numeral optimization To analyze the convergence of differential evolution (DE) and enhance its capability and stability, this paper first defines a differential operator (DO) as a random mapping from the solution space to the Cartesian product of solution space, and proves the asymptotic convergence of DE based on the random contraction mapping theorem in random functional analysis theory. Then, inspired by “quasi-physical personification algorithm”, this paper proposes an improved differential evolution with multi-strategy cooperating evolution (MEDE) is addressed based on the fact that each evolution strategy of DE has common peculiarity but different characteristics. Its asymptotic convergence is given with the definition of multi-strategy differential operator (MDO), and the connotative peculiarity of MEDE is analyzed. Compared with the original DE, DEfirDE and DEfirSPX, the simulation results on 5 classical benchmark functions show that MEDE has obvious advantages in the convergence rate, solution-quality and adaptability. It is suitable for solving complex high-dimension numeral optimization problems.
FENG Qi-Long , WANG Jian-Xin , CHEN Jian-Er
Abstract:Packing problems form an important class of NP-hard problems. In order to solve the weighted 3-set packing problem, this paper converts the problem to the weighted 3-set packing augmentation problem, and mainly works on how to construct a maximum weighted (k+1)-packing based on a maximum weighted k-packing. This paper gives a theoretical study on the structure of the problem and presents a deterministic algorithm of time O*(10.63k) with color-coding, which significantly improves the previous best result O*(12.83k). After further analyzing the structure of the problem and based on the set dividing method, the above result can be further reduced to O*(7.563k).
YANG Zhi-Bin , PI Lei , HU Kai , GU Zong-Hua , MA Dian-Fu
Abstract:This paper firstly presents a summary of AADL (architecture analysis and design language), including its progress over the years and its modeling elements. Then, it surveys the research and practice of AADL from a model-based perspective, such as AADL modeling, AADL formal semantics, model transformation, verification and code generation. Finally, the potential research directions are discussed.
AO Li , SHU Ji-Wu , LI Ming-Qiang
Abstract:Data deduplication technologies can be divided into two categories: a) identical data detection techniques, and b) similar data detection and encoding techniques. This paper presents a systematic survey on these two categories of data deduplication technologies and analyzes their advantages and disadvantages. Besides, since data deduplication technologies can affect the reliability and performance of storage systems, this paper also surveys various kinds of technologies proposed to cope with these two aspects of problems. Based on the analysis of the current state of research on data deduplication technologies, this paper makes several conclusions as follows: a) How to mine data characteristic information in data deduplication has not been completely solved, and how to use data characteristic information to effectively eliminate duplicate data also needs further study; b) From the perspective of storage system design, it still needs further study how to introduce proper mechanisms to overcome the reliability limitations of data deduplication techniques and reduce the additional system overheads caused by data deduplication techniques.
LEI Bin , WANG Lin-Zhang , BU Lei , LI Xuan-Dong
Abstract:This paper defines robustness based on a formal semantics of component system, and proposes a testing framework to detect robustness problems. The approach is implemented in a tool named RoTesCo (robustness testing for components). It traverses the state machine of the component under testing to generate paths, which cover all the transitions. Firstly, the call sequences following the paths drive the component to different states. Secondly, it feeds method calls with invalid parameters or inopportune calls to the component. The test oracle is automated by distinguishing different types of exceptions. RoTesCo is evaluated on a benchmark of real-world components from widely-used open source projects and it has produced encouraging results.
XIE Jing-Yan , AN Jin-Xia , ZHU Ji-Hong
Abstract:Addressing at the problem that the considerations of imperfect debugging phenomenon in the existing software reliability growth model are limited, this paper proposes a software reliability growth model which takes into account of the imperfect debugging comprehensively. Both the fault introduction and the fault removal efficiency are considered in this model, and a time-dependent fault removal efficiency function is introduced. The goodness-of-fit and the predictive power of the new model are examined by using a public software failure data set. The results show that compared with other existing models, the proposed model fits the failure data better and can predict this set of data more accurately.
YU Hang , JIAO Li-Cheng , GONG Mao-Guo , YANG Dong-Dong
Abstract:This paper presents a clonal selection operation: clonal selection operation based on orthogonal experiment design (CSO-OED). This design is later combined with the typical clonal selection operation and results in two algorithms: CSO+CSO-OED(I) adopting parallel mechanism and CSO+ CSO-OED(II) adopting series mechanism. The validation in 9 classical benchmark functions and 6 complex functions has showed that CSO-OED can not only maintain the diversity of population, but also help avoid premature. Implemented in CSO+CSO-OED(I) and CSO+CSO-OED(II), the strategy that separates the local search and global search can not only guarantee the convergence but also improve the accuracy of global solution and the robustness of the algorithm.
HU Jun , WANG Guo-Yin , ZHANG Qing-Hua
Abstract:The extension of rough set is an important issue in rough set theory among which the covering based generalized rough set is vital. Concept approximation in covering approximation space (CAS) is a key issue for acquiring knowledge from it. Some researchers have done much on approximation of classical sets in covering approximation space. Some covering based generalized rough fuzzy set models have already been developed for approximation of fuzzy sets in covering approximation space. Unfortunately, there are limitations in these models. In this paper, a new covering based generalized rough fuzzy set model is proposed. It solves the problems of former models. Moreover, the lower and upper approximations in two different covering approximation spaces with partial order relation are studied, and the sufficient and necessary condition for generating the same covering based generalized rough fuzzy sets from two different covering approximation spaces is that these two coverings have the same reductions. In the end, the relationship of this new model with the models proposed by Wei and Xu is analyzed. Wei’s model and Xu’s model are proved to be two extremes of the new one, and they can be used in some special cases of unary covering. These results provide foundation for the application of covering based generalized rough fuzzy set models to fuzzy decisions.
CHEN Hao , CUI Du-Wu , CUI Ying-An , TAO Yong-Qin , LIANG Kun
Abstract:Enlightened by the conception of ethnic group in social science and a perspective to analyze the structure and evolutionary tendency of population in terms of ethnic group, a population structured technology, ethnic group mechanism, is proposed. Meanwhile, an ethnic group evolution algorithm (EGEA) with a dual track co-evolution process and special ethnic group operators is designed for binary coding. The simulation tests of the classical function and challenging composition test function show that the EGEA can restrain premature convergence effectively during the evolutionary process while improving the search efficiency greatly. The comparisons between EGEA and other typical algorithm show EGEA is a competent algorithm for solving numerical optimization problems.
TIAN Xiu-Xia , WANG Xiao-Ling , GAO Ming , ZHOU Ao-Ying
Abstract:This paper gives a summary of the secure and privacy preserving in database as a service (DaaS) from five primary techniques such as data confidentiality, data integrity, data completeness, query privacy preserving and access control policy. Data confidentiality is analyzed from the encrypted-based and division-based aspects; Data integrity and data completeness focus on the signature-based, challenge-response and probability-based aspects; Query privacy preserving and access control policy are analyzed mainly from exist problems. Finally, this paper gives the future research directions, existing problems and challenges of DaaS in the security and privacy preserving.
ZOU Zhao-Nian , GAO Hong , LI Jian-Zhong , ZHANG Shuo
Abstract:This paper investigates into the problem of mining evolving graphs, i.e. graphs changing over time. It focuses on mining evolving pattern set of connection subgraphs between given vertices in an evolving graph. A similarity function of connection subgraphs and the algorithm to compute it have been presented. Based on this similarity function, a dynamic programming algorithm with polynomial time complexity is proposed to find evolving pattern set. The experimental results on synthetic datasets show that the proposed algorithm has low error rate and high efficiency. The mining results on real datasets illustrate that the mining results have practical significance in real applications.
PAN Li-Qiang , LI Jian-Zhong , LUO Ji-Zhou
Abstract:Due to the limitation of wireless sensor networks in energy resources and the fact that part of Skyline query results can satisfy the users in many applications, this paper proposes an energy efficient approximate Skyline query processing algorithm to save the energy maximally according to the different requirements of applications. The proposed algorithm can compute an approximate Skyline result set only by making partial sensor nodes transmitting their sensing data back. And it is energy efficient because each sensor node transmits its sensing data back or not only depending on the information of itself, without the comparison with other data. Accordingly, communication cost is greatly reduced and network energy is greatly saved. Extensive experiments in simulation environment indicate that the proposed algorithm can process the approximate Skyline queries in wireless sensor networks energy efficiently according to different requirements of applications.
YANG Ning , TANG Chang-Jie , WANG Yue , CHEN Yu , ZHENG Jiao-Ling
Abstract:To solve the problem of clustering this paper proposes a concept of temporal density, which reveals a set of mathematical properties, especially the incremental computation. A clustering algorithm named TDCA (temporal density based clustering algorithm) with time complexity of O(c×m×lgm) is created with a tree structure implemented for both storage and retrieve efficiency. TDCA is capable of capturing the temporal features of a data stream with skew data distribution either in real time or on demand. The experimental results show that TDCA is functionable and scalable.
WU Hu , WANG Yong-Ji , WANG Zhe , WANG Xiu-Li , DU Shuan-Zhu
Abstract:This paper proposes a two-phase rating predicting framework that fuses co-clustering and non-negative matrix factorization method. First, it uses a novel co-clustering method (BlockClust) to divide the raw rating matrix into clusters much smaller than the original matrix. Then it employs weighted non-negative matrix factorization algorithm to predict the unknown ratings. In virtue of co-clustering preprocessing, this method achieves a higher predicting accuracy and efficiency on these low-dimensional and homogeneous sub-matrices. Moreover, it proposes three update schemes for the corresponding update scenarios in recommender systems. Finally, the proposed method is implemented together with seven types of related CF (collaborative filtering) methods. The comparisons show the efficiency of the proposed method and its potential in large real-time recommender systems.
CEN Rong-Wei , LIU Yi-Qun , ZHANG Min , RU Li-Yun , MA Shao-Ping
Abstract:Based on large scale click-through data, this paper study the interactive process between user and search engine, and derive user decision process. A comparative study between clicks on relevant results and non-relevant ones analyzes the reliability of individual user click-through behavior on search results. Three types of features are proposed and estimated for separating reliable user clicks from other ones. Experimental results show that the proposed method evaluates the reliability of user behaviors effectively based on click context features of Web search users.
XU Xiao , ZHANG Wei-Zhe , ZHANG Hong-Li , FANG Bin-Xing
Abstract:There are three core issues recognized for WAN-based distributed Web crawling systems: Web Partition, Agent collaboration and Agent deployment. Centering around these issues, this paper presents a comprehensive overview of the current strategies adopted by academic and business communities. The experiences, problems and challenges encountered by the WAN-based distributed Web crawlers are classified and discussed in depth. A summary of the current evaluation indicators is also given. Finally, conclusion and some suggestions for future research are put forward.
WANG Da-Ling , YU Ge , BAO Yu-Bin , ZHANG Mo , SHEN Zhou
Abstract:In this paper, based on the classification of search intentions, required information for every intention is further analyzed, and a dynamic model of generalizing Web pages based on the intentions is proposed, which dynamically creates a concept hierarchy concerning snippets, keywords, navigation types, and document formats for searched Web pages, and provides further retrieval navigations for different intentions by generalizing content, format and type of the pages, to return more relevant results to the intentions. Comparing with related work, this paper does not focus on users’ intentions themselves, but on the creation of generalization models based on users’ intentions and implementation for Web pages generalizing. Experimental results show that the generalization model can automatically acquire users’ search intentions under navigation, and provide relevant results and further navigation based on the intentions.
ZHANG Jin , WU Jiang-Xing , LIU Qin-Rang
Abstract:The Counting Bloom Filter (CBF) is a space-efficient data structure that extends a Bloom filter so as to allow approximate multiplicity queries on a dynamic multi-set. An in-depth study of three existing CBF schemes is presented, that is, the Na?ve Counting Bloom Filter (NCBF), the Space-Code Bloom Filter (SCBF) and the d-left Counting Bloom Filter (dlCBF). Then, a CBF scheme called Binary Shrinking d-left Counting Bloom Filter (BSdlCBF) is proposed. A performance metrics named load adaptability for CBF schemes is also defined. The performance of the four CBF schemes is evaluated by using metrics of counting error, space complexity and load adaptability under both uniform and Zipfian multiplicity distributions. The experimental results show that the proposed BSdlCBF outperforms the other three in terms of accuracy, space-efficiency and load adaptability. The cost of such an advantage of BSdlCBF is a reasonable rise in computational and space complexity.
WU Hua , GONG Jian , YANG Wang
Abstract:An algorithm based on double counter bloom filter for long flows identification (CCBF) is proposed in this paper. Double counter bloom filter structure is used to distinguish the process of the long flow filtration from the long flow existence. The false positive rate of the algorithm is analyzed. The relationship of the memory requirement and the error rate is analyzed through simulation. It is shown that with the same restriction of the memory resource, the average error of this algorithm is less than the existing similar algorithms. The analysis of the time performance shows this algorithm is capable of dealing with traffic up to 1 500kpps.The results reflect that this algorithm can be used to monitor the long flows on backbone network.
YANG Jun , ZHANG De-Yun , ZHANG Yun-Yi , WANG Yi
Abstract:A cluster-based data aggregation and transmission protocol (CDAT) for wireless sensor networks (WSNs) is proposed. CDAT achieves a good performance in terms of lifetime by a clustering method of balancing energy consumption and data prediction transmission strategy. In clustering phase, the initial probability of node for cluster head election is derived from mathematical relation between application’s seamless coverage ratio and numbers of required cluster heads, and residual energy and node degree are also employed to elect cluster head. In data aggregation phase, Cluster heads broadcast message for node joining and aggregate sampling data after clustering. According to the temporal correlation of sampling data, cluster heads send data to base station using prediction transmission strategy while satisfying transmission precision in the data transmission phase, and the lifetime of WSNs is prolonged with this strategy. Theoretical analysis and simulation results show that CDAT outperforms LEACH (low-energy adaptive clustering hierarchy) and PEGASIS (power-efficient gathering in sensor information systems) in terms of network lifetime by balancing energy consumption and decrease of transmission while satisfying desired Qos (quality of service) of application.
CHEN Jian-Zhong , XU Tian-Yin , LI Wen-Zhong , LU Sang-Lu , CHEN Dao-Xu , Edward CHAN
Abstract:This paper introduces a derivative tree-based P2P framework to support interactive streaming applications. The proposed scheme includes distributed discovery service for resource location and derivative tree-based caching structure for dissemination topology maintenance. By introducing the derivative tree into the system, the impact of dynamic node join/departure could be dramatically reduced. With the assistance of distributed hash table (DHT), the overhead of the resource searching, service reconstruction and overlay maintenance could be alleviated to an acceptable level. Extensive simulations show that the derived tree-based strategy performs well. The overhead of interactive operations in the scheme can be reduced by more than 50% compared to the existing P2P media streaming systems.
NI Xun-Bo , ZHAO De-Bin , GAO Wen , JIANG Feng , YAO Hong-Xun
Abstract:This paper proposes the combination of sign language linguistics with human kinematics to generate and detect the data of SISL (signer independent sign language) according to the characteristics of gesture sign language (GSL). An improved Mean-Shift algorithm is applied to the generation of hand shape data channels without losing the linguistic features of GSL, and then the key hand shape phonetic notation is used to detect the effectiveness of data. In order to enrich the kinematic characteristics of GSL, an improved genetic algorithm is applied to the generation of movement related data channels. Moreover, Labannotation is adopted to inspect the effectiveness of data. Finally, an experimental inspection framework is established based on an original sample to make the proposed detection method adapt to multi-classes data inspection of linguistics. Experimental results show that the proposed method for the generation and detection of SISL data is effective and feasible.
SUN Li-Bo , SUN Ji-Zhou , LIU Yan , NI Hui
Abstract:This paper presents a perception model based on the feedback control for autonomous virtual humans. In this model, the perception constraints of the organs are firstly maintained by the perceptual filters. Then, humans’ memory function is simulated since the circular queue is designed to store the relevant information of the perceived objects. Furthermore, according to the results of the perception and memory, all the parameters of the perceptual filter are controlled by feedback so that the effect of the perceived outcome on the process of the perception is demonstrated. The experimental results show that the perception model based on the feedback control not only reflects the characteristics of the human perception system well, but also provides more authentic environment information for autonomous virtual humans to control their behaviors and make decisions.