Abstract:In this paper, a multi-agent social evolutionary algorithm is proposed for multiobjective optimization problems. It completes the search process through the evolution of agents. MOMASEA (multi-agent social evolutionary algorithm for multiobjective optimization) defines a trust degree to represent the historical information of agents, and each agent's neighborhood is determined by it. According to the characteristics of multiobjective problems, three evolutionary operators are designed to complete the whole evolutionary process. The experimental results show that MOMASEA converges well to the Pareto set. Furthermore, the analysis of the model used to construct the local environment verifies that introducing the acquaintance net model effectively speeds up convergence.
Abstract:The #SAT problem is an extension of the SAT problem. It involves counting the models of a given set of propositional formulae. By using the inverse of resolution and the inclusion-exclusion principle to circumvent the problem of space complexity, this paper proposes a framework for both model counting and weighted model counting, which complements current model counting methods. These methods are proved to be sound and complete. A model counting system, namely JLU-ERWMC, is built based on these methods. JLU-ERWMC outperforms the most efficient model counting methods in some cases.
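As a point of reference for the inclusion-exclusion idea mentioned above, the following minimal Python sketch counts the models of a small CNF formula by inclusion-exclusion over the sets of violated clauses. It is not the paper's ERWMC method (which also relies on the inverse of resolution), and the function name and clause encoding are chosen here purely for illustration.

from itertools import combinations

def count_models(clauses, num_vars):
    # Count satisfying assignments of a CNF via inclusion-exclusion over the
    # sets of violated clauses (exponential in the number of clauses, so
    # purely illustrative). A clause is a list of ints: v for a positive
    # literal on variable v, -v for a negated one.
    total = 0
    for k in range(len(clauses) + 1):
        for subset in combinations(clauses, k):
            forced, consistent = {}, True
            for clause in subset:
                for lit in clause:
                    var, val = abs(lit), lit < 0   # falsifying lit forces this value
                    if forced.get(var, val) != val:
                        consistent = False
                    forced[var] = val
            if consistent:
                total += (-1) ** k * 2 ** (num_vars - len(forced))
    return total

# (x1 or x2) and (not x1 or x3) over three variables has 4 models.
print(count_models([[1, 2], [-1, 3]], 3))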
Abstract:An adaptive scale updating algorithm based on boundary force is presented to remedy the deficiency that the kernel bandwidth of Mean-Shift is fixed. Based on an analysis of the weighted histogram of the target feature, this paper introduces a region likelihood to extract local information of the target. Then, by comparing the region likelihood in successive frames, it constructs a boundary force to locate the boundary points of the target model and updates the bandwidth of the kernel function adaptively. The experimental results show that the proposed method improves the performance of Mean-Shift when the size or shape of the target changes and meets the real-time requirement.
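The boundary-force construction itself is not detailed above, but the weighted histogram of the target feature that it starts from is the standard building block of Mean-Shift tracking. The sketch below (function names and the grey-level feature are assumptions of this illustration) computes a kernel-weighted histogram of a target patch and the Bhattacharyya coefficient used to compare the target model with a candidate region.

import numpy as np

def weighted_histogram(patch, bins=16):
    # Kernel-weighted grey-level histogram of a patch: pixels near the patch
    # centre get larger weights (Epanechnikov profile), as in standard
    # Mean-Shift tracking.
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    r2 = ((ys - h / 2) / (h / 2)) ** 2 + ((xs - w / 2) / (w / 2)) ** 2
    k = np.maximum(1 - r2, 0)                      # kernel weight per pixel
    idx = (patch.astype(np.int64) * bins) // 256   # histogram bin per pixel
    hist = np.bincount(idx.ravel(), weights=k.ravel(), minlength=bins)
    return hist / hist.sum()

def bhattacharyya(p, q):
    # Similarity between the target model p and a candidate histogram q.
    return float(np.sum(np.sqrt(p * q)))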
Abstract:A framework of latent attribute space tree classifier (LAST) is proposed in this paper. LAST transforms data from the original attribute space into a latent attribute space that is easier to separate or more suitable for tree classifiers, so that the decision boundary of the traditional decision tree can be extended and its generalization ability improved. This paper presents two SVD (singular value decomposition) oblique decision tree (SODT) algorithms based on the LAST framework. SODT first performs SVD on global and/or local data to construct an orthogonal latent attribute space. Then, a traditional decision tree or tree nodes are built in that space. Finally, SODT obtains an approximately optimal oblique decision tree of the original space. SODT can not only handle datasets in which the global and local data have similar or different distributions, but can also make full use of the structure information of labelled and unlabelled data, and it produces the same classification results no matter how the observations are arranged. Besides, the time complexity of SODT is identical to that of the univariate decision tree. Experimental results show that, compared with the traditional univariate decision tree algorithm C4.5 and the oblique decision tree algorithms OC1 and CART-LC, SODT gives higher classification accuracy and a more stable decision tree size, with tree-construction time comparable to that of C4.5 and much less than that of OC1 and CART-LC.
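As a rough illustration of the global-SVD variant of this idea (the per-node local SVD and the use of unlabelled data are omitted), the sketch below projects the data onto the orthogonal latent attribute space given by the right singular vectors and grows an ordinary univariate tree there; axis-parallel splits in that space correspond to oblique splits in the original space. The helper names and the use of scikit-learn are assumptions of this sketch, not part of the paper.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_svd_tree(X, y):
    # Centre the data, compute the right singular vectors, and build a
    # standard axis-parallel tree in the resulting latent attribute space.
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    tree = DecisionTreeClassifier().fit((X - mean) @ vt.T, y)
    return mean, vt, tree

def predict_svd_tree(model, X):
    # Apply the same projection before querying the tree.
    mean, vt, tree = model
    return tree.predict((X - mean) @ vt.T)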
Abstract:Lexical paraphrasing is the task of extracting word-level paraphrases. Lexical paraphrases should be context dependent since a word may have different paraphrases in distinct contexts. This paper investigates a framework for acquiring context-dependent lexical paraphrases, in which a web mining method is developed for extracting candidate paraphrases and a classification method is introduced in paraphrase validation. Evaluations are carried out on the People’s Daily corpus and the results show that: (1) the web mining method performs well in candidate paraphrase extraction, which extracts 2.3 correct paraphrases on average for each test word in each given context sentence; (2) the classifier for paraphrase validation is effective, which achieves an f-measure of 0.6023; (3) 75.11% and 98.31% of the paraphrases extracted by our method cannot be recognized by the two widely used context-independent methods, i.e., the thesaurus-based and clustering-based methods respectively. This indicates that the presented context-dependent method is a considerable supplement to the context-independent ones.
Abstract:This paper proposes a compression scheme which quickly compresses the raw data from multiple streams into a compressed synopsis. The synopsis allows the correlation coefficients to be reconstructed incrementally without accessing the raw data. A modified k-means algorithm is developed to generate clustering results and dynamically adjust the number of clusters in real time so as to detect the evolving changes in the data streams. Finally, the framework is extended to support clustering on demand (COD), where a user can query for clustering results over an arbitrary time horizon. A theoretically sound time-segment partitioning scheme is developed so that any demanded time horizon can be fulfilled by a combination of those time segments. Experimental results on synthetic and real data sets show that the algorithm has higher clustering quality, speed and stability than other methods and can detect the evolving changes of the data streams in real time.
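The exact layout of the compressed synopsis is not given above, but the property it relies on, reconstructing correlation coefficients incrementally without the raw data, can be illustrated with the standard sufficient statistics for a pair of streams. The class and method names below are illustrative only.

import math

class PairSynopsis:
    # Running sums from which the correlation coefficient of two streams can
    # be reconstructed at any time without storing the raw values.
    def __init__(self):
        self.n = self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def update(self, x, y):
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y

    def correlation(self):
        num = self.n * self.sxy - self.sx * self.sy
        den = math.sqrt((self.n * self.sxx - self.sx ** 2) *
                        (self.n * self.syy - self.sy ** 2))
        return num / den if den else 0.0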
Abstract:As one of the efficient methods of improving time and space efficiency, the full-text index technique has been well studied in recent years. According to the implementation approach, these techniques can be classified into three categories: index techniques, hybrid techniques combining index and compression, and self-index techniques. This paper reviews recent research on this topic, covering the main techniques such as inverted files, signature files, suffix trees, suffix arrays, compressed indices based on the indices mentioned above, self-indexes based on inverted files, and self-indexes based on suffix arrays. This paper also introduces the basic theories, problems, progress, and space and time efficiency of these techniques. Through a detailed efficiency analysis and comparison, the pros and cons of the techniques are given. Finally, the important features of these techniques are summarized, and future research directions and trends are pointed out as well.
Abstract:Histograms can be used to estimate the selectivity of queries in query optimization. Using batched queries in compressed databases to construct an adaptive histogram for optimizing query processing or answering queries approximately remains an unsolved problem. This paper proposes to track hot data in compressed databases by scheduling these batched queries, and to use the feedback in query results to accelerate the convergence of the constructed adaptive histogram, which can be maintained incrementally. A parametric method is also proposed to estimate the number of tuples falling in a query area that is not covered by any bucket of the histogram. Experimental results show that the adaptive histogram has higher average accuracy, faster convergence and better adaptability than STHoles.
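For context, the sketch below shows the basic estimation step that any bucket histogram supports: the number of tuples in a query range is estimated by assuming a uniform distribution inside each overlapping bucket. The bucket layout and function name are illustrative; the paper's feedback-driven maintenance and its parametric estimator for uncovered areas are not reproduced here.

def estimate_range_count(buckets, low, high):
    # Estimate how many tuples fall in [low, high) from a one-dimensional
    # bucket histogram; each bucket is (lo, hi, count) and is assumed to
    # spread its tuples uniformly over [lo, hi).
    est = 0.0
    for lo, hi, count in buckets:
        overlap = max(0.0, min(high, hi) - max(low, lo))
        if overlap > 0 and hi > lo:
            est += count * overlap / (hi - lo)
    return est

# Two buckets covering [0, 10) and [10, 20): a query on [5, 15) is estimated
# to match about 75 tuples.
print(estimate_range_count([(0, 10, 100), (10, 20, 50)], 5, 15))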
Abstract:With the wide application of peer-to-peer (P2P) technologies in many fields such as E-commerce, it is increasingly necessary to perform aggregation queries in P2P networks. However, due to the large scale and decentralization of P2P networks, it is rather difficult to perform this kind of operation. Aggregation queries become even more difficult when the data in P2P networks are time-varying, which often occurs in practice. The existing aggregation methods for data in P2P networks all assume that the data are time-invariant. If these methods are directly applied to P2P networks with time-varying data, problems arise because the data used in aggregation processing may have changed during the long aggregation process. Therefore, this paper proposes an approximate aggregation method for time-varying data in P2P networks based on uniform sampling. Theoretical analysis and experimental results show that this aggregation method outperforms the existing methods and can be effectively applied to P2P networks with time-varying data.
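The core scale-up step of uniform-sampling aggregation can be sketched as follows; the interface is an assumption of this illustration, and the paper's treatment of time-varying data and of sampling over the P2P overlay is not shown. The sample mean estimates the network-wide mean, and multiplying it by the number of peers estimates the network-wide sum.

def estimate_sum_and_avg(sample, network_size):
    # Scale up from a uniform sample of peer values: the sample mean
    # estimates the network-wide mean, and avg * network_size estimates
    # the network-wide sum.
    avg = sum(sample) / len(sample)
    return {"avg": avg, "sum": avg * network_size}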
Abstract:This paper proposes an adaptive structural index, AS-Index (adaptive structural index), which avoids the problems of existing indexes. AS-Index is based on F&B-Index and consists of F&B-Index, Query-Table and Part-Table. Frequent queries are kept in Query-Table, avoiding redundant operations in query processing. Based on Query-Table, an efficient bottom-up query processing method is also proposed for answering infrequent queries using the frequent queries in Query-Table. Part-Table is used for optimizing queries with descendant edges. The existing adaptive structural indexes need to traverse the whole document for adaptation, and their adaptation granularity is the XML element node. For AS-Index, the adaptation granularity is the F&B-Index node, which includes a set of XML element nodes, and its adaptation is an efficient and incremental process that supports branch queries. The experimental results demonstrate that this index significantly outperforms previous structural indexes in terms of query processing and adaptation efficiency. For large XML documents, compared with the existing adaptive structural indexes, AS-Index is more scalable.
Abstract:Skyline query processing has recently received a lot of attention in the database community. Lately, Akrivi Vlachou and Christos Doulkeridis considered how to efficiently process subspace skyline queries in peer-to-peer networks and were the first to propose the concept of the "extended skyline set" to reduce the volume of data transferred in the preprocessing phase. However, experimental evaluation shows that this data structure is extremely limited in reducing the volume of data transferred in the preprocessing phase. Motivated by these facts, this paper proposes an efficient algorithm, TPAOSS (three-phase algorithm for optimizing skyline scalar), to reduce the volume of data transferred in the preprocessing phase. The TPAOSS algorithm is based on the semantic relationship between full-space skylines and subspace skylines, and transfers the data in three phases. In the first phase, it sends only full-space skylines. In the second phase, it receives seed skylines. In the third phase, it exploits Bloom filter technology to obtain and send the objects that are duplicates of seed skylines on the subspaces. In particular, the paper presents two efficient strategies to reduce the volume of data transferred in the second phase. Furthermore, it presents detailed theoretical analyses and extensive experiments that demonstrate that the proposed algorithms are both efficient and effective.
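For readers unfamiliar with subspace skylines, the minimal sketch below computes a skyline on an arbitrary subset of dimensions (smaller values preferred). In the example, a point that belongs to the full-space skyline drops out of the subspace skyline, which is the kind of effect the extended skyline set and the three-phase transfer are designed to handle. The naive O(n^2) routine is purely illustrative and is not the TPAOSS algorithm.

def dominates(a, b, dims):
    # a dominates b on the given dimensions (minimisation) if it is no worse
    # everywhere and strictly better somewhere.
    return all(a[d] <= b[d] for d in dims) and any(a[d] < b[d] for d in dims)

def skyline(points, dims):
    # Naive quadratic skyline over the chosen subset of dimensions.
    return [p for i, p in enumerate(points)
            if not any(dominates(q, p, dims)
                       for j, q in enumerate(points) if j != i)]

pts = [(1, 5, 3), (2, 2, 2), (3, 1, 4)]
print(skyline(pts, range(3)))   # full-space skyline: all three points
print(skyline(pts, [0, 2]))     # subspace skyline: (3, 1, 4) is dominated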
Abstract:To reduce system delay and overall communication load, this paper proposes an efficient algorithm, BOCS (based on the change of skyline), built on the share-nothing strategy. BOCS solves this issue through progressive refinement in two steps. Analytical results show that BOCS is optimal in terms of communication overhead among all algorithms based on the share-nothing strategy. Theoretical analysis and extensive experiments demonstrate that the method is efficient, stable and scalable.
Abstract:An algorithm of matrix-weighted association rule mining for query expansion is presented based on quadruple pruning, and a related theorem and its proof are given. This method can tremendously enhance the mining efficiency. Experimental results demonstrate that its mining time is reduced by 87.84% on average compared with that of the original algorithm. A query expansion algorithm based on pseudo relevance feedback and matrix-weighted association rule mining is then proposed, which combines the association rule mining technique with query expansion. The algorithm automatically mines the matrix-weighted association rules related to the original query from the top-ranked retrieved documents to construct an association rule database, and extracts expansion terms related to the original query from the database for query expansion. At the same time, a new method for computing the weights of expansion terms is given, which makes the weight of an expansion term more reasonable. Experimental results show that this method is better than traditional ones in average precision.
Abstract:This paper proposes a moving objects database model, the dynamic transportation network based moving objects database (DTNMOD), together with a real-time traffic flow statistical analysis method based on moving object trajectories. In DTNMOD, the underlying transportation network is modeled as a dynamic graph so that the state, topology, and traffic parameters of the transportation graph at any time instant can be tracked and queried. Moving objects are modeled as moving graph points which move only within predefined transportation networks. The data model is given as a collection of data types and operations which can be plugged as attribute types into a DBMS to obtain a complete data model and query language. To evaluate the performance of the proposed model, this paper implements a prototype system based on PostgreSQL and conducts a series of experiments. The experimental results show that DTNMOD provides good performance in processing region and join queries.
Abstract:Saving energy to prolong network life is a major challenge in WSN (wireless sensor network) research. In-network queries can reduce the number or size of packets by processing data at intermediate nodes, thus saving energy. Existing aggregation algorithms assume that all sampled data are correct, while existing outlier detection algorithms regard the detection rate as the primary objective and do not consider energy consumption and query characteristics, so a simple combination of the two cannot achieve good performance. By analyzing the influence of faulty and outlier readings on aggregation results, this paper puts forward a robust aggregation algorithm, RAA (robust aggregation algorithm). RAA improves traditional aggregation queries by using a reading vector to judge whether a faulty reading or an outlier has occurred. RAA deletes faulty readings, aggregates normal readings and reports outliers. Thus, users can clearly know the network condition. Finally, this paper compares RAA with TAGVoting, which uses the tiny aggregation algorithm to complete aggregation and the Voting algorithm to perform outlier detection at the same time. Experimental results show that RAA outperforms TAGVoting in terms of both energy consumption and detection rate.
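The abstract does not specify how the reading vector separates faulty readings from outliers, so the sketch below uses a plausibility range for faulty readings and a median-based rule for outliers as stand-ins. It only illustrates RAA's overall behaviour of dropping faulty readings, reporting outliers, and aggregating the normal readings.

import statistics

def robust_aggregate(readings, fault_range=(0.0, 100.0), outlier_k=3.0):
    # Stand-in for the per-node decision: readings outside the physically
    # plausible range are treated as faulty and dropped, readings far from
    # the median are flagged as outliers to report, and only the remaining
    # normal readings are aggregated.
    valid = [r for r in readings if fault_range[0] <= r <= fault_range[1]]
    med = statistics.median(valid)
    mad = statistics.median(abs(r - med) for r in valid) or 1.0
    outliers = [r for r in valid if abs(r - med) > outlier_k * mad]
    normal = [r for r in valid if abs(r - med) <= outlier_k * mad]
    return {"avg": sum(normal) / len(normal), "outliers": outliers}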
Abstract:The collision rate increases significantly with the appearance of hidden nodes in multi-hop wireless networks, which results in unsuccessful transmissions and leads to performance degradation of the network. The RTS/CTS handshake of 802.11 DCF cannot eliminate hidden nodes, and the situation becomes worse when the nodes are equipped with higher-speed wireless devices, which require a higher signal to interference and noise ratio (SINR) for successful reception. This paper analyzes the prevalence of hidden nodes in multi-hop wireless networks and proposes a MAC protocol, the double-channel collision-free media access protocol (DCCFMA), to solve the hidden node problem. DCCFMA is a dual-channel MAC (media access control) protocol in which the receiver adjusts the transmitting power of the control channel according to the signal power received from the transmitter, so as to cover all hidden nodes around the receiver. Simulation results show that DCCFMA solves the hidden node problem better than the RTS/CTS (ready-to-send/clear-to-send) handshake and achieves 24% higher network throughput than 802.11 DCF.
Abstract:The reputation model is one of the most important methods for building trust between peers in peer-to-peer systems. However, almost all reputation models for P2P applications are purely decentralized. They have many defects, such as slow convergence of node trust, complicated trust management and overwhelming network cost. To solve these problems, TLT (two-level trust), a two-level trust model, is proposed in this paper. In TLT, a series of trust clusters, which are the minimum units of trust evaluation, are formed spontaneously. Every trust cluster includes some member nodes and a cluster header, and there is a mutual trust relationship between the cluster header and the member nodes. On one hand, in order to increase inter-cluster service trust, the cluster header checks the service performance of members and eliminates malicious members by using the concept of intra-cluster service trust; on the other hand, member nodes, aiming to heighten the service reputation and receive good-quality services, examine the management capability of the cluster header and isolate malicious cluster headers by employing the concept of proxy trust. Analyses and simulations show that malicious behaviors can be quickly identified in TLT because of the fast convergence of the trust value, and that TLT is scalable because of its simple trust management and small network overhead.
Abstract:A study on the effect of buffer control on the user-level QoS of media flows is presented. In multimedia systems, a playout buffer at the destination site is often adopted to compensate for delay jitter and improve the continuity of information playback. Buffer control can reduce the effect of delay jitter but increases the end-to-end delay. As delay and delay jitter are both user-perceived QoS parameters, how does buffer control affect user-level QoS? By utilizing previous results on QoS mapping from the application level to the user level, and by investigating the relation between the buffer control parameter, end-to-end-level QoS parameters and application-level QoS parameters, the relationship between the buffer control parameter and the user-level QoS parameter is obtained. The effect of buffer control on user-level QoS is thus studied in depth through theoretical analysis. The buffer size providing a deterministic delay and delay jitter guarantee is found, and the buffer size providing the optimal user-level QoS for a given network condition is demonstrated. Experimental results validate the analysis.
Abstract:A real-time routing design based on two-hop neighborhood information is proposed for wireless sensor networks. The approach of mapping a packet deadline to a velocity, first adopted in SPEED routing, is followed; however, here the routing decision is made based on the two-hop velocity. An efficient probabilistic drop is used to enhance efficiency while reducing the deadline miss ratio when no forwarding candidate can provide the required velocity. If the deadline is not stringent, a cost function is embedded that relieves the nodes frequently chosen as forwarders, improving the balance of energy consumption across the network. The true characteristics of the physical and MAC layers are captured in the simulation, and a realistic lossy link model is drawn from experiments with Mica2 motes. Simulation results show that, compared with the existing SPEED-S, which only utilizes one-hop information, the proposed routing scheme has a lower deadline miss ratio and higher energy efficiency, and it does not degrade the delay performance in general. The proposed design can be applied in real-time sensor network applications that are more demanding in service quality.
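A rough sketch of the two-hop-velocity forwarding decision is given below; the candidate-table interface and the delay estimates are assumptions of this illustration, not the paper's exact design, and the probabilistic drop and cost function are omitted. A neighbour is selected only if the best two-hop progress it offers, divided by the estimated two-hop delay, meets the velocity required by the packet deadline.

import math

def pick_forwarder(my_pos, sink_pos, deadline, candidates):
    # candidates: list of (neighbor_pos, best_two_hop_pos, two_hop_delay),
    # where best_two_hop_pos is the best position reachable through that
    # neighbour within two hops and two_hop_delay is the estimated delay.
    need = math.dist(my_pos, sink_pos) / deadline   # required velocity
    best, best_v = None, need
    for pos, two_hop_pos, delay in candidates:
        progress = math.dist(my_pos, sink_pos) - math.dist(two_hop_pos, sink_pos)
        v = progress / delay                        # two-hop velocity
        if v >= best_v:
            best, best_v = pos, v
    return best   # None: no candidate is fast enough, probabilistic drop applies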
Abstract:Topology mismatch between the overlay network and the physical network is a main factor affecting the routing performance of structured P2P networks. A scheme for detecting and decreasing overlapped links (DDL) is proposed. It examines the overlapped physical links caused by overlay routing and, under appropriate conditions, sends redirect messages to decrease the number of physical links crossed. DDL solves the topology mismatch problem at the physical network level, and it can be used in any structured P2P network without being limited by the overlay structure. According to different definitions of overlapped links, backward and forward DDL schemes are described in detail. Performance analysis and simulation show that the DDL scheme can dramatically improve topology consistency.
Abstract:Dynamic policy support and authorization granularity are two key issues in access control. Previous research only compared the expressiveness of policies, but never considered a policy's structure or the granularity of authorization, which makes it difficult to support dynamic policies and satisfy the least-privilege requirement. This paper points out that Lampson's access matrix is the most fine-grained access control model, and that other security policies group the access matrix according to their different application requirements. By defining a descriptive framework of Groupability Based on Security Labels (GroSeLa), generic security policies can be mapped onto Lampson's access matrix. The GroSeLa framework consists of a set of fundamental components and an extension. The fundamental components give the policy structure for grouping the matrix, and the extension reveals the administrative requirements necessary for fully supporting dynamic policies. Based on GroSeLa, this paper proposes five dimensions for evaluating security policies: grouping factors, dynamic factors, policy scale, authorization granularity and separation-of-duty support. The paper also compares four classic security policies, namely ACL (access control list), BLP (Bell-LaPadula), DTE (domain and type enforcement) and RBAC (role-based access control). To the best of our knowledge, this is the first study of the differences in expressiveness, usability and authorization granularity among security policies from the perspective of grouping the access matrix.
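The notion of grouping the access matrix can be made concrete with a small example: an RBAC policy can be read as a grouping of Lampson's matrix in which subjects are grouped into roles, and every cell of the matrix is recovered as the union of the permissions of the subject's roles. The sketch below is only an illustration of that reading, with made-up subjects and objects; it is not the GroSeLa framework itself.

# Lampson's access matrix: one cell of rights per (subject, object) pair.
matrix = {("alice", "payroll.db"): {"read"},
          ("bob", "payroll.db"): {"read", "write"}}

# An RBAC grouping of the same matrix: permissions attach to roles, and
# subjects are grouped into roles.
roles = {"auditor": {("payroll.db", "read")},
         "accountant": {("payroll.db", "read"), ("payroll.db", "write")}}
user_roles = {"alice": {"auditor"}, "bob": {"accountant"}}

def rbac_cell(user, obj):
    # Reconstruct the access-matrix cell implied by the role grouping.
    return {right for role in user_roles.get(user, set())
                  for o, right in roles[role] if o == obj}

assert rbac_cell("alice", "payroll.db") == matrix[("alice", "payroll.db")]
assert rbac_cell("bob", "payroll.db") == matrix[("bob", "payroll.db")]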
Abstract:There exist many randomness test methods, and most of them have parameters. As it is not practical to run all randomness tests in practice, it is important to study the relations among these methods. In this paper, four kinds of relations between randomness tests and the concept of the "correlation degree of randomness tests" are first defined based on statistical theory, and the correlation degree is measured by means of an entropy method. Then, the relevancies between the four relations and the correlation degree are proved, and an algorithm for calculating the correlation degree as well as a selection policy are provided. The work of this paper is helpful for selecting reasonable and scientific randomness tests and parameters. In addition, the randomness test methods adopted by NIST (National Institute of Standards and Technology) for AES (advanced encryption standard) are explored by using the correlation degree, and some dependence relations are found.
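The abstract does not give the exact entropy measure, but one standard entropy-based way to quantify the dependence between two randomness tests is the empirical mutual information of their pass/fail outcomes over the same sample of sequences, sketched below. The function name and interface are illustrative only.

import math
from collections import Counter

def mutual_information(xs, ys):
    # Empirical mutual information (in bits) between the outcomes of two
    # randomness tests applied to the same sequences; 0 means the observed
    # outcomes look independent, larger values mean stronger correlation.
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())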
Abstract:In the current group signature schemes with backward unlinkability and verifier-local revocation (BU-VLR), the size of the public key is linear in the total number of time intervals, and the size of the revocation list (RL) is linear in the total number of time intervals and revoked members. Therefore, the cost is high not only in memory space but also in revocation token computation and revocation checking. This paper proposes a BU-VLR group signature scheme under the DTDH assumption and the q-SDH assumption, which has a short public key and a short RL, reducing the overheads of previous schemes. Moreover, it also has a shorter signature length and less computation in signing.
Abstract:With the advance of computer hardware and software techniques and the continuous growth of application requirements, computing systems, which have computers at their center, have been applied to an increasingly broad scope, while their complexity is also growing quickly. The demand for evaluating and improving the dependability of computing systems is more and more urgent. This paper gives a definition of the dependability of computing systems, and a series of quantitative indicators are presented to evaluate dependability. At the same time, the threats to system dependability are classified and analyzed in detail. Since the traditional methods are incapable of dealing with the diverse dependability problems faced by increasingly complex systems, people are constantly searching for new techniques. In this context, the virtualization technique has come to its renaissance and is rapidly becoming a major research focus in recent years. In this paper, the existing research results on applying virtualization to enhance the dependability of computing systems are summarized, and the main characteristics and mechanisms of virtualization for enhancing dependability are introduced. However, due to the restrictions of existing computing system architectures, the advantages of virtualization cannot be fully exploited. Service-oriented architecture (SOA) meets the requirements of virtualization well owing to its loose coupling and platform independence. Therefore, in the last part of this paper, a framework that integrates SOA and virtualization, called service-oriented virtualization (SOV), is proposed to enhance the dependability of computing systems. This paper analyzes how this framework can enhance system dependability through virtualization mechanisms and architectural advantages when faced with many kinds of dependability threats.