CHEN Ning , CHEN An , ZHOU Long-xiang
Abstract:Clustering of transactions can find potential useful patterns to improve the product profit. In this paper, a two-step clustering algorithm——CATD is proposed, applicable in large transaction databases. First, the database is divided into partitions in which transactions are partially clustered into a number of subclusters. A hierarchical clustering algorithm is used to control the distance between these subclusters. In the global clustering, a k-medoids clustering algorithm is performed on the subclusters to get a set of k global clusters and identify noise. The algorithm is feasible for large databases because it only scans the original databases once and the clustering process can be performed in main memory due to the partitioning scheme and the support vector representative of subclusters.
TAN Yu-song , YANG Li , ZHOU Xing-ming
Abstract:This paper describes a new multidimensional data-space placement algorithm, SMDPA. The algorithm can place the data hyper-cubes efficiently even they have different accessed frequencies. The algorithm uses the data hyper-cubes' prior accessed frequency and their similarities to get the accessed relation between them. The simulation results make clear that the SMDPA algorithm has better performance than traditional algorithms.
Abstract:The inductive process is used to specify the evolution of algorithm in this paper. The relationship between a set of sentences of first-order language and an algorithm is established and inductive rules for heuristics are presented. A probabilistic approach to algorithm analysis is also presented. This approach provides a tool for design of efficient algorithm and automatic algorithm design.
WANG Xu , HUANG Tao , FENG Yu-lin
Abstract:This paper presents a small language for distributed computation with agent mobility——Scope language. The language is different from most existing work on distributed and mobile computation, which usually take some variants of π-calculus as their basis. The core of Scope language is a specially designed λ-like calculus with resources. It enables Scope language to directly model memory-like resources, instead of indirectly using process/channel as in π-calculus. Furthermore, Scope language gives a novel treatment to the notion of location, which is called Scope here. Scope and memory-like resources combined make Scope language complementary to most other work, and provide an alternative approach to modeling distributed and mobile systems, which feature the simplicity of implementation and the affinity with the programming model inherent in realistic language such as Obliq and Telescript.
Abstract:In this paper, two language characterizations for weak liveness (free-deadlock) and liveness of Petri nets are given. Some language properties of synchronous composed Petri nets are discussed. Based on Petri net language, a necessary and sufficient condition is given for live Petri net (bounded), and then the liveness presevation in a synchronous composed net is studied and a necessary and sufficient condition is obtained. Those results give a formal language method for net liveness testing and liveness controlling.
CAO Xian-bin , LI Jin-long , WANG Xu-fa
Abstract:A multiobjective optimization using genetic algorithm based on ecological cooperation (ECGA) is proposed in this paper after analyzing the present studies. In the algorithm, an ecological population density competition equation is used for reference to describe the relation between multiple objectives and to direct the adjustment over the relation at individual and population levels. Simulation experiments prove that the algorithm has a better performance in finding the Pareto solutions.
Abstract:Broadcast is a common operation in mobile ad hoc networks (MANETs). Many on-demand ad hoc routing protocols resort to it to discover the route between any two nodes. It is also an important means to disseminate information in many MANET applications. An intuitive way for broadcast is flooding. However, without well-designed control mechanisms, flooding will lead to serious message redundancy, contention and collision. This paper proposes an efficient broadcast scheme based on the concept of connected dominating set (CDS) in graph theory. The proposed scheme can reduce message redundancy significantly, while retaining the merits of flooding. Simulation results show that the proposed scheme outperforms a distributed CDS-based algorithm and a cluster-based approach.
Abstract:A new method for image-based rendering (IBR): rendering using plane slit image field is discussed in this paper. Slit image field is a set of slit images that can represent and generate all scenes in a given region. A concept of analogical slit image is presented. By using it, a method to reconstruct the slit image field as well as work-through scenes from limited samplings is presented. A slit image segment mapping technology to simplify the reconstruct process and reduce data is presented too.
Abstract:Software pipelining is an effective approach of loop scheduling. Pipelining of loops with conditional branches remains a challenge. Current algorithms are divided into four classes: loop linearizaton, path splitting, as-a-whole scheduling and path selecting. All of them fail to solve concurrently two conflicting problems: transfer time minimization and worst-case constraints. In this paper, a novel software pipelining framework is presented to do so. The key ideas are: (1) Path grouping, which splits paths into distinct groups based on their execution and transfer probability, so as to minimize transfer time; (2) Data dependence relaxation. Some data dependences may not have instances during execution when the loop has multiple paths. The ideal policy is to obey them only when they have instances. Thus the worst-case constraints are avoided. Preliminary experiments and qualitative analysis all suggest that the temporal benefit of the proposed approach is better than that of path splitting and as-a-whole scheduling.
CAO Xian-bin , LUO Wen-jian , WANG Xu-fa
Abstract:Individual evolution is based on its fitness in Genetic Algorithm, but its living environment and relationship with other parts aren't involved. In this paper, enlightened by the knowledge of ecological environment and population competition, a new co-evolution model is proposed, which is based on Ecological Population Competition Model. The experiment results show the high efficiency of the improved Genetic Algorithms based on this model in solving premature convergence and accelerating the convergence.
LIU Zhen-ying , FANG Bin-xing , HU Ming-zeng , ZHANG Yi
Abstract:Dynamic load balancing is an important factor to determine the performance of a NOWs (network of workstations). First, it is analyzed that the load movement is the reason for overhead. Furthermore, a formula of data movement granularity is proposed. Besides, a method to evaluate the benefits of load balancing is presented in order to balance only when it is profitable. Meanwhile this paper puts forward a load balancing algorithm. Finally, the execution result of the proposed method is compared with that of others and that without dynamic load balancing by experiments. The load balancing method in this paper outperforms that of Siegell's in the cases of idle workloads, different workload and data scales of applications.
SHAN Shi-guang , AO Wen , CHEN Xi-lin
Abstract:Facial feature extraction is an important aspect in facial image perception system. And it is also a prerequisite in animation system for generating a given person's 3D-face image. In this paper, a coarse-to-fine facial feature extraction strategy is presented based on facial texture distribution and deformable template, using the pre-result of a multi-level face detection, which aims at solving such problems as the searching highly depending on the initial parameters and time-consuming that deformable template algorithm often suffers from. In proposed strategy, firstly, the center of the two irises is localized making use of the valley and frequency characteristics in the two eye regions. Then integral projection is used to localize the coarse position of the mouth and the nose. Secondly, some key feature points about these organs are estimated. Finally, according to these feature points, good initial parameters for the pre-defined templates are given and an optimal algorithm based on greedy algorithm and multi-epoch cycle is used to search for the minimum solution. Experiments indicate that the implementation of the proposed strategy is with good performance in both speed and accuracy.
Abstract:In this paper, the geometric meaning of the transitive closure of a fuzzy relation in corresponding fuzzy graph is first given. An optimal algorithm, which is based on the computation of graph connected components, for fuzzy classification problem is proposed. For any given n samples, the worst case time complexity T(n) of the algorithm satisfies that O(n)≤T(n)≤O(n2). Compared with the classic fuzzy classification algorithm, which is based on the computation of the transitive closure of a given relative matrix and of the O(n3log n) time, the new algorithm decreases O(nlog n) time factor at least. The theoretic analysis and computer performance show that the real computing time of the new algorithm is acceptable when it is used for fuzzy classification on large data.
GUO Jian-sheng , ZHAO Yi , SHI Peng-fei
Abstract:Conceptual clustering analysis is suitable to discover the knowledge in database with incomplete or absent domain background information. It is difficult for original conceptual clustering method to deal with the data objects described by numerical attribute values. A new criterion function based on semantic distance is proposed in this paper, and a novel domain-based dynamic conceptual clustering algorithm (DDCA) is also presented. With the discretization of the continuous attribute values, it works well on the datasets that are described by mixed numerical attributes and categorical attributes. The algorithm automatically determines the number of clusters, modifies the demoid according to the frequency of the attribute values within each cluster and gives out the interpretations of the clustering with the conceptual complex expression. The experiments demonstrate that the semantic-based criterion function and the dynamic conceptual clustering algorithm are effective and efficient.
MO Can-lin , TAN Jian-rong , ZHANG Shu-you
Abstract:In this paper, a new method of generating fractal surface is presented by combining adaptive linear neural networks with the principle of generating free surface. The mathematical model of various fractal methods on free surface and the methods of controlling and adjusting the fractal surface shape according to different parameters of neural networks are put forward. Thus, the predictable, controllable and adjustable fractal surface can be generated with changing linear combination relation of all control points of the definite free surface.
WANG Shi , GAO Wen , HUANG Tie-jun , MA Ji-yong , LI Jin-tao
Abstract:There is a problem in online retail: the conflict between the different interests of all customers to different commodities and the commodity classification structure of Web site. This problem will make most customers access overabundant Web pages. To solve the problem, the Web page data, server data, and marketing data are mined to build a hidden Markov model. The authors use association rule discovery to get the large item set. Viterbi algorithm is used to find some paths that come from the root Web page to the Web page that the center of the large item set is in. This large item set is marked in the nodes that are in the paths. Through these steps, one can calculate all item sets and mark them in these paths. The large item sets will compete in the nodes for the limited space. Through this method the Web site will adjust itself to reduce the total access time of all users. This method can also be used in analysis of paths, advertisements, and reconstructing the Web site.
LU Jian-jiang , SONG Zi-lin , QIAN Zu-ping
Abstract:The issue of quantitative association rules in large databases is discussed in this paper. In order to soften partition boundary of the domain, the relational fuzzy c-means algorithm is adopted to determine two parameters of normal fuzzy numbers, then the normal fuzzy number model is adopted to partition the domain of the quantitative attributes and a series of linguistic value association rules are generated. The mining method of the linguistic value association rules is also provided. Because the abstract concepts can be well expressed with the linguistic values, the mined association rules are more abstract and easy to understand.
OU-YANG Wei-min , HENG Cheng , AI Qing-sheng
Abstract:Discovery of association rules is a very hot topic in data mining research, which has been found applicable and useful in many areas. In the current researches, all the items in a databases are treated in a uniform way. However, it is not true in the real world databases, in which different items usually have different importances. In order to represent the importance of individual items, the weight value for items is introduced, and a new problem of discovery of weighted association rules is put forward. Due to the introduction of weight for items, it is not sure that any subset of a frequent itemset is also frequent. Thus, a concept of k-support bound of itemsets is set forth, and an algorithm to discover weighted association rules is proposed.
WANG Ji-cheng , YANG Xiao-jiang , PAN Jin-gui , ZHANG Fu-yan
Abstract:A mass of heterogeneous, distributed and dynamic information on the Web has resulted in “information overload”. It's an important and urgent research issue to provide users with effective information retrieval service on the Web. Web search engines attempt to solve this problem, yet their effect is far from satisfying. In this paper, a distributed and cooperative strategy for Web information retrieval is proposed to substitute the centralized mode adopted by the current search engines. Then a new information retrieval system model (IRSM) is presented, which supports the retrieval of metadata about Web documents and uses Z39.50 standard protocol to unify the heterogeneous interfaces of different systems. Based on them, a distributed and cooperative information retrieval framework (DCIRF) is designed to help users search the Web effectively.
LIN Dan , LI Min-qiang , KOU Ji-song
Abstract:In trying to solve constrained optimization problems using genetic algorithms, the method to handle the constraints is the key factor to success. In this paper, some features of GA (genetic algorithms) and a large class of constrained optimization problems are taken into account and a new method called Fixed Proportion and Direct Comparison (FPDC) is proposed, which combines direct comparison method and the strategy to keep a fixed proportion of infeasible individuals. It has been successfully integrated with the ordinary GA. Numerical results show that it is a general, effective and robust method.