HE Xiao , MA Zhi-Yi , SHAO Wei-Zhong
Abstract:For graphical modeling languages, there are three problems on the notation definition: How to define graphical symbols for modeling elements; How to define the location relations between symbols; How to map the symbols and the location relations to the abstract syntax. For model transformation and code generation, the notation has to be represented as models. This paper proposes the notation definition metamodel (NDM) for metamodeling tools by summarizing and analyzing the notation of UML and UML family. For the three problems on notation definition, NDM is composed of three parts: basic figures and layouts, location relations and abstract syntax bridges. The notation model defined by NDM can be transformed to usable source codes. This paper also makes a comparison between NDM and other methods, and the results show that NDM have some advantages over other methods. NDM has been implemented in PKU MetaModeler, and some practices of NDM are introduced.
MA Xiao-Xing , CAO Chun , YU Ping , ZHOU Yu
Abstract:In this paper, software architectures and architecture styles are modeled with attributed typed graphs and graph grammars respectively. Accordingly, dynamic reconfigurations of software architectures are modeled with graph transformations. Based on such a modeling, a supporting environment is constructed twofold. Firstly, the visual manipulation of the graphical representation of software architectures is supported with a graph grammar- enabled editor. Secondly, the graphical architecture model is reified as a runtime software architecture object built into the physical running system, through which graph transformations of the architecture model is then naturally reflected as dynamic reconfigurations of the running system.
ZENG Xiao-Qin , HAN Xiu-Qing , ZOU Yang
Abstract:This paper proposes an edge-based context-sensitive graph grammar formalism with a concentration on solving the main graph grammar problem?embedding problem, and discusses the features of the proposed graph grammar and its parsing algorithm. Some comparisons of the proposed graph grammars with other existing grammars are given. Further researches on graph grammars are also reviewed.
Abstract:Visual language techniques have exhibited more advantages in describing various software artifacts than one-dimensional textual languages during software development, ranging from the requirement analysis and design to testing and maintenance, as diagrammatic and graphical notations have been well applied in modeling system. In addition to an intuitive appearance, graph grammars provide a well-established foundation for defining visual languages with the power of precise modeling and verification on computers. This paper discusses the issues and techniques for a formal foundation of visual languages, reviews related practical graphical environments, presents a spatial graph grammar formalism, and applies the spatial graph grammar to defining behavioral semantics of UML diagrams and developing a style-driven framework for software architecture design.
Abstract:Marching-Graph is a new visualization that integrates the graph metaphor and the spatial metaphor into a single visualization. It provides users with highly interactive maps for accessing the logical structures of information that has the geographical attributes. Instead of presenting known facts onto maps, it provides a mechanism for users to visually analyze and seek unknown knowledge through effective human-map interaction and navigation across different spaces. However, the traditional force-directed layout algorithms are very slow in reaching an equilibrium configuration of forces. They usually spend tens of seconds making the layout of a graph converge. Thus, those force-directed layout algorithms can not satisfy the requirement for drawing a sequence of graphs rapidly, while the users are quickly marching through the geographic regions. This paper proposes a fast convergence layout method that speeds up the interaction time while users are progressively exploring a sequence of graphs through a series of force-directed layouts in Marching-Graph. It essentially combines a radial tree drawing method and a force-directed graph drawing method to achieve the fast convergence of energy minimization.
HUANG Mao-Lin , NGUYEN Quang Vinh
Abstract:This paper proposes a new technique for visualizing large graphs of several ten thousands of vertices and edges. To achieve a graph abstraction, a hierarchical clustered graph is extracted from a general large graph based on the community structures discovered in the graph. An enclosure geometrical partitioning algorithm is then applied to achieving the space optimization. For graph drawing, it uses a combination of spring-embbeder and circular drawing algorithms that archives the goal of optimization of display space and aesthetical niceness. The paper also discusses an interaction mechanism accompanied with the layout solution. The interaction not only allows users to navigate hierarchically through the entire clustered graph, but also provides a way to navigate multiple clusters concurrently. Animation is also implemented to preserve user mental maps during the interaction.
REN Lei , WANG Wei-Xin , ZHOU Ming-Jun , TENG Dong-Xing , MA Cui-Xia , DAI Guo-Zhong , WANG Hong-An
Abstract:Wide-Spread deployment for interactive information visualization is difficult. Non-Specialist users need a general development method and a toolkit to support the generic data structures suited to tree, network and multi-dimensional data, special visualization techniques and interaction techniques, and well-known generic information tasks. This paper presents a model driven development method for interactive information visualization. First, an interactive information visualization interface model (IIVM) is proposed. Then, the development method for interactive information visualization based on IIVM is presented. The Daisy toolkit is introduced, which includes Daisy model builder, Daisy IIV generator and runtime framework with Daisy library. Finally, an application example is given. Experimental results show that Daisy can provide a general solution for development for interactive information visualization.
Abstract:Traditional visual data mining relies on visualization techniques to disclose implicit information and relationship among data through utilizing human capability of pattern recognition. As an important step in data clustering, noise removal is a challenging topic as domain-specific noise is not well defined and cannot be removed by generic process of data cleaning. This paper addresses two conjugated and reciprocal issues in the use of visualization in noise removal? choosing appropriate visualization techniques based on data removing methods, and designing processing algorithms that suit visualization. The goal is a synthesis of visualization techniques and data mining methods to enhance the overall performance while reducing the subjective factor in visual mining procedure. A visual data cleaning approach called CLEAN is proposed to assist spatial data clustering in four important aspects: removal of domain-specific noise, visualization of data quality, selection of algorithm parameters, and measurement of noise removing methods on parameter sensitiveness. Experiments show that the visualization models in CLEAN do assist effective discovery of natural spatial clusters in a noisy environment.
YANG Yu-Bin , LI Ning , ZHANG Yao
Abstract:Studies in social network theory focus on characterizing complex social relationships by firstly mapping and visualizing them into a graph, and then subsequently identifying the corresponding graph properties. This paper provides an integrated approach, which combines social network analysis and data mining theory with the necessary geographical attributes to analyze 1417 instances of terrorism that occurred world wide during the period 1980-2002. The study reveals interesting patterns on the evolution of these terrorist organizations over two decades. The proposed method can be easily generalized to be applied to other types of large-scale networked datasets, such as micro-array data, and genomic networked data, etc.
XIAO Jun , ZHUANG Yue-Ting , WU Fei
Abstract:In order to get primitive actions for animation production, this paper proposes a visualization and interactive segmentation method for 3D human motion sequences. Firstly, a new feature based on bone angles of human skeleton is used to represent the original motion data. Then a heuristic method is introduced to detect the candidate segmentation points on the feature sequence. Finally, a graphical user interface is created to visualize the candidate segmentation points and pick the precise segmentation points interactively. Experimental results show that this method can divide motion sequence into primitive actions more conveniently, efficiently and precisely.
ZHU Liang-Feng , PAN Xin , WU Xin-Cai , LIU Xiu-Guo
Abstract:A general principle, a practical method and a modeling process of 3D visual model for complex geological entities contained faults are discussed. A kind of 3D vector data model, which is based on boundary representation, suited to geological objects and their topological relationships, is developed in order to organize and describe the 3D structure model of complex geologic body. A modeling approach called the unified modeling technique for stratum and fault is presented, and some actualizing techniques are discussed. To deduce and simulate the fault plane, three methods are presented based on the basic properties, characters and qualities of geological faults reflect on fault data. For the construction of stratum surface, the restriction and the shielding action of fault plane to the interpolation process of stratum surface are considered, and the mesh generation method for multivalued surface of reverse fault is developed. TRICUT, a method to clip triangle meshes, is used to process surface cutting on stratum surface and fault plane. A concrete example of using those methods to 3D bedrock model in the Beijing Olympic Green District is presented and confirms the practical result of those techniques.
LI Yu-Kun , MENG Xiao-Feng , ZHANG Xiang-Yu
Abstract:This paper introduces the concept and characters of dataspace, and presents a framework for dataspace integration and management system. Based on the framework, this paper further summarizes research works on data model, integration, query, update, storage, index, evolution and systems of dataspace. Challenges and future work on dataspace research are analyzed.
WANG Hong-Zhi , LI Jian-Zhong , LUO Ji-Zhou
Abstract:Each element and value in a XML stream can be accessed only one time. In this paper, efficient algorithms are proposed for processing aggregation on XML streams. These algorithms efficiently support the processing of the aggregation queries with complex structures and the aggregation queries on XML stream with recursion structures. Theoretical analysis and experimental results show that the proposed algorithms are able to process aggregation queries effectively and efficiently on XML stream with high scalability.
TIAN Xuan , DU Xiao-Yong , LI Hai-Hua
Abstract:In semantic-based query expansion, computing term-concept association is a key step in finding associated concepts to describe the needed query. A method called K2CM (keyword to concept method) is proposed to compute the term-concept association. In K2CM, the attaching relationship among term, document and concept together with term-concept co-occurrence relationship are introduced to compute term-concept association. The attaching relationship derives from the fact that a term is attached to some concepts in annotated corpus, where a term is in some documents and the documents are labeled with some concepts. For term-concept co-occurrence relationship, it is enhanced by the text distance and the distribution feature of term-concept pair in corpus. Experimental results of semantic-based search on three different corpuses show that compared with classical methods, semantic-based query expansion on the basis of K2CM can improve search effectiveness.
ZHANG Jun-Qi , ZHOU Xiang-Dong , SHI Bai-Le
Abstract:In order to improve the query answering of high-dimensional database, data distribution is necessary to select appropriate indexing strategy. However, traditional data distribution models can not estimate the accurate data distribution in the complex real multimedia data of image and video. This paper presents a method to estimate the accurate data distribution based on query sampling, and proposes a novel hybrid index to speed up processing of high-dimensional K-nearest neighbor (KNN) queries. The proposed hybrid index improves the query efficiency by adaptively selecting different index strategies for the data with different distribution. In the first step, the cluster analysis and cluster splitting methods are applied to construct a tree-based index, and then the relationship between data distribution and index performance is derived by sampling. At last some tree branches with sparse data are extracted for linear scan, while the aggregate data remains in the tree. Extensive experiments on four real image data sets show that the proposed hybrid index structure performs better than iDistance, M-Tree and linear scan, and scales better with dimensions. The index is still faster than linear scan when the dimension reaches 336. The experiments also show that the proposed query sampling algorithm can obtain the accurate data distribution when the amount of sampling is below (N is the size of data set).
WU Shan-Shan , YU Ge , Lü Yan-Fei , GU Yu , LI Xiao-Jing
Abstract:Different from the traditional best-effort query service providing, the issue of deterministic QoS guarantee for data stream processing is discussed. Based on the theory of network calculus, an approach of QoS modeling and QoS guaranteeing for data stream processing is proposed. Before system running, the schedulability of all the queries with their QoS requirements satisfied simultaneously is verified. During run-time, service curves representing respective QoS requirements are allocated to each query admitted by QoS schedulability verification, in order to guarantee the expected QoS requirements. Moreover, QoS-guaranteeing batch scheduling and query sharing are extended to improve the query processing efficiency. Finally, experimental results show that the proposed approach offers deterministic QoS guarantee to continuous queries over data streams efficiently.
YANG Wei-Dong , WANG Qing-Ming , SHI Bai-Le
Abstract:This paper presents an approach for processing the buffer of XML stream systematically. In this approach, user interests are represented by XQueries, the recursive documents can be tackled, and the multiple queries can be processed simultaneously. This approach can determine the relationship between two nodes in results on the fly based on the binary code through runtime stack, which avoids the join among a large number of results and improves the system performance and memory usage efficiently.
LI Xiao-Long , LIN Ya-Ping , YI Ye-Qing , YU Jian-Ping , LU Xin-Guo
Abstract:Firstly a maximum similarity distribution model is proposed. Secondly, a new definition, named virtual coordinate of a node which composes of the minimum hop counts to several special nodes, is introduced in instead of absolute physical coordinate of a node. Based on the theories, a distributed location-independent node scheduling scheme is proposed. The scheme consists of a coverage algorithm and a connection algorithm. The coverage algorithm takes advantage of the nodes' virtual coordinate to divide all nodes into several subsets. Without using location information, this scheduling scheme not only has the ability that sensor nodes in subsets are more uniformly distributed in the target region than other schemes, but also guarantees that all subsets are connective. The simulation results show that the scheme outperforms the randomized node scheduling scheme, on the coverage rate and network lifetime, as well as the number of external nodes when maintaining subsets to be connective.
CAI Jia-Yong , QING Si-Han , LIU Wei , HE Jian-Bo
Abstract:A scheme on studying the safety issues for privilege in systems is introduced. Since the particular faculty of transiting security states makes analyzing and protecting privilege for a system difficult, techniques used in traditional access control should not copy to this field. For this reason the features are firstly inspected by discussing the origination of privilege using access control space theory. Then rules defined for a system could be divided into two categories: constraint rules and execution rules, describing the restrictions and effect of an authorization respectively. Furthermore, a special authorization relation between different privilege operations, as well as its properties, is investigated against rules' logical patterns by deduction. A quick algorithm for constructing authorization deduction graph is also provided. Basing on it, common safety issue of implicit authorization was reviewed with the possibility to be abused. Finally this paper formalizes the capability mechanism defined by POSIX (portable operating system interface) standard, constructing ADG (authorization deduction graph) for it. The design is revised with countermeasures against privilege abusing so as to preserve consistent with the principle of least privilege.
ZHOU Jing , WANG Yi-Jie , LI Si-Kun
Abstract:The features of simple description, small updates item and weak dependence are the main characteristics of updates of key-attributes in P2P systems. Accordingly, an optimistic data consistency maintenance method based on key-attributes is proposed. In the method, the update of key-attributes is separated from user update requests. Key-Updates are propagated by latency-overlay update propagation model, that is, updates are always propagated to the nodes having maximum or minimum latency, and assured and uncertain propagation paths of updates are all taken into account. Based on classifying key-update conflicts, a double-level reconciling mechanism including buffer preprocessing and update-log processing is applied to detect and reconcile conflicts, and then conflicts are solved by policies as last-writer-win and divide-and-rule. Lastly, update-log management method and maintenance method brought by node failure and network partitioning are discussed for the above is deployed based on the information storied in update-log. Delaying key-attributes updates cannot occur by the optimistic disposal method, and then it cannot depress efficiency of resource location based on key-attributes, which adapts well to P2P systems for Internet. The simulation results show that it is an effective optimistic data consistency maintenance method, achieving good consistency overhead, resource location and resource access overhead, and having strong robustness.
ZHOU Xin-Yun , HUANGFU Wei , SUN Li-Min , WANG Xian-Lei , LI Kai-Hui
Abstract:A model is proposed to analyze the MAC (medium access control) access delay of EDCA (enhanced distribution channel access) mechanism of wireless LANs (local area networks), which is based on the moment generating function and signal flow graph. The model gives the probability distribution function (PDF) of MAC access delay and several relative characteristic numbers, including the mean value, variance, standard deviation and deviation coefficient of MAC access delay. The results of the analytical model coincide with simulations, which validates the model proposed. A lot of analyses are presented. Firstly, the model presents the relationship among number of nodes and sending probability, collision probability, freezing probability and mean delay of all ACs (access categories). Secondly, the model discloses the composition of MAC access delay. Thirdly, the model presents the effect of different values of AIFS (arbitration inter frame spacing) on MAC access delay. Lastly, the model presents that the deviation coefficient of MAC access delay is larger than 1, so when the exponential distribution is adopted as the distribution of MAC access delay, the capacity of the queue system will be evaluated wrongly.
XIAO Li-Zhong , SHAO Zhi-Qing , MA Han-Hua , WANG Xiu-Ying , LIU Gang
Abstract:To address the issue in fuzzy C-means algorithm (FCM) that clustering number has to be pre-defined, a clustering algorithm, F-CMSVM (fuzzy C-means and support vector machine algorithm), is proposed for automatic clustering number determination. Above all, the data set is classified into two clusters by FCM. Then, support vector machine (SVM) with a fuzzy membership function is used to testify whether the data set can be classified further. Finally, the result of clusters can be obtained by repeating the computation process. Because affiliating matrix, obtained by the introduction of SVM into FCM, is defined to be the fuzzy membership function, each different input data sample can have different penalty value, and the separating hyper-plane is optimized. F-CMSVM is an unsupervised algorithm in which it is neither needed to label training data set nor specify clustering number. As shown from our simulation experiment over networks connection records from KDD CUP 1999 data set, F-CMSVM has efficient performance in clustering number optimization and intrusion detection.
HUANG Jian-Bin , JI Hong-Bing , SUN He-Li
Abstract:An improved sequence labeling model named Mixed Skip-Chain Conditional Random Field is presented to solve the problem of schema matching between semi-structured Web records and relational database. The proposed model can be trained on mixed samples set which consists of labeled samples and unlabeled relational database records to reduce the dependence on manually labeled training data. Moreover, it provides a novel way to incorporate the long-distance dependencies between different state variants. Experimental results using a large number of real-world data collected from diverse domains show that the proposed method can improve the performance of schema matching significantly.
Abstract:This paper proposes a multiple-recipient public key encryption, called pairing-based multi-recipient encryption (PBMRE). The proposed scheme is constructed on Weil pairing on elliptic curves and the Shamir's secrets sharing scheme. As a result, a private key for decryption can be converted to multiple users' private keys by secrets sharing, and reconstructed by the bilinear property of Weil Pairing in decryptions. Through an analysis, it is shown that this scheme is efficient and can effectively defend against deciphers' collaborating. Based on the Gap-BDH (gap-bilinear Diffie-Hellman) assumption and the random oracle model, a strict security proof is presented stating that the scheme has the indistinguishability under adaptive chosen ciphertext attack,简称 (IND-CCA2).