ZHANG Lu, MEI Hong, SUN Jia-Su
2006, 17(8):1661-1668.
Abstract:Program clustering for large and complex systems improves the effectiveness and efficiency of software maintenance and is a basis for acquiring reusable components. In this paper, a functional-requirement-based hierarchical agglomerative approach to this problem is proposed. In this approach, the semantic information in the descriptions of functional requirements is exploited to acquire a high-level logical view of the given system. Furthermore, the source code artifacts corresponding to each requirement are identified through dynamic analysis. The requirements hierarchy and the requirement-artifact relationships are then used to recover the hierarchical organization of the source code. The clustering results of this approach can be mapped to the application domain. In addition, thanks to the dynamic analysis, the granularity of this approach is flexible.
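A minimal illustrative sketch of the general idea, not the paper's algorithm: source artifacts are clustered agglomeratively by the requirements that exercise them, with the requirement-to-artifact map assumed to come from dynamic analysis. The names, the Jaccard similarity, and the stopping threshold are all hypothetical.

```python
# Sketch: agglomerative clustering of artifacts driven by requirement coverage.
# artifact_reqs maps each artifact to the set of requirements whose executions
# touched it (assumed to be produced by dynamic analysis).

def jaccard(a, b):
    """Similarity of two requirement sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def agglomerate(artifact_reqs, threshold=0.5):
    """artifact_reqs: {artifact_name: set(requirement_ids)} -> list of clusters."""
    clusters = [({name}, reqs.copy()) for name, reqs in artifact_reqs.items()]
    while len(clusters) > 1:
        # find the most similar pair of clusters
        (i, j), best = max(
            (((i, j), jaccard(clusters[i][1], clusters[j][1]))
             for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda item: item[1])
        if best < threshold:          # stop when no pair is similar enough
            break
        names_i, reqs_i = clusters[i]
        names_j, reqs_j = clusters.pop(j)
        clusters[i] = (names_i | names_j, reqs_i | reqs_j)
    return [names for names, _ in clusters]

if __name__ == "__main__":
    mapping = {"parser.c": {"R1", "R2"}, "lexer.c": {"R1"}, "report.c": {"R3"}}
    print(agglomerate(mapping))       # e.g. [{'parser.c', 'lexer.c'}, {'report.c'}]
```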
WU Zhan-Chun, WANG Qing, LI Ming-Shu
2006, 17(8):1669-1680.
Abstract:CMM/CMMI (capability maturity model/CMM integration) has been accepted by Chinese software organizations since 1999 and has been widely used since then; however, the number of CMM/CMMI users in China is still limited. Through a survey, problems in applying CMM/CMMI are identified and their negative impact is analyzed. Based on this analysis, a PDCA (plan-do-check-action) based software process control and improvement model is presented, and SoftPM, which is built on this model, has been developed. SoftPM has been widely used in China, which demonstrates the effectiveness of the model in solving the problems identified in the survey. Software process effectiveness and efficiency are also improved for those software organizations that are using or will use CMM/CMMI.
ZHANG Yong-Qiang, SUN Sheng-Juan
2006, 17(8):1681-1687.
Abstract:In this paper, unascertained mathematical theory is applied to the study of software reliability modeling. It is used to study the software fault process, describe software failure characteristics, and obtain software reliability parameters. Finally, a software reliability model based on unascertained theory is proposed. The new model changes the traditional modeling approach and breaks away from the statistical distribution assumptions about the variation of the failure rate made in traditional software reliability modeling. It has better applicability and, to some extent, alleviates the inconsistency encountered in model application.
XUE Yun-Zhi, CHEN Wei, WANG Yong-Ji, ZHAO Chen, WANG Qing
2006, 17(8):1688-1697.
Abstract:Structural testing is one of the basic approaches for identifying test cases. Because of the complexity of programming languages and the variety of applications, an efficient approach to the automated generation of structural test data is to breed the search iteratively based on profiles of program execution. Based on the Messy GA, an automated approach for generating such data is proposed in this paper by optimizing F(X), a test coverage function of the test data set. It exploits the prominent feature of the Messy GA, namely its ability to optimize complicated problems without prior knowledge of the schema arrangement in chromosomes, so that both the concurrency of the search and the test coverage are improved. Experimental results on several typical programs and real-world applications show that, compared with other GA-based approaches, it generates higher-quality test data more efficiently and can be applied to larger applications.
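An illustrative sketch of coverage-guided test-data generation with a plain genetic algorithm, not the Messy GA of the paper; the toy program under test, the fitness F(X), and all parameters are hypothetical.

```python
# Sketch: evolve a set of test inputs so that F(X), the number of branches of a
# toy program hit by the whole set, is maximized.
import random

def program_under_test(x, y, hits):
    if x > 0:       hits.add("x>0")
    else:           hits.add("x<=0")
    if x > 0 and y == x * 2:
        hits.add("y==2x")

def coverage(population):
    hits = set()
    for x, y in population:
        program_under_test(x, y, hits)
    return len(hits)                      # F(X): number of branches covered

def evolve(pop_size=20, generations=100):
    pop = [(random.randint(-50, 50), random.randint(-100, 100)) for _ in range(pop_size)]
    for _ in range(generations):
        if coverage(pop) == 3:            # all branches of the toy program
            break
        parents = random.sample(pop, 4)
        child = (random.choice(parents)[0], random.choice(parents)[1])   # crossover
        if random.random() < 0.3:                                        # mutation
            child = (child[0], child[0] * 2)
        trial = pop[:]
        trial[random.randrange(pop_size)] = child
        if coverage(trial) >= coverage(pop):                             # greedy replace
            pop = trial
    return pop, coverage(pop)

if __name__ == "__main__":
    tests, cov = evolve()
    print("covered branches:", cov)
```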
CHEN Xiu-Hong, HE Ke-Qing, HE Lu-Lu
2006, 17(8):1698-1706.
Abstract:The UML 2.0 standard was adopted by the OMG quite some time ago. However, due to the popularity of UML 1.x in industry, a huge number of practical applications and models based on UML 1.x still exist, and under UML 2.0 they are no longer accurate descriptions of the systems. Today many tools support modeling with UML 2.0, but none of them supports the transformation from UML 1.x models to UML 2.0 models. This paper compares the differences between the two versions at the top level of the metamodel, chooses a hybrid declarative and imperative framework, and presents a UML model transformation method based on Action Semantics. It also describes the transformation of the Interaction metamodel with the Action Semantics Language, which demonstrates the feasibility of the approach. This approach reduces repetitive work for users, enables the reuse of software models, and can also be applied to other metamodel or model transformations.
2006, 17(8):1707-1716.
Abstract:Currently, model-driven software development processes largely take the form of an integration of several development approaches. However, comparing, selecting, and composing development approaches usually relies on experience and lacks systematic guidelines. In this paper, a multi-dimensional separation-of-concerns approach to constructing process frameworks is proposed. Taking abstraction, generality, and behaviorism as meta-concerns, development approaches are compared. Combined with the expected evolution curves of these meta-concerns, implementation frameworks for development processes are constructed. This work helps meet non-functional requirements on model-driven development processes, such as improving efficiency and traceability and ensuring consistency.
YU Min, LI Zhan-Huai, ZHANG Long-Bo
2006, 17(8):1717-1730.
Abstract:P2P (peer-to-peer) is a key technology for building future distributed architectures and has promising application prospects. As the issues in P2P systems mostly come down to data placement and retrieval, P2P data management has recently become an active topic in the database community. In this paper, the advantages of P2P systems are first described. Then the goals of P2P data management research are presented. Third, research on P2P data management is surveyed from three facets, i.e., P2P information retrieval, P2P database-style queries, and P2P continuous queries. In particular, the index construction methods, semantic coordination, query semantics, query processing strategies, types of queries supported, and query optimization of P2P database-style queries are discussed in detail. Finally, issues for further study are proposed.
CAO Cun-Gen, SUI Yue-Fei, SUN Yu, ZENG Qing-Tian
2006, 17(8):1731-1742.
Abstract:The representation of mathematical knowledge is an important aspect of knowledge representation. It is the foundation for knowledge-based automated theorem proving, mathematical knowledge retrieval, intelligent tutoring systems, etc. In light of the problems encountered in designing the mathematical knowledge representation language in NKI (national knowledge infrastructure), and after a discussion of the ontological assumptions for mathematical objects, two formalisms for the representation of mathematical knowledge are provided. One is a description logic in which the range of an attribute can be a formula in some logical language; the other is a first-order logic in which an ontology represented by the description logic is part of the logical language. In the former representation, if no restrictions are imposed on formulas, then there is no algorithm that realizes reasoning in the resulting knowledge base. In the latter representation, reasoning in the ontology represented by the description logic is decidable, while in general, for mathematical knowledge described by the first-order logic containing that ontology, there is no algorithm that realizes its reasoning. Hence, in the representation of mathematical knowledge, it is necessary to distinguish conceptual knowledge (knowledge in an ontology) from non-conceptual knowledge (knowledge represented by a language containing the ontology). Frames and description logics can represent and reason effectively about conceptual knowledge, but adding non-conceptual knowledge to frames or knowledge bases may make reasoning in the resulting knowledge bases undecidable, and there may even be no algorithm for reasoning about them. Therefore, it is suggested that in representing mathematical knowledge, frames or description logics be used to describe conceptual knowledge, and logical languages containing the knowledge base represented by the frames or description logics be used to represent non-conceptual knowledge.
LUO Ji-Zhou, LI Jian-Zhong, ZHAO Kai
2006, 17(8):1743-1752.
Abstract:Iceberg Cubes are meaningful for OLAP (on-line analytical processing), and compression techniques play an increasingly important role in reducing the storage of data warehouses and improving the efficiency of data operations. Computing Iceberg Cubes efficiently in a compressed data warehouse is therefore a real challenge. Data warehouse compression techniques are introduced concisely in this paper, and an algorithm for computing Iceberg Cubes in a data warehouse compressed with mapping-complete methods is proposed. Experimental results show that this algorithm outperforms the direct method, which selects Iceberg Cube tuples from the completely computed cube.
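For readers unfamiliar with the iceberg condition, a minimal sketch over an uncompressed fact table is shown below; the paper's contribution, computing the cube directly on mapping-complete compressed data, is not reproduced here, and the table and threshold are illustrative.

```python
# Sketch: an iceberg cube keeps, for every subset of dimensions, only the
# group-by cells whose aggregate (here COUNT) passes a minimum threshold.
from collections import Counter
from itertools import combinations

def iceberg_cube(rows, dims, min_count):
    """rows: list of dicts; dims: dimension names; returns {(dims, values): count}."""
    result = {}
    for k in range(1, len(dims) + 1):
        for subset in combinations(dims, k):
            counts = Counter(tuple(r[d] for d in subset) for r in rows)
            for values, c in counts.items():
                if c >= min_count:                    # the "iceberg" condition
                    result[(subset, values)] = c
    return result

if __name__ == "__main__":
    facts = [{"city": "Wuhan", "product": "A"}, {"city": "Wuhan", "product": "A"},
             {"city": "Xi'an", "product": "B"}]
    print(iceberg_cube(facts, ("city", "product"), min_count=2))
```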
CHEN An-Long, TANG Chang-Jie, YUAN Chang-An, PENG Jing, HU Jian-Jun
2006, 17(8):1753-1763.
Abstract:Mining asynchronous coincidence patterns in multiple data streams is a difficult task. The main contributions of this work include: (1) the Haar wavelet filtering technique is investigated and applied to mining asynchronous coincidence patterns in multiple streams; (2) wavelet coefficient series are used to measure asynchronous coincidence between data streams, and a series of theorems is proved to guarantee the validity of this measurement; (3) noise-tolerant incremental algorithms over circular sliding windows are designed to mine asynchronous coincidence patterns and are implemented with complexity O(n²); (4) extensive experiments on real data validate the algorithms.
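A generic illustration, not the paper's measure or theorems: comparing only the coarse Haar wavelet coefficients of two sliding windows tolerates a small asynchronous shift between streams. The window contents, the number of retained coefficients, and the distance are all hypothetical.

```python
# Sketch: Haar-transform each window, keep the low-frequency coefficients, and
# use their distance as a (smaller = stronger) coincidence measure.
import math

def haar(window):
    """One-dimensional Haar transform; window length must be a power of two."""
    coeffs, data = [], list(window)
    while len(data) > 1:
        averages = [(data[i] + data[i + 1]) / math.sqrt(2) for i in range(0, len(data), 2)]
        details  = [(data[i] - data[i + 1]) / math.sqrt(2) for i in range(0, len(data), 2)]
        coeffs = details + coeffs
        data = averages
    return data + coeffs          # [overall average, coarse .. fine details]

def coincidence(win_a, win_b, keep=4):
    """Euclidean distance over the coarsest `keep` coefficients."""
    a, b = haar(win_a)[:keep], haar(win_b)[:keep]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

if __name__ == "__main__":
    s1 = [1, 2, 4, 8, 8, 4, 2, 1]
    s2 = [1, 1, 2, 4, 8, 8, 4, 2]   # same shape, shifted by one tick
    print(round(coincidence(s1, s2), 3))
```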
ZHANG Peng, TONG Yun-Hai, TANG Shi-Wei, YANG Dong-Qing, MA Xiu-Li
2006, 17(8):1764-1774.
Abstract:Privacy preservation is one of the most important topics in data mining. The purpose is to discover accurate patterns without precise access to the original data. In order to improve both privacy preservation and mining accuracy, an effective method for privacy-preserving association rule mining is presented in this paper. First, a new data preprocessing approach, Randomized Response with Partial Hiding (RRPH), is proposed. In this approach, two privacy-preserving strategies, data perturbation and query restriction, are combined to transform and hide the original data. Then, a privacy-preserving association rule mining algorithm based on RRPH is presented. As the theoretical analysis and the experimental results show, privacy-preserving association rule mining based on RRPH achieves significant improvements in terms of privacy, accuracy, efficiency, and applicability.
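A minimal sketch of classic randomized response for a single binary item; the paper's RRPH variant, which adds partial hiding and query restriction, is not reproduced, and the retention probability p is illustrative.

```python
# Sketch: each 0/1 value is reported truthfully with probability p and flipped
# otherwise; the miner inverts the known distortion to estimate the true support.
import random

def perturb(bits, p=0.8):
    """Randomize each 0/1 value: keep it with probability p, flip it otherwise."""
    return [b if random.random() < p else 1 - b for b in bits]

def estimate_support(perturbed, p=0.8):
    """Unbiased estimate of the original fraction of 1s from perturbed data."""
    observed = sum(perturbed) / len(perturbed)
    return (observed - (1 - p)) / (2 * p - 1)

if __name__ == "__main__":
    original = [1] * 300 + [0] * 700          # true support = 0.30
    noisy = perturb(original)
    print(round(estimate_support(noisy), 3))  # close to 0.30 on average
```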
WANG Hong-Bo, WEI An-Ming, LIN Yu, CHENG Shi-Duan
2006, 17(8):1775-1784.
Abstract:Although NetFlow is widely deployed for traffic measurement, its sampling method has shortcomings: it consumes excessive router resources during flooding attacks, and selecting a suitable static sampling rate is difficult because no single rate gives the right trade-off between resource consumption and accuracy for all traffic mixes. An easily implemented packet sampling method is presented in this paper, which samples a fixed number of packets in each constant period using a measurement buffer. The method automatically adapts the sampling rate to traffic variation and keeps resource consumption controllable. Theoretical analyses demonstrate that the new method provides unbiased estimation with a bounded relative standard deviation. Experiments are also conducted with real network traces. The results show that, compared with other sampling methods, the proposed method achieves simplicity, adaptability, and controllability of resource consumption without sacrificing accuracy.
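A minimal sketch of the general idea under stated assumptions, not NetFlow's actual implementation: a fixed-size reservoir of k packets per measurement interval makes the effective sampling rate k/N adapt to the interval's traffic volume N while keeping memory use bounded. The packet fields and the estimator are illustrative.

```python
# Sketch: reservoir-sample at most k packets per interval, then scale the sampled
# byte count by N/k for an unbiased estimate of the interval's byte volume.
import random

def sample_interval(packets, k):
    """Uniformly sample at most k packets from one interval; return (sample, N)."""
    buffer, n = [], 0
    for pkt in packets:
        n += 1
        if len(buffer) < k:
            buffer.append(pkt)
        else:
            j = random.randrange(n)            # classic reservoir replacement
            if j < k:
                buffer[j] = pkt
    return buffer, n

def estimate_total_bytes(sample, n):
    """Unbiased estimate of the interval's byte volume from the sampled packets."""
    if not sample:
        return 0
    return sum(p["bytes"] for p in sample) * n / len(sample)

if __name__ == "__main__":
    interval = [{"bytes": random.choice([64, 576, 1500])} for _ in range(10000)]
    sample, n = sample_interval(interval, k=200)
    print(int(estimate_total_bytes(sample, n)))
```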
2006, 17(8):1785-1795.
Abstract:A critical issue in wireless sensor networks is disseminating data from nodes to multiple sinks in an energy-efficient way. Many data dissemination techniques have been proposed for sensor networks. However, these protocols are based on flooding mechanisms, which waste much precious energy. Some recent protocols reduce the flooding cost to different degrees but cannot guarantee the query success rate. In this paper, an energy-efficient data dissemination scheme, DCS (diameter-chord scheme), is proposed, which reduces energy consumption and improves the success rate as well. DCS exploits the fact that any chord of a circle perpendicularly intersects a diameter. Furthermore, a Two-Phase protocol is proposed based on this dissemination scheme. The protocol works in two phases, and the second phase is triggered only when the first is unsuccessful. Two solutions are also proposed to deal with the delay of the Two-Phase protocol. Extensive simulations and analyses are conducted to evaluate these protocols. The results show that the proposed protocols outperform their peers.
YANG Yi-Dong, SUN Zhi-Hui, ZHU Yu-Quan, YANG Ming, ZHANG Bo-Li
2006, 17(8):1796-1803.
Abstract:As an important task of data mining, outlier detection has been applied to many fields. Recently, research on mining data streams has received more and more attention. To solve outlier detection in data streams, a new fast outlier detection algorithm is presented. Based on dynamic grid partitioning of the data space, the method separates dense areas from sparse ones. Data in dense areas are filtered out simply, which greatly reduces the number of objects the algorithm has to consider. The outlierness of candidates in sparse areas is approximated efficiently, and data with high outlierness are output as outliers. Experiments on synthetic and real data sets show the promising applicability of the approach.
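A generic illustration of grid-based filtering, not the paper's stream algorithm: points falling into dense cells are discarded early, and only points in sparse cells are scored. The cell width, density threshold, and outlierness score below are hypothetical.

```python
# Sketch: partition the space into grid cells, drop points in dense cells, and
# rank the remaining candidates by an inverse neighbourhood-population score.
from collections import defaultdict
from itertools import product

def cell_of(point, width):
    return tuple(int(x // width) for x in point)

def grid_outliers(points, width=1.0, dense_threshold=10, top_k=5):
    grid = defaultdict(list)
    for p in points:
        grid[cell_of(p, width)].append(p)
    candidates, scores = [], {}
    for cell, members in grid.items():
        if len(members) >= dense_threshold:       # dense cell: filter its points out
            continue
        # outlierness ~ inverse of the population in the surrounding cells
        neighbourhood = sum(
            len(grid.get(tuple(c + d for c, d in zip(cell, offset)), ()))
            for offset in product((-1, 0, 1), repeat=len(cell)))
        for p in members:
            candidates.append(p)
            scores[p] = 1.0 / (1 + neighbourhood)
    return sorted(candidates, key=lambda p: scores[p], reverse=True)[:top_k]

if __name__ == "__main__":
    cluster = [(i * 0.01, i * 0.01) for i in range(200)]       # dense region
    strays = [(9.5, -3.0), (-7.2, 8.8)]                        # obvious outliers
    print(grid_outliers(cluster + strays, dense_threshold=20))
```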
YANG Qiu-Wei, HONG Fan, YANG Mu-Xiang, ZHU Xian
2006, 17(8):1804-1810.
Abstract:The system security policy is described by security queries in the administrative model of role-based access control (RBAC). Following the definition of a state-transition system, security analysis is defined and carried out on a Turing machine. Security queries are classified into necessity queries and possibility queries. As a result, necessary security queries and state-independent possible security queries can be resolved in polynomial time; the conditions under which the possible security query is an NP-complete problem are presented; and the general possible security query is undecidable.
WANG Xiao-Feng, ZHANG Jing, WANG Shang-Ping, ZHANG Ya-Ling, QIN Bo
2006, 17(8):1811-1817.
Abstract:As a new type of wireless mobile network, an Ad Hoc network does not depend on any fixed infrastructure and has no centralized control unit, and the computation capability of its mobile nodes is limited. In this paper, a novel multi-party key agreement scheme with password authentication and shared-password evolution for Ad Hoc networks is proposed based on ECC (elliptic curve cryptography). The password serves two functions: as shared information to authenticate the mobile nodes' secret keys, and as a symmetric key to encrypt the information exchanged between mobile nodes. The freshness and security of the password are guaranteed by evolving the shared password in every round of secret-key authentication and key agreement. Consequently, the computational overhead and storage load of mobile nodes are reduced; moreover, secret-key authentication and information encryption between mobile nodes are provided. The new scheme enjoys many security properties, such as resistance to man-in-the-middle attacks and replay attacks, key independence, and forward security.
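A minimal two-party sketch, assuming the third-party `cryptography` package, of ECDH key agreement combined with password-based authentication and password evolution; the paper's multi-party Ad Hoc protocol and its exact authentication steps are not reproduced, and the MAC and evolution rule are illustrative.

```python
# Sketch: the shared password authenticates the exchanged public keys via a MAC,
# an ECDH exchange yields the session key, and the password is evolved from that
# key for the next round. Both parties are simulated inside one function.
import hashlib, hmac
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import ec

ENC = dict(encoding=serialization.Encoding.X962,
           format=serialization.PublicFormat.UncompressedPoint)

def run_round(password: bytes):
    """One agreement round between two nodes sharing `password`;
    returns (session_key, next_password)."""
    a = ec.generate_private_key(ec.SECP256R1())
    b = ec.generate_private_key(ec.SECP256R1())
    a_pub = a.public_key().public_bytes(**ENC)
    # the receiver recomputes the MAC over the received public key with the
    # shared password to authenticate the sender
    tag_a = hmac.new(password, a_pub, hashlib.sha256).digest()
    assert hmac.compare_digest(tag_a, hmac.new(password, a_pub, hashlib.sha256).digest())
    shared = a.exchange(ec.ECDH(), b.public_key())          # same value on both sides
    session_key = hashlib.sha256(shared).digest()
    next_password = hashlib.sha256(password + session_key).digest()   # password evolution
    return session_key, next_password

if __name__ == "__main__":
    pwd = b"initial shared password"
    key1, pwd = run_round(pwd)
    key2, pwd = run_round(pwd)     # the next round uses the evolved password
    print(key1 != key2)
```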
ZHU Peng-Fei, DAI Ying-Xia, BAO Xu-Hua
2006, 17(8):1818-1823.
Abstract:Distributed systems can be made more secure with a distributed trust model based on PKI (public-key infrastructure). The certificate format may differ among PKI systems, and those differences may hinder applications that verify certificate chains. In this paper, how those differences affect mutual verification is analyzed with the new concept of "certificate-format-compatibility". Moreover, a new distributed trust model based on a bridge CA (certificate authority) with high compatibility is designed. Using this trust model, mutual connections between entities in different trust domains are not affected by differing certificate formats.