BAI Xiang, MAO Yu-Ming, LENG Su-Peng, MAO Jian-Bing, XIE Jun
2009, 20(4):780-794.
Abstract: This paper proposes an AIFS-based multi-class model for studying the saturation channel throughput of IEEE 802.11e EDCA (enhanced distributed channel access), based on its operation mechanism for supporting service differentiation. The proposed model is evaluated numerically and validated against simulation results; a good match between the analytical model and the simulations is observed. In particular, this paper compares the proposed model with Xiao's Markov chain model under the same configurations. Based on the proposed model, this paper investigates a quasi-optimal condition that avoids the computational complexity of the optimal one while keeping the channel throughput close to its optimal value. Finally, the paper proposes an adaptive p-persistent MAC scheme, named DPS (dynamic parameter-tuning scheme), to assign appropriate transmission probabilities (or contention window sizes) to the different classes. Through DPS, it is feasible to provide service differentiation and achieve a targeted throughput ratio among the classes while maximizing the total channel throughput. Simulation and numerical results show that DPS effectively achieves this performance goal under a variety of network conditions, and achieves higher channel throughput than standard IEEE 802.11e EDCA in all tested environments.
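To illustrate the kind of parameter assignment a DPS-like scheme performs, here is a minimal sketch. The function name, the assumption that per-class throughput is roughly proportional to the class attempt rate for small probabilities, and the externally supplied quasi-optimal aggregate rate are all illustrative assumptions, not the paper's actual equations.

```python
def dps_probabilities(stations, ratios, aggregate_opt):
    """Split a quasi-optimal aggregate attempt rate among traffic classes.

    stations[i]   -- number of stations in class i
    ratios[i]     -- target throughput ratio of class i, e.g. [4, 2, 1]
    aggregate_opt -- quasi-optimal total attempt rate sum(stations[i]*p[i])

    Assumes class throughput is roughly proportional to its attempt rate
    stations[i]*p[i] when the probabilities are small.
    """
    total = sum(ratios)
    return [aggregate_opt * r / (total * n)
            for r, n in zip(ratios, stations)]

# e.g. three classes of 10 stations each, 4:2:1 target throughput ratio
print(dps_probabilities([10, 10, 10], [4, 2, 1], 0.1))
```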
2009, 20(4):795-803.
Abstract: In an XML database, finding all occurrences of a twig pattern is a core operation of XML query processing. In the past few years, many algorithms, such as Holistic Twig and TJFast, were proposed in the literature. However, these algorithms are based on merging and have high computational cost. Recently, the Twig2Stack and TwigList algorithms were proposed to address this problem, but they are very complex. Targeting this problem, this paper exploits the characteristic that most path expressions have only a few output nodes, and proposes two new merge-free algorithms, named TwigNM and TwigNME, which use only a few stacks. The experimental results show that these algorithms are superior to the previous ones, especially for XPath expressions containing only ancestor-descendant relationships.
GONG Mao-Guo, HAO Lin, JIAO Li-Cheng, WANG Xiao-Hua, SUN Yi-Fei
2009, 20(4):804-814.
Abstract: Based on the antibody clonal selection theory, an immune clonal data reduction algorithm is proposed for the instance selection problem in data reduction. Markov chain theory is used to prove that the new algorithm converges with probability 1. Experimental studies on seven standard data sets from the UCI repository show that the proposed algorithm is effective. The best domain of the weight parameter λ is determined by analyzing its effect on the algorithm's performance. Furthermore, an encoding method based on a stratified strategy is introduced to accelerate convergence when solving large-scale data reduction problems. Experimental studies on seven large-scale data sets show that the improved method is superior to the primary one. Finally, the best domain of the number of strata t is determined by analyzing its effect on the algorithm's performance on the Letter and DNA data sets.
ZENG Xian-Hua, LUO Si-Wei, WANG Jiao, ZHAO Jia-Li
2009, 20(4):815-824.
Abstract: The conventional Laplacian eigenmap preserves neighborhood relationships based on Euclidean distance; that is, neighboring high-dimensional data points are mapped into neighboring points in the low-dimensional space. However, the selection of the neighborhood may influence the global low-dimensional coordinates. In this paper, both the geodesic distance and a generalized Gaussian function are incorporated into the Laplacian eigenmap algorithm. First, a generalized Gaussian Laplacian eigenmap algorithm based on geodesic distance (GGLE) is proposed. The global low-dimensional coordinates obtained by GGLE have different clustering properties when different generalized Gaussian functions are used to measure the similarity between high-dimensional data points. This paper then exploits these properties to propose an ensemble-based discriminant algorithm built on GGLE. The main advantages of the ensemble-based algorithm are that the neighborhood parameter K is fixed and that the neighborhood graph and geodesic distance matrix need to be constructed only once. Finally, recognition experiments on a wood texture dataset show that it is an efficient manifold-based ensemble discriminant algorithm.
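A minimal sketch of a GGLE-style pipeline as described above, under the following assumptions (not the paper's exact procedure): Isomap-style geodesic distances on a kNN graph, a generalized Gaussian kernel exp(-(d/β)^p), and the standard Laplacian eigenmap generalized eigenproblem. All parameter names are illustrative.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path
from scipy.linalg import eigh

def ggle(X, k=8, beta=1.0, p=2.0, dim=2):
    # kNN graph weighted by Euclidean distance
    G = kneighbors_graph(X, k, mode='distance')
    # geodesic distances = shortest paths on the kNN graph (as in Isomap)
    D_geo = shortest_path(G, directed=False)
    # generalized Gaussian similarity on geodesic distances;
    # varying p changes the clustering behavior of the embedding
    W = np.exp(-(D_geo / beta) ** p)
    W[np.isinf(D_geo)] = 0.0          # disconnected pairs get zero weight
    D = np.diag(W.sum(axis=1))
    L = D - W
    # generalized eigenproblem L y = lambda D y; skip the trivial eigenvector
    vals, vecs = eigh(L, D)
    return vecs[:, 1:dim + 1]
```

An ensemble in the spirit of the paper would call ggle repeatedly with different (beta, p) pairs while keeping k fixed, reusing the same D_geo computed once.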
WANG Hong-Jun, LI Zhi-Shu, CHENG Yang, ZHOU Peng, ZHOU Wei
2009, 20(4):825-833.
Abstract: Cluster ensembles have become a research focus due to their success in privacy protection, distributed computing, and knowledge reuse; moreover, noise and isolated points have little effect on the final result. This paper makes two contributions. First, by regarding every base clustering as one attribute of the original data, it shows that algorithms built on this view are more extensible and flexible. Second, it designs a latent variable cluster ensemble (LVCE) model in this way and derives an inference algorithm for the model based on Markov chain Monte Carlo (MCMC) approximation. Experimental results show that the MCMC algorithm for LVCE produces better results and can reveal the compactness of the clustered data points.
PENG Hong-Jing, CHEN Song-Can, ZHANG Dao-Qiang
2009, 20(4):834-844.
Abstract: A scheme for categorizing Laplacians is introduced in this paper, based on the number of times the similarity weight is computed for each pair of adjacent data points. It is also theoretically proven that Laplacian constructions with multiple computations of the similarity weight for each pair of adjacent points can capture the local intrinsic structure of the data better than methods with only one or two such computations. A novel Laplacian construction method is then proposed, which is better suited to the natural image matting task. In this method, all the different similarity weights for any pair of adjacent pixels are reconstructed by using a local linear model in the neighborhoods they fall into. By combining the user-provided constraints, which specify some pixels as foreground or background, a quadratic objective function for matting based on semi-supervised learning is formed. When estimating the colors of unknown pixels by sampling foreground and background colors, this optimization problem is reformulated and solved in an iterative manner. Moreover, this iterative scheme can also be successfully generalized and applied to other previously constructed Laplacians for image matting tasks with only sparse label scribbles. Both the theoretical analysis and experimental results validate that the proposed Laplacian construction approach better captures the intrinsic structure between image pixels and propagates the finer ingredients of an image's foreground and background rather than just their labels, thus obtaining mattes of higher quality.
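For quadratic matting objectives of this kind, one common closed form is a sparse linear system. The sketch below assumes a precomputed matting Laplacian L and the formulation alpha = argmin aᵀLa + λ(a−b)ᵀD(a−b), where D marks scribbled pixels; it is a baseline solve, not the paper's iterative color-sampling procedure.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import spsolve

def solve_matte(L, scribbles, lam=100.0):
    """Solve the semi-supervised matting objective
        alpha = argmin a^T L a + lam * (a - b)^T D (a - b).

    L         -- (n x n) sparse matting Laplacian over the image pixels
    scribbles -- length-n array: 1 foreground, 0 background, NaN unknown
    """
    known = ~np.isnan(scribbles)
    D = diags(known.astype(float))     # selects constrained pixels
    b = np.nan_to_num(scribbles)       # constraint targets (0 elsewhere)
    # stationarity condition: (L + lam*D) alpha = lam * D b
    return spsolve(L + lam * D, lam * (D @ b))
```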
MA Li-Bo, LI Xing, ZHANG Liang
2009, 20(4):845-857.
Abstract: Constructing an effective scan monitoring system is a necessary step toward early detection of and warning about unknown threats. Scan monitoring systems built on routable unused IP addresses are more effective than those deployed in active networks, owing to their special advantage of identifying threats precisely, which results in a low false alarm rate. However, systematic research on how to deploy such an effective monitoring system is still missing. This paper presents a novel scan monitoring model based on BGP route distribution to answer two practical deployment questions: how to design and deploy an ideal target-specific scan monitoring system, and how to evaluate the detection effectiveness of actual limited deployment resources. On the basis of the model, this paper puts forward the new concept of a deployment threshold, which describes the most economical match between the monitoring system's scale and the scanner's scanning width for the same detection probability requirement. Using the model and the deployment threshold, an effective monitoring system can be designed and appropriate detection targets can be chosen to match the available deployment resources, avoiding the blind deployment practiced before. Simulation results coincide with the theoretical analyses.
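The deployment-threshold idea can be conveyed with a simple probabilistic sketch, assuming scans pick targets uniformly at random and independently; the formula and function names are illustrative, not the paper's model.

```python
import math

def detection_probability(m, N, w):
    """P(a scan of width w hits >= 1 of m monitored addresses in a space
    of N), under uniform independent target selection."""
    return 1.0 - (1.0 - m / N) ** w

def deployment_threshold(N, w, target):
    """Smallest monitor size m meeting a target detection probability:
    solve 1 - (1 - m/N)**w >= target for m."""
    return math.ceil(N * (1.0 - (1.0 - target) ** (1.0 / w)))

# e.g. detecting a scanner probing 10,000 of 2**32 addresses with P >= 0.9
print(deployment_threshold(2 ** 32, 10_000, 0.9))
```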
WEN Zhong-Hua, HUANG Wei, LIU Ren-Ren, JIANG Yun-Fei
2009, 20(4):858-869.
Abstract: Planning by model checking is an approach to planning under uncertainty that deals with nondeterminism. Three methods for obtaining hierarchical states are designed for the search of weak, strong, and strong cyclic plans, respectively. Based on hierarchical states, some important conclusions about weak, strong, and strong cyclic solutions are obtained. All state-action pairs that can be eliminated directly are found when weak, strong, and strong cyclic solutions are searched for in turn; therefore, many state-action pairs can be eliminated before planning starts. In effect, this gives a search that proceeds forward from the initial states towards the goal states.
ZHOU Jun-Sheng, DAI Xin-Yu, CHEN Jia-Jun, QU Wei-Guang
2009, 20(4):870-877.
Abstract: Chinese chunking plays an important role in natural language processing. This paper presents a large-margin method for Chinese chunking based on structural SVMs (support vector machines). First, a sequence labeling model and the formulation of the learning problem are introduced for Chinese chunking, and then the cutting-plane algorithm is applied to efficiently approximate the optimal solution of the optimization problem. Finally, an improved F1 loss function is proposed for Chinese chunking. The loss function scales the F1 loss value by the length of the sentence to adjust the margin accordingly, leading to more effective constraint inequalities. Experiments are conducted on the UPENN Chinese Treebank-4 (CTB4), comparing the Hamming loss function with the improved F1 loss function. The experimental results show that training with the improved F1 loss function achieves higher performance than with the Hamming loss function. The overall F1 score of Chinese chunking obtained with this approach is 91.61%, which is higher than the performance of state-of-the-art machine learning models such as CRFs (conditional random fields) and SVMs.
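A sketch of a length-scaled F1 loss of the kind described: chunk-level F1 computed on span sets, with the loss multiplied by the sentence length. The exact scaling used in the paper is not reproduced here, so the linear scaling and the span-set representation are assumptions.

```python
def f1_loss(gold_chunks, pred_chunks, sent_len):
    """Length-scaled F1 loss for structural SVM training.

    gold_chunks / pred_chunks -- iterables of (start, end, label) spans
    sent_len                  -- sentence length, used to scale the margin
    """
    gold, pred = set(gold_chunks), set(pred_chunks)
    tp = len(gold & pred)
    if tp == 0:
        return float(sent_len)              # maximal loss for this sentence
    prec, rec = tp / len(pred), tp / len(gold)
    f1 = 2 * prec * rec / (prec + rec)
    return (1.0 - f1) * sent_len            # longer sentence, larger margin
```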
2009, 20(4):878-889.
Abstract: Representing the uncertain spatial information of vague regions and handling vague region relations are focal points of research on spatial databases, geographical information systems, computer vision, etc. To deal with the indeterminate membership of vague points and the complex dynamic vague region relations, static and dynamic vague region relations are discussed systematically on the basis of vague sets. The definition of a vague region is formalized, and the partition of vague regions with and without a kernel is also given based on vague sets. The three basic vague region relations are studied, and the dynamic basic vague region relations are also presented. The relation table for vague regions with a kernel and the implicative relations of the vague sub-regions are given. The dynamic vague region relations of vague regions with a kernel are studied systematically, and the dynamic adjacency relation table is also given in detail. Furthermore, an instance is analyzed. The results in this paper can deal with the indeterminate membership information of vague points in vague regions and with dynamic vague region relations, and can forecast the next dynamic relations.
CHEN Hao, CUI Du-Wu, LI Xue, WEI Hong-Li
2009, 20(4):890-901.
Abstract: An analysis of the relationship between the crossover scale and the reachable subspace of the crossover operator shows that the crossover scale should be adjusted dynamically to the population structure. In this paper, three control mechanisms—a well-phased control strategy, a random distribution strategy, and an adaptive evolution strategy—are built to adjust the crossover scale. Simulation tests on classical functions show that these optimization mechanisms are effective, and that valuable control knowledge of the crossover scale for multi-dimensional functions is generated by the adaptive evolution strategy. Furthermore, this research suggests a new approach to operator and parameter optimization in evolutionary algorithms.
2009, 20(4):902-909.
Abstract: Fine-grained integrity checking of forensic data has become an important requirement of computer forensics. It mitigates the disastrous effect on the data of random errors or intentional forging modifications. Unfortunately, the traditional method generates a hash for every small piece of data and thus produces a large amount of hash data; being random, these hash data cannot be compressed in the normal way, which greatly hinders storing them and transmitting them over the network. Based on error-correction coding theory, a fine-grained integrity checking method, the integrity indication code, is proposed, and its properties are analyzed. Combinatorial codes for one error in a group of data objects are also proposed; with them, the hash data can be compressed a hundredfold. This paper provides fundamental support for further research on fine-grained data integrity checking methods and related applications.
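The flavor of a one-error combinatorial code can be shown with a binary-grouping sketch: instead of n per-object hashes, keep ceil(log2 n)+1 group digests, and the pattern of mismatching groups spells out the index of a single corrupted object. This is an illustrative construction, not necessarily the paper's code.

```python
import hashlib
from math import ceil, log2

def group_digests(objects):
    """Digests for locating one corrupted object among n byte-string objects
    using ceil(log2 n) + 1 digests instead of n."""
    n = len(objects)
    bits = max(1, ceil(log2(n)))

    def h(items):
        d = hashlib.sha256()
        for it in items:
            d.update(it)
        return d.digest()

    # group j hashes every object whose index has bit j set
    groups = [h(o for i, o in enumerate(objects) if (i >> j) & 1)
              for j in range(bits)]
    return groups, h(objects)            # per-bit digests + one overall digest

def locate_error(objects, groups, overall):
    """Return the index of the single corrupted object, or None if intact."""
    new_groups, new_overall = group_digests(objects)
    if new_overall == overall:
        return None
    # the mismatch pattern spells out the corrupted index in binary
    return sum(1 << j for j, (a, b) in enumerate(zip(groups, new_groups))
               if a != b)
```

With 1,024 objects this stores 11 digests instead of 1,024, a roughly hundredfold reduction, at the cost of locating at most one corrupted object per group.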
JI Cong-Rui, DENG Zhi-Hong, TANG Shi-Wei
2009, 20(4):910-917.
Abstract: As more and more data are expressed and stored in XML format, XML keyword retrieval has become a focus of IR (information retrieval) and database research. This paper gives and proves some properties of the SLCA (smallest lowest common ancestor), the key concept in XML keyword retrieval. It also introduces a new XML keyword retrieval algorithm, Nearest Pair, based on these properties. The algorithm uses an iterative binary search technique to find nearest pairs, which reduces the auxiliary computation by an order of magnitude. The experimental results show that Nearest Pair outperforms the existing mainstream algorithms in most cases.
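The nearest-pair idea can be sketched with Dewey labels (tuples of child positions): for each node in the shorter keyword list, only its nearest left/right neighbors in the other sorted list can yield the deepest LCA. Below is a simplified two-keyword version, illustrative rather than the paper's full algorithm.

```python
from bisect import bisect_left

def lca(a, b):
    """Longest common prefix of two Dewey labels (tuples of ints)."""
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    return a[:i]

def slca(list1, list2):
    """SLCAs of two sorted Dewey-label lists via nearest-pair lookups."""
    if len(list1) > len(list2):
        list1, list2 = list2, list1        # iterate over the shorter list
    cands = set()
    for v in list1:
        i = bisect_left(list2, v)
        # only the nearest left/right neighbors can give the deepest LCA
        if i > 0:
            cands.add(lca(v, list2[i - 1]))
        if i < len(list2):
            cands.add(lca(v, list2[i]))
    # drop candidates that are proper ancestors of other candidates
    return sorted(c for c in cands
                  if not any(c != d and d[:len(c)] == c for d in cands))

# labels (1,2,1) and (1,2,5) share the SLCA (1,2)
print(slca([(1, 2, 1)], [(1, 2, 5), (1, 3)]))
```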
WU Ai-Hua, WANG Xian-Sheng, TAN Zi-Jing, WANG Wei
2009, 20(4):918-929.
Abstract: Computing a repair for inconsistent XML documents is significant in applications, but finding an optimum repair is an NP-complete problem, especially when XML documents violate both functional dependencies and key constraints. This paper proposes a cost-based heuristic algorithm that finds a low-cost repair in polynomial time. It first scans the original XML document once to find the inconsistent data. It then computes the general candidate repairs for each inconsistent datum and heuristically assembles a whole-document repair based on its cost. The experimental evaluation shows that even when XML documents are large, contain a high percentage of dirty elements, and violate many different constraints, the algorithm still runs in less than O(n^3) time w.r.t. the number of inconsistent elements.
HE Yan-Xiang, CAO Qiang, LIU Tao, HAN Yi, XIONG Qi
2009, 20(4):930-941.
Abstract: LDoS (low-rate denial-of-service) attacks are stealthier and trickier than traditional DDoS (distributed DoS) attacks. Exploiting the periodicity and short bursts characteristic of LDoS flows, a detection system against LDoS attacks, DSBWA (detection system based on wavelet analysis), has been designed and implemented based on feature extraction using the wavelet transform. The proposed system monitors the number of packets arriving at the monitoring node and extracts five feature indices of LDoS flows through wavelet multi-scale analysis of the network traffic; a trained BP neural network then makes a synthesized diagnosis. Once an attack is confirmed, information about the attackers can be obtained by locating the malicious pulses. Simulation results in NS-2 show that DSBWA is capable of detecting variants of LDoS attacks and achieves a high detection rate with low computation cost, and hence has good practical value.
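A sketch of the multi-scale feature extraction step, using PyWavelets. The choice of wavelet, the decomposition depth, and the per-scale energy-share features are assumptions about how such indices might be computed; they are not the paper's exact five indices.

```python
import numpy as np
import pywt  # PyWavelets

def multiscale_features(pkt_counts, wavelet='db4', level=5):
    """Energy share per wavelet scale of a per-interval packet-count series.

    The periodic short bursts of an LDoS flow concentrate energy at
    particular scales; the resulting feature vector can be fed to a
    trained classifier (e.g. a BP neural network) for diagnosis.
    The series should be long enough for the requested level.
    """
    coeffs = pywt.wavedec(np.asarray(pkt_counts, dtype=float),
                          wavelet, level=level)
    energies = np.array([float(np.sum(c ** 2)) for c in coeffs])
    return energies / energies.sum()    # normalized: one value per scale
```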
XIONG Bin-Bin, LIN Chuang, REN Feng-Yuan
2009, 20(4):942-953.
Abstract: In general, wireless sensor networks (WSNs) are resource-constrained and have links with a high bit error rate (BER). A highly reliable transport protocol for this kind of network is challenging and costly in terms of energy and delay. On the other hand, many applications deployed on WSNs can live with some packet loss provided that the loss rate is tolerable. Hence, stochastic delivery transport protocols have emerged to meet application and network constraints. These protocols strike a profitable trade-off between reliability and resource cost, and are therefore adopted by many WSN applications. To analyze the performance metrics of this kind of protocol in multi-hop WSNs, a finite state Markov chain (FSMC)-based model is developed in this paper. Using this model, the performance parameters of the protocols can be calculated directly, easily, and comprehensibly. The effects of different network parameters (such as the number of hops and the bit error rate of the wireless links) on performance are investigated. To enhance the efficiency of stochastic delivery protocols, a hop-by-hop acknowledgement scheme is introduced into some stochastic reliable transport protocols, as is exploitation of the broadcast nature of the wireless channel. The analytical results show that the effects of these schemes on performance vary with the network parameter settings. Finally, the paper gives some advice for improving these protocols based on the analysis; simulation results demonstrate the effectiveness of these improvements.
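A back-of-the-envelope version of what such a chain model computes, assuming independent bit errors and links and a per-hop retransmission limit. This is a simplification of the FSMC with illustrative names, not the paper's model.

```python
def packet_success(ber, pkt_bytes):
    """Per-transmission packet success probability, independent bit errors."""
    return (1.0 - ber) ** (8 * pkt_bytes)

def stochastic_delivery(ber, pkt_bytes, hops, max_tx):
    """End-to-end delivery ratio and expected transmissions per hop for a
    hop-by-hop scheme that retries each packet at most max_tx times."""
    p = packet_success(ber, pkt_bytes)
    per_hop = 1.0 - (1.0 - p) ** max_tx          # delivered within max_tx tries
    # expected number of tries on one hop (success or give-up after max_tx)
    exp_tx = sum(k * p * (1 - p) ** (k - 1) for k in range(1, max_tx + 1))
    exp_tx += max_tx * (1 - p) ** max_tx
    return per_hop ** hops, exp_tx

# e.g. BER 1e-4, 64-byte packets, 10 hops, up to 3 tries per hop
print(stochastic_delivery(1e-4, 64, 10, 3))
```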
ZHANG Ke-Wang, ZHANG De-Yun, JIANG Wei-Hua
2009, 20(4):954-964.
Abstract: Efficient usage of the available capacity is critical for ad hoc networks with limited bandwidth, yet capacity is wasted by exposed terminals in high-density ad hoc networks. This paper proposes a new scheme, Packet Sensing Media Access with Collision Avoidance (PSMA/CA), to solve the exposed terminal problem in static ad hoc networks, thereby improving the spatial reuse of the medium and increasing the network throughput. Two-hop neighbor information plays an important role in PSMA/CA; in its absence, the protocol behaves like 802.11 DCF. Nodes obtain 1-hop and 2-hop neighbor information via channel listening and neighbor information exchange, respectively. With the 2-hop neighbor information, an exposed terminal captures a frame of an ongoing dialogue and, before trying to transmit, calculates the correlation coefficient between the primary dialogue and the new dialogue to be established. If the correlation coefficient is lower than the threshold required by the SINR (signal to interference and noise ratio) for a certain bandwidth, the exposed terminal can communicate with its destination in parallel with the primary dialogue. PSMA/CA solves the exposed terminal problem without node synchronization and without additional hardware complexity. Simulation results show that the average throughput of PSMA/CA outperforms 802.11 DCF by 20% in high-density ad hoc networks.
CHENG Wei-Fang, LIAO Xiang-Ke, SHEN Chang-Xiang
2009, 20(4):975-984.
Abstract: Unlike traditional omni-directional sensors, which always have an omni-directional sensing range, directional sensors may be able to switch among several directions, each with a limited angle of sensing range. This paper studies a novel "area coverage by directional sensors" problem. It formulates the Maximum Directional Area Coverage (MDAC) problem, which maximizes the covered area by scheduling the working directions of the sensors in the network. This paper proves that the MDAC is NP-complete and proposes two distributed scheduling algorithms for it, both proved to terminate in finite time. Simulation results demonstrate the effectiveness of the two algorithms.
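A centralized greedy baseline conveys the MDAC objective: each sensor in turn picks the working direction that covers the most still-uncovered sample points. The paper's algorithms are distributed and differ from this sketch; all names here are illustrative.

```python
import numpy as np

def greedy_mdac(sensors, directions, samples, radius, fov):
    """sensors: (n,2) positions; directions: list of 2-D unit vectors;
    samples: (m,2) points discretizing the field; fov: sector angle (rad)."""
    covered = np.zeros(len(samples), dtype=bool)
    chosen = []
    for s in sensors:
        vec = samples - s
        dist = np.linalg.norm(vec, axis=1)
        best_j, best_gain, best_mask = 0, -1, None
        for j, u in enumerate(directions):
            # points inside the sensing sector facing direction u
            cos = vec @ np.asarray(u) / np.maximum(dist, 1e-12)
            sector = (dist <= radius) & (cos >= np.cos(fov / 2.0))
            gain = int(np.count_nonzero(sector & ~covered))
            if gain > best_gain:
                best_j, best_gain, best_mask = j, gain, sector
        covered |= best_mask
        chosen.append(best_j)
    return chosen, covered.mean()   # directions picked, area coverage ratio
```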
ZENG Hai-Tao, WANG Yong-Ji, ZU Wei, CAI Jia-Yong, RUAN Li
2009, 20(4):985-996.
Abstract: The Small Message Criterion (SMC) measures the capability of a covert channel to transmit small messages and is a necessary complement to the capacity criterion. However, the present definition of SMC has deficiencies: obtaining the message length proves hard in common information systems; the mitigating mechanism cannot simultaneously satisfy the two restrictions of message transfer time and fidelity; and the criterion does not cover the message's sensitivity. First, a value function of messages is introduced to represent the danger of small message transmission. Based on the value function, a new definition of SMC is presented in which the system's threat tolerance standard is represented by a threshold on the message value; the value function also takes the message's sensitivity into account. A mechanism for the secure real-time database scenario, which combines SMC with the channel capacity, is presented to measure and mitigate the threat of the transaction covert channel. Theoretical analysis and experimental results show that with the proposed new SMC, a secure system can perform comprehensive measurement and adjustable mitigation of the covert channel threat.
2009, 20(4):997-1013.
Abstract: Available bandwidth measurement is important for many Internet applications, such as network behavior analysis and quality of service (QoS) verification. Existing available bandwidth measurement tools mainly measure the available bandwidth of a whole network path and provide information only about the tight link(s), not about other vital links. Therefore, this paper presents a novel algorithm called LinkPPQ (trains of pairs of packet-quartets used to measure the available bandwidth of arbitrary links), which uses trains of pairs of packet-quartets to measure the available bandwidth of any link along the path and to track the variation of the cross-traffic on it. This paper studies the performance of LinkPPQ both in simulation and in a laboratory network. Simulation results show that LinkPPQ can accurately measure the available bandwidth of each link on paths that have one or multiple narrow links, under different cross-traffic loads; most measurement errors are under 30%, and the results are stable. The laboratory results show that LinkPPQ can accurately measure the available bandwidth in the following situations: a) measuring a link with capacity 100Mbps from a link with capacity 10Mbps; b) monitoring a link whose capacity is ten times that of the narrow link next to it on the path; c) estimating the available bandwidth of the narrow links on a path with multiple narrow links.
XIE Zhi-Jun, WANG Lei, CHEN Hong
2009, 20(4):1014-1022.
Abstract: This paper proposes a distributed multiscale data compression algorithm that can transform irregularly sampled data. Considering the characteristics and location information of nodes in sensor networks, a novel distributed domain partition model, DDPM, is proposed first. On the basis of this model, a multiscale data compression model, MDCM, is proposed for sensor networks. MDCM uses a Voronoi tessellation to partition the domain created by DDPM. Theoretical analyses and simulation results show that these methods have good approximation ability and can compress the data efficiently, greatly reducing the amount of data.
XIE Lei, CHEN Li-Jun, CHEN Dao-Xu, XIE Li
2009, 20(4):1023-1037.
Abstract: In this paper, a hierarchical clustering-based approximate scheme for in-network aggregation over sensor networks, called CASA (clustering-based approximate scheme for data aggregation), is proposed. CASA achieves a long network lifetime by minimizing the overall energy consumption for communications and balancing the energy load among all nodes, while maintaining a user-specified quality-of-data requirement. CASA adopts an optimal cluster-size parameter, which handles the hierarchical approximation work for in-network aggregation with minimized communication cost. Furthermore, an adaptive tolerance-reallocating scheme is leveraged to further reduce the overall communication cost and maintain load balance according to the varying data change rates over the deployment regions. Experimental results indicate that significant benefits can be achieved through the CASA approximate scheme.
JIA Jie, CHEN Jian, CHANG Gui-Ran, WEN Ying-You
2009, 20(4):1038-1047.
Abstract: Reasonable deployment of sensor nodes while guaranteeing secure connections is one of the most important challenges in designing a wireless sensor network. Traditional algorithms aim merely at the network coverage rate, which reduces the secure connectivity degree. In this paper, the sensor node deployment model is analyzed theoretically. Combined with the rapid multi-objective optimization capability of the elitist non-dominated sorting genetic algorithm, an optimal sensor deployment algorithm based on secure connection is proposed to guarantee effective network tracking and secure communication. The performance of the algorithms under different deployment models is analyzed. Simulation results demonstrate that the novel algorithm achieves network coverage and secure connectivity degree more rapidly and efficiently, and hence meets the actual demands of wireless sensor networks.
LIU Wei, CAI Jia-Yong, HE Ye-Ping
2009, 20(4):1048-1057.
Abstract: Role-based administrative models have been proposed for decentralized management in large RBAC (role-based access control) systems, and the latest, UARBAC, has significant advantages over other models. Due to hierarchy relationships, the administrative operations of UARBAC imply permissions. By analyzing implicit authorization, two definitional flaws and one implementation flaw in UARBAC are found: the inability to create objects, dangling references, and lack of support for least authorization. The paper corrects the definitions of the administrative operations to fix the former two. Least authorization in UARBAC is defined as the minimal role match problem; the paper proves the problem is NP-hard and gives a feasible greedy algorithm. The method helps the administrator use appropriate operations to achieve least role assignment.
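Greedy set cover is the natural feasible approach for such an NP-hard covering problem. The sketch below prefers roles that cover many still-needed permissions, tie-broken by fewer extra permissions to stay close to least authorization; it is illustrative, not the paper's exact algorithm.

```python
def minimal_role_match(needed, roles):
    """Approximate the minimal role match problem by greedy set cover.

    needed -- set of permissions the user must receive
    roles  -- dict: role name -> set of permissions the role grants
    """
    needed = set(needed)
    chosen = []
    while needed:
        # most newly covered permissions first; fewer extras as tie-break
        role = max(roles, key=lambda r: (len(roles[r] & needed),
                                         -len(roles[r] - needed)))
        if not roles[role] & needed:
            raise ValueError('needed permissions not coverable by any role')
        chosen.append(role)
        needed -= roles[role]
    return chosen

print(minimal_role_match({'read', 'write'},
                         {'editor': {'read', 'write', 'delete'},
                          'viewer': {'read'}}))
```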
LIN Xin, LI Shan-Ping, YANG Zhao-Hui
2009, 20(4):1058-1068.
Abstract: k-Anonymity is an important technique for protecting the privacy of queries in LBS (location-based services). However, the literature points out that k-anonymity cannot effectively protect the privacy of continuous queries. A continuous query issuing model is proposed, which incorporates a query issuing interval model and a consecutive-query relationship model. Under this model, two attacking algorithms are proposed, for clique cloaking and non-clique cloaking respectively. This paper then argues that the cardinality of the anonymity set is not a good anonymity measurement under such attacks, and an entropy-based anonymity measurement, AD (anonymity degree), is proposed. Experimental results demonstrate that the attacking algorithms have a high success rate in identifying query senders when the consecutive queries are strongly related, and that AD is a better anonymity measurement than the cardinality of the anonymity set.
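An entropy-based measurement of this kind admits a compact sketch: the normalized Shannon entropy of the attacker's posterior over the anonymity set. The normalization to [0, 1] is an assumption about how AD is scaled.

```python
import math

def anonymity_degree(posterior):
    """AD of an anonymity set: H(posterior) / log2(k), in [0, 1].

    posterior[i] -- attacker's probability that user i issued the query.
    AD = 1 when all k users are equally likely; AD approaches 0 when
    one user is almost certainly identified.
    """
    if len(posterior) < 2:
        return 0.0
    h = -sum(p * math.log2(p) for p in posterior if p > 0)
    return h / math.log2(len(posterior))

print(anonymity_degree([0.25] * 4))                 # 1.0: perfect 4-anonymity
print(anonymity_degree([0.97, 0.01, 0.01, 0.01]))   # close to 0
```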
YANG Hao-Miao, SUN Shi-Xin, XU Ji-You
2009, 20(4):1069-1076.
Abstract: This paper proposes an efficient verifiably encrypted signature scheme without random oracles. The scheme is constructed from the recent Gentry signature and can be rigorously proven secure in the standard model. The scheme has several advantages over previous systems, such as shorter public keys, lower computation overhead, and a tighter security reduction; therefore, it is a truly practical verifiably encrypted signature without random oracles, which can be used in online contract signing protocols. Additionally, the proof of the scheme, which depends on the Strong Diffie-Hellman assumption, may be of independent interest.
HUANG Yong-Qin, ZHU Ying, JU Peng-Jin, WU Zhi-Yong, CHEN Cheng
2009, 20(4):1077-1086.
Abstract: Today's microprocessor designs are becoming more and more complicated, so effective and sufficient verification is one of the key factors in the success of a design tape-out. This paper first introduces common theories and methods in microprocessor functional verification, and then introduces the verification strategies and the various verification methods used to verify the "ShenWei-1" high-performance microprocessor. RTL (register transfer level) verification is highly important for functional verification, and simulation-based verification is the main method in "ShenWei-1" RTL verification. This paper describes how various verification technologies were used to solve the key problems of RTL simulation-based verification: generating high-quality stimuli, checking simulation results quickly, and reaching the verification coverage targets. Finally, this paper analyzes the effects of the various verification methods.