HAO Dan-Dan , ZOU Shi-Hong , CHENG Shi-Duan
Abstract:Taking system sum bandwidth into account, two heuristic dynamic spectrum assignment algorithms for open spectrum systems are proposed on the basis of existing ones, targeting convergence and fairness respectively: the fast convergence algorithm with maximum bandwidth (FCMB) and the heuristic fairness algorithm with maximum bandwidth (HFWB). The performance of FCMB and HFWB is compared by simulation with that of the collaborative max-sum-bandwidth (CMSB) algorithm, the randomized distributed (RAND) algorithm and the theoretical max-bandwidth optimal (OPTL) algorithm in terms of system sum bandwidth, fairness and convergence. Furthermore, the effect of the numbers of primary users, secondary users and channels, as well as the variance of the interference radius, on the performance of the two algorithms is studied. Experimental results show that, with system sum bandwidth taken into consideration, FCMB and HFWB outperform the other three algorithms in convergence and fairness respectively; in particular, FCMB shows superior convergence performance (an improvement of more than 300% over CMSB) while achieving similar system throughput.
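As a rough illustration of the kind of heuristic being compared here, the sketch below assigns channels greedily so as to maximize sum bandwidth subject to an interference conflict graph. It is a generic max-sum-bandwidth heuristic written for this summary, not the paper's FCMB or HFWB; all data structures and example values are hypothetical.

```python
# Generic greedy max-sum-bandwidth channel assignment on a conflict graph.
# Hypothetical data and policy: this is NOT the paper's FCMB or HFWB, only a
# sketch of the kind of heuristic the abstract evaluates.

def greedy_assign(users, bandwidth, conflicts, available):
    """bandwidth[u][c]: bandwidth user u would obtain on channel c;
    conflicts: set of frozenset({u, v}) pairs of interfering users;
    available[u]: channels user u may use."""
    assignment = {}                                  # user -> channel
    candidates = sorted(((bandwidth[u][c], u, c)     # best pairs first
                         for u in users for c in available[u]), reverse=True)
    for bw, u, c in candidates:
        if u in assignment:
            continue                                 # one channel per user here
        clash = any(assignment.get(v) == c and frozenset({u, v}) in conflicts
                    for v in assignment)
        if not clash:
            assignment[u] = c
    total = sum(bandwidth[u][ch] for u, ch in assignment.items())
    return assignment, total

if __name__ == "__main__":
    users = [0, 1, 2]
    bandwidth = {0: {"c0": 5, "c1": 3},
                 1: {"c0": 4, "c1": 4},
                 2: {"c0": 2, "c1": 6}}
    conflicts = {frozenset({0, 1})}                  # users 0 and 1 interfere
    available = {u: ["c0", "c1"] for u in users}
    print(greedy_assign(users, bandwidth, conflicts, available))
```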
LI Jian , ZHANG Tao , XIE Zhi-Yi , ZHU Hong
Abstract:This paper introduces a crossing elimination model based on node duplication, in which the goal is to minimize the number of duplications. The model is related to an abstract problem called the maximum simple sharing problem. The problem is first proved to be NP-hard, and a simple greedy algorithm achieving an approximation factor of 3 is then presented. Next, the maximum disjoint simple sharing problem is introduced, which naturally yields a 2-approximation of the maximum simple sharing problem, and the paper shows that this problem can be solved optimally by reduction to the perfect matching problem on a series of carefully constructed graphs. Finally, the approximation factor is further improved to 12/7 with a local search technique.
LIU Pei-Qiang , ZHU Da-Ming , XIE Qing-Song , FAN Hui , MA Shao-Han
Abstract:This paper proves that the binary fingerprint clustering problem with 2 missing values per fingerprint is NP-hard, and improves Figueroa's heuristic algorithm. The new algorithm improves the implementation of the original one. First, a linked list is used to store the sets of compatible vertices; the list can be produced by scanning the fingerprint vectors bit by bit. The time complexity of producing the sets of compatible vertices is thus reduced from O(m·n·2^p) to O(m·(n-p+1)·2^p), and the running time of finding a unique maximal clique or a maximal clique is improved from O(m·p·2^p) to O(m·2^p). Experiments on real data show that, on average, the improved algorithm uses at most 49% of the space of the original algorithm on the same instance and about 20% of its running time. In particular, the new algorithm almost always uses no more than 11% of the original algorithm's time on instances with more than 6 missing values per fingerprint.
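The notion of "compatible vertices" can be made concrete with a small sketch: a fingerprint with p missing entries is compatible with the 2^p binary vectors obtained by resolving those entries, and two fingerprints may be clustered together only if they share such a vector. The code below illustrates this expansion only; it does not reproduce the paper's linked-list construction or clique search, and the example fingerprints are made up.

```python
# Sketch of the "compatible vertices" idea: a fingerprint with p missing
# entries (None) is compatible with the 2^p binary vectors obtained by
# resolving the missing positions.  Illustrative only.
from itertools import product

def compatible_vectors(fingerprint):
    """fingerprint: tuple of 0, 1, or None (missing value)."""
    missing = [i for i, b in enumerate(fingerprint) if b is None]
    vectors = []
    for bits in product((0, 1), repeat=len(missing)):   # 2^p resolutions
        v = list(fingerprint)
        for pos, b in zip(missing, bits):
            v[pos] = b
        vectors.append(tuple(v))
    return vectors

def share_compatible_vector(f1, f2):
    """Two fingerprints can belong to the same cluster only if some binary
    vector is compatible with both of them."""
    return bool(set(compatible_vectors(f1)) & set(compatible_vectors(f2)))

if __name__ == "__main__":
    a = (1, None, 0, None)   # p = 2 missing values -> 4 compatible vectors
    b = (1, 1, 0, 0)
    print(compatible_vectors(a))
    print(share_compatible_vector(a, b))   # True: (1, 1, 0, 0) resolves a
```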
XU Dao-Yun , DENG Tian-Yan , ZHANG Qing-Shun
Abstract:A CNF formula F is linear if any two distinct clauses in F share at most one common variable. A CNF formula F is exact linear if any two distinct clauses in F share exactly one common variable. All exact linear formulas are satisfiable [1], and for the class LCNF of linear formulas, the decision problem LSAT remains NP-complete. For the subclasses LCNF≥k of LCNF, in which formulas have only clauses of length at least k, the NP-completeness of the decision problem LSAT≥k is closely tied to whether there exists an unsatisfiable formula in LCNF≥k; that is, the NP-completeness of SAT for LCNF≥k (k≥3) reduces to the question of whether LCNF≥k contains an unsatisfiable formula. S. Porschen et al. have shown that both LCNF≥3 and LCNF≥4 contain unsatisfiable formulas, by constructions based on hypergraphs and Latin squares. This leaves open the question of whether for each k≥5 there is an unsatisfiable formula in LCNF≥k. This paper presents a simple and general method for constructing unsatisfiable formulas in k-LCNF for each k≥3, by applying minimal unsatisfiable formulas to reductions between formulas. It is shown that for each k≥3 there exists a minimal unsatisfiable formula in k-LCNF. Therefore, the stronger result that k-LSAT is NP-complete for k≥3 is obtained.
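The definitions of linear and exact linear formulas translate directly into a small check; the sketch below, which assumes clauses represented as sets of DIMACS-style integer literals purely for illustration, verifies that every pair of distinct clauses shares at most one, or exactly one, variable.

```python
# Direct check of the definitions above: a CNF formula is linear if any two
# distinct clauses share at most one variable, and exact linear if they share
# exactly one.  Clauses are sets of integer literals (-3 = negation of var 3).

def shared_variables(c1, c2):
    return {abs(l) for l in c1} & {abs(l) for l in c2}

def is_linear(formula):
    return all(len(shared_variables(formula[i], formula[j])) <= 1
               for i in range(len(formula))
               for j in range(i + 1, len(formula)))

def is_exact_linear(formula):
    return all(len(shared_variables(formula[i], formula[j])) == 1
               for i in range(len(formula))
               for j in range(i + 1, len(formula)))

if __name__ == "__main__":
    f = [{1, 2, 3}, {-1, 4, 5}, {6, 7, 8}]
    # Clauses 1 and 2 share variable 1; the third clause shares none.
    print(is_linear(f), is_exact_linear(f))   # True, False
```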
SHI Xiao-Hua , WU Gan-Sha , JIN Mao-Zhong , LUEH Guei-Yuan , LIU Chao , WANG Lei
Abstract:Prior implementations of escape analysis (EA) make a closed-world/whole-program assumption: all methods that can possibly be executed during the program execution are known statically. The Java programming language has several language features, e.g., dynamic class loading, native invocation and reflection, which break the closed-world assumption. These open-world features raise a major concern about the practicability and applicability of prior approaches to Java applications that use or rely on them. This paper proposes and evaluates a new escape analysis framework that handles the Java open-world features. The framework also provides a mechanism that controls the analysis complexity to reduce the runtime overhead with acceptable precision. The results show that the EA framework, implemented in Intel's Open Runtime Platform on x86, eliminates 70% and 94% of dynamic synchronized operations and improves runtime performance by 15.77% and 31.28% for SPECjbb2000 and the db benchmark of SPECjvm98, respectively.
XIAO Jun-Chao , WANG Qing , LI Ming-Shu , ZHANG Lei , LIU Da-Peng
Abstract:This paper presents an organizational-entity-capability based software process modeling method (OEC-SPM). Addressing the particularities of software processes, it defines the organizational entity with certain capabilities, the process-agent, as the core element and basic unit of modeling. Under given project goals and environmental constraints, process-agents produce concrete software development and production processes according to their goals, knowledge, experience and capabilities, using proactive and autonomous reasoning about behaviors to provide effective support and sound decisions for software project development. Because the method fully considers the capabilities of the process executors when the software process is constructed, the established processes have good predictability and satisfy the preconditions for stable execution, fundamentally resolving the instability and uncontrollability of software processes.
XU Meng , CHEN Jun-Liang , PENG Yong , MEI Xiang
Abstract:This paper proposes an interactive and dynamic approach, the service relationship ontology-based Web service composition method (SROCM). SROCM analyzes service relationships from three aspects and presents a method for building the service relationship ontology (SRO). During composition, SROCM mines the dynamic semantics of services to capture users' interests and recommends subsequent services through the SRO. This semi-automatic method can produce an optimized service matchmaking result, adapt flexibly to the varied requirements of both customers and service providers, and maintain efficiency while composing services. Preliminary experiments show that the method can recommend suitable subsequent services according to users' interests, and that it is feasible and effective for dynamic service composition.
ZHANG Yang , HE Pi-Lian , XIANG Wei , LI Mu
Abstract:This paper proposes an approach to spelling correction that reranks the output of an existing spelling corrector, Aspell. A discriminative model (Ranking SVM) is employed to improve upon the initial ranking, using additional features as evidence. These features are derived from state-of-the-art techniques in spelling correction, including edit distance, letter-based n-grams, phonetic similarity and the noisy channel model. The paper also presents a method to automatically extract training samples from query log chains. The system greatly outperforms the baseline Aspell, as well as previous models and several off-the-shelf systems (e.g., the spelling corrector in Microsoft Word 2003). The experimental results based on query chain pairs are comparable to those based on manually annotated pairs, with 32.2% and 32.6% reductions in error rate, respectively.
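To make the reranking idea concrete, the sketch below scores each candidate correction with a weighted combination of two features, edit distance and a log-frequency prior, and reorders the candidate list. The features, weights and word frequencies are hypothetical illustrations; the paper itself trains a Ranking SVM over a richer feature set derived from Aspell's output.

```python
# Illustrative reranking of spelling-correction candidates by a linear
# combination of features.  Feature set, weights and frequencies are
# hypothetical; the paper learns the ranking function with a Ranking SVM.
import math

def edit_distance(a, b):
    """Standard Levenshtein distance with a single rolling DP row."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            cur = min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
            prev, dp[j] = dp[j], cur
    return dp[len(b)]

def rerank(query, candidates, unigram_freq, weights=(-1.0, 1.0)):
    """Score = w1 * edit_distance + w2 * log(1 + frequency)."""
    scored = []
    for cand in candidates:
        features = (edit_distance(query, cand),
                    math.log(1 + unigram_freq.get(cand, 0)))
        score = sum(w * f for w, f in zip(weights, features))
        scored.append((score, cand))
    return [c for _, c in sorted(scored, reverse=True)]

if __name__ == "__main__":
    freq = {"receive": 120000, "relieve": 30000, "recede": 8000}
    print(rerank("recieve", ["receive", "relieve", "recede"], freq))
```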
Abstract:This paper systematically clarifies granular rough theory from three aspects: its motivation, theory and implementation. The three expectations that motivate granular rough theory are analyzed as follows: 1) to emphasize the representational semantics of roughness, with semantic contexts explicitly encoded in the underlying representation model; 2) to extend the applicability of roughness to a wider range of information sources, with a representation model designed to accommodate semi-structured data; 3) to describe a variety of application contexts of information structure, to adapt the roughness methodology to disciplines driven by mereology, and to exhibit the potential of combining mereology and computer science for developing innovative interdisciplinary methodologies, via a purely mereological approach to roughness. From the theoretical perspective, a granular representation calculus is defined, which serves as a common representation model for both ordinary information sources and the roughness methodology. In terms of this model, and corresponding to the notions of lower approximation, border region and upper approximation in roughness, the kernel granule, hull granule and corpus granule are constructed, respectively. From the pragmatic perspective, on top of an open-source implementation of the "Entity-Attribute-Value" model, a rapid prototyping method for granular rough theory is described to provide a test-bed for verification and to apply the roughness methodology more naturally to the analysis of clinical data. The significance of granular rough theory, some open problems and directions for further research are summarized.
ZHANG Xue-Nong , JIANG Yun-Fei , CHEN Ai-Xiang , ZHANG Li-Cheng
Abstract:This paper investigates the decomposition of the diagnosis problem and gives a theorem for decomposing and combining diagnoses. On this basis, an algorithm that decomposes the diagnosis problem with a gradual approach is proposed, and its correctness, completeness and complexity are proved. Experimental results indicate that the algorithm clearly improves the effectiveness of diagnosing multi-output systems. Compared with the method of decomposition by assuming instantiations of some variables, the algorithm is more efficient and applies to more general diagnosis problems.
LI Yan-Hui , XU Bao-Wen , LU Jian-Jiang , KANG Da-Zhou
Abstract:This paper shows that the main difficulty in reasoning with general terminological axioms is that the membership degrees of fuzzy interpretations are not discrete but continuous in [0,1]. To remove this obstacle, the paper proposes a discretization method for fuzzy interpretations that translates membership degrees into discrete values in a finite set. Based on this discretization, it gives a discrete tableau reasoning technique for FALCN reasoning problems with general terminological axioms, consisting of the definition of discrete tableaus, a construction algorithm for discrete tableaus, and proofs of the soundness, completeness and complexity of the algorithm.
LIAO Bei-Shui , HUANG Hua-Xin , GAO Ji
Abstract:In the context of autonomic computing, by taking advantage of the non-monotonic knowledge representation and reasoning mechanisms of defeasible logic, a flexible agent model is proposed that is capable of accepting real-time rule modifications, flexibly handling run-time rule conflicts, and providing efficient non-monotonic reasoning functions. The flexible agent is both autonomous and controllable, and is able to cooperate with other agents via contracts in an open and dynamic environment.
FU Hui , LIU Xia-Bi , JIA Yun-De
Abstract:This paper proposes a maximum-minimum similarity (MMS) training algorithm to optimize the parameters of an effective text extraction method based on Gaussian mixture modeling of neighboring characters. MMS training optimizes recognizer performance by maximizing the similarities of positive samples and minimizing the similarities of negative samples. Based on this discriminative training approach, the paper defines an objective function for text extraction and uses gradient descent to search for the minimum of the objective function and the optimal parameters of the text extraction method. Experimental results demonstrate the effectiveness of MMS training for text extraction. Compared with maximum likelihood estimation of the parameters by the expectation maximization (EM) algorithm, MMS training greatly improves text extraction performance, achieving a recall rate of 98.55% and a precision rate of 93.56%. The results also show that MMS training performs better than the commonly used discriminative training criterion of minimum classification error (MCE).
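A minimal numeric sketch of the MMS idea, under the assumption of a toy model: the parameter is a prototype vector, similarity is a Gaussian kernel, and gradient descent minimizes an objective that penalizes low similarity on positive samples and high similarity on negative samples. The similarity measure and data are stand-ins, not the paper's Gaussian-mixture model of neighboring characters.

```python
# Toy illustration of maximum-minimum similarity (MMS) training: minimize
#   sum(1 - sim(positive)) + sum(sim(negative))
# by gradient descent, where sim is a Gaussian kernel around a prototype.
import numpy as np

def similarity(theta, x, gamma=1.0):
    return np.exp(-gamma * np.sum((x - theta) ** 2))

def objective(theta, positives, negatives):
    loss_pos = sum(1.0 - similarity(theta, x) for x in positives)
    loss_neg = sum(similarity(theta, x) for x in negatives)
    return loss_pos + loss_neg

def gradient(theta, positives, negatives, gamma=1.0):
    g = np.zeros_like(theta)
    for x in positives:   # d/dtheta [1 - sim] = -2*gamma*(x - theta)*sim
        g += -2.0 * gamma * (x - theta) * similarity(theta, x, gamma)
    for x in negatives:   # d/dtheta [sim]     =  2*gamma*(x - theta)*sim
        g += 2.0 * gamma * (x - theta) * similarity(theta, x, gamma)
    return g

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    positives = rng.normal(loc=1.0, scale=0.2, size=(20, 2))
    negatives = rng.normal(loc=-1.0, scale=0.2, size=(20, 2))
    theta, lr = np.zeros(2), 0.01
    for _ in range(500):
        theta -= lr * gradient(theta, positives, negatives)
    print(theta, objective(theta, positives, negatives))
```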
ZHU Jing-Bo , WANG Hui-Zhen , ZHANG Xi-Juan
Abstract:This paper analyzes the confusion class phenomenon in text classification and studies confusion class discrimination techniques to improve classification performance. First, a confusion class recognition technique based on the classification error distribution is proposed to identify the confusion class sets in a pre-defined taxonomy. To discriminate confusion classes effectively, the paper proposes a feature selection approach based on discrimination capability, in which each candidate feature's discrimination capability for a class pair is evaluated. Finally, a two-stage classifier integrates the baseline classifier with the confusion class classifiers, and the output results of the two stages are combined into the final result. A confusion class classifier in the second stage is activated only when the class assigned to the input text by the baseline classifier in the first stage belongs to a confusion class set, in which case the confusion class classifier re-discriminates the text. In the comparison experiments, the Newsgroup collection and the 863 Chinese evaluation data collection are used to evaluate the effectiveness of the proposed techniques. Experimental results show that the methods can significantly improve the performance of the single-label and multi-class classifier (SMC).
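A hedged sketch of the first step, confusion class recognition from the classification error distribution: class pairs whose mutual error rate in a confusion matrix exceeds a threshold are grouped into confusion class sets. The threshold, labels and matrix below are hypothetical, and the paper's exact criterion may differ.

```python
# Sketch of confusion-class recognition from a confusion matrix.  Values and
# the threshold are hypothetical, not the paper's exact criterion.

def confusion_pairs(confusion_matrix, labels, threshold=0.10):
    """confusion_matrix[i][j] = count of class-i documents labeled as j."""
    pairs = []
    for i, row in enumerate(confusion_matrix):
        total = sum(row)
        for j, count in enumerate(row):
            if i != j and total > 0 and count / total >= threshold:
                pairs.append((labels[i], labels[j], count / total))
    return pairs

def merge_into_sets(pairs):
    """Group confusable class pairs into confusion class sets."""
    groups = []
    for a, b, _ in pairs:
        hit = [g for g in groups if a in g or b in g]
        merged = set().union(*hit) if hit else set()
        merged.update({a, b})
        groups = [g for g in groups if g not in hit] + [merged]
    return groups

if __name__ == "__main__":
    labels = ["politics", "economy", "sports"]
    cm = [[80, 15, 5],
          [18, 78, 4],
          [2, 3, 95]]
    pairs = confusion_pairs(cm, labels)
    print(pairs)                 # politics <-> economy exceed the threshold
    print(merge_into_sets(pairs))
```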
HU Qing-Hua , YU Da-Ren , XIE Zong-Xia
Abstract:To deal with numerical features, a neighborhood rough set model is proposed based on the definitions of neighborhoods and neighborhood relations in metric spaces. Each object in the universe is assigned a neighborhood subset, called a neighborhood granule. The family of neighborhood granules forms a concept system that approximates an arbitrary subset of the universe with two unions of neighborhood granules: the lower approximation and the upper approximation. The concepts of neighborhood information systems and neighborhood decision tables are thereby introduced, and the properties of the model are discussed. Furthermore, the dependency function is used to evaluate the significance of numerical attributes, and a forward greedy numerical attribute reduction algorithm is constructed. Experimental results with UCI data sets show that the neighborhood model can select a small number of attributes while keeping, and even improving, classification power.
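The core definitions can be sketched directly: each sample's delta-neighborhood under a metric is its granule, a sample lies in the lower approximation of its decision class when its whole granule agrees on the decision, and the dependency function (the fraction of such consistent samples) drives forward greedy attribute selection. The value of delta and the toy data below are assumptions made for illustration.

```python
# Sketch of the neighborhood rough set model: delta-neighborhood granules,
# lower-approximation consistency, dependency, and forward greedy reduction.
# delta and the synthetic data are illustrative assumptions.
import numpy as np

def neighborhood(X, i, attrs, delta):
    d = np.sqrt(((X[:, attrs] - X[i, attrs]) ** 2).sum(axis=1))
    return np.where(d <= delta)[0]

def dependency(X, y, attrs, delta=0.2):
    positive = 0
    for i in range(len(X)):
        granule = neighborhood(X, i, attrs, delta)
        if np.all(y[granule] == y[i]):     # granule consistent with decision
            positive += 1
    return positive / len(X)

def forward_reduction(X, y, delta=0.2, eps=1e-6):
    """Greedily add the attribute giving the largest dependency increase."""
    remaining, selected, best = list(range(X.shape[1])), [], 0.0
    while remaining:
        gain, attr = max((dependency(X, y, selected + [a], delta), a)
                         for a in remaining)
        if gain - best <= eps:
            break
        selected.append(attr)
        remaining.remove(attr)
        best = gain
    return selected, best

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((60, 4))                # 4 numerical attributes in [0, 1]
    y = (X[:, 0] > 0.5).astype(int)        # decision depends only on attribute 0
    print(forward_reduction(X, y))         # typically picks attribute 0 first
```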
HU Min , WU Geng-Feng , YANG Jing
Abstract:To address the difficulty that good segments of a formula are hard to inherit in formula discovery using gene expression programming (GEP), this paper proposes an immune formula discovering algorithm (IFDA) inspired by the MHC (major histocompatibility complex) regulation principle of immune theory. In IFDA, a formula is mapped to a tree structure and transformed, with a depth-first mechanism, into the constant and variation sections of an antibody, while its fragments are encoded into the MHC. Through MHC regulation, IFDA mines the data set to discover a proper formula very quickly. Benchmark experiments on many data sets show that IFDA outperforms GEP in both convergence speed and formula complexity.
Abstract:In this paper, an algorithm named DE-Tri-training semi-supervised K-means is proposed, which obtains a seed set of larger scale and lower noise. In detail, before the seed set is used to initialize the cluster centroids, the training process of a semi-supervised classification approach named Tri-training is used to label unlabeled data and add them to the initial seed set to enlarge its scale. Meanwhile, to improve the quality of the enlarged seed set, a nearest-neighbor-rule-based data editing technique named Depuration is introduced into the Tri-training process to eliminate and correct mislabeled noise data in the enlarged seeds. Experimental results show that the novel semi-supervised clustering algorithm can effectively improve cluster centroid initialization and enhance clustering performance.
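A simplified sketch of the pipeline, with the Tri-training enlargement step omitted: a Depuration-style nearest-neighbor edit corrects or discards unreliable seeds, class means of the cleaned seed set initialize the centroids, and standard K-means iterations follow. The parameters and synthetic data are illustrative assumptions.

```python
# Simplified seed-based K-means initialization with nearest-neighbor editing.
# The Tri-training enlargement step of the paper is omitted; only the
# Depuration-style edit and centroid initialization are illustrated.
import numpy as np

def depuration(X, y, k=3):
    """Relabel a seed to the majority label of its k nearest other seeds when
    that majority is clear; otherwise discard the seed as unreliable."""
    keep_X, keep_y = [], []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        nn = np.argsort(d)[1:k + 1]                  # skip the point itself
        values, counts = np.unique(y[nn], return_counts=True)
        major, support = values[np.argmax(counts)], counts.max()
        if support > k / 2:
            keep_X.append(X[i]); keep_y.append(major)
    return np.array(keep_X), np.array(keep_y)

def seeded_kmeans(X_all, seed_X, seed_y, iters=20):
    centroids = np.array([seed_X[seed_y == c].mean(axis=0)
                          for c in np.unique(seed_y)])
    for _ in range(iters):
        assign = np.argmin(np.linalg.norm(
            X_all[:, None, :] - centroids[None, :, :], axis=2), axis=1)
        centroids = np.array([X_all[assign == c].mean(axis=0)
                              if np.any(assign == c) else centroids[c]
                              for c in range(len(centroids))])
    return centroids, assign

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    X_all = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(2, 0.3, (50, 2))])
    seed_X = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(2, 0.3, (10, 2))])
    seed_y = np.array([0] * 10 + [1] * 10)
    seed_y[0] = 1                                    # inject one mislabeled seed
    clean_X, clean_y = depuration(seed_X, seed_y)
    print(seeded_kmeans(X_all, clean_X, clean_y)[0])
```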
SUN Qin-Dong , HUANG Xin-Bo , WANG Qian
Abstract:The characteristics of multiple pattern matching in mixed Chinese and English text and the problems of existing multiple pattern matching algorithms applied to such text are analyzed. A theorem of multiple pattern matching in mixed Chinese and English text is established and proved. A novel multiple pattern matching algorithm based on a threaded trie tree is proposed, which expands the standard trie structure, constructs a hash trie matching machine from the codes of Chinese and English characters, and threads the trie tree according to the characteristics of the pattern set. The proposed algorithm requires no complex hash operations, and the matching pointer never backtracks during matching. Theoretical analysis and experimental results demonstrate that the proposed algorithm effectively solves the space expansion problem, and processes mixed Chinese and English text correctly and efficiently with low time and space complexity.
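For orientation, the sketch below is a plain, unthreaded trie matcher over Unicode characters, so mixed Chinese and English patterns are handled uniformly; it restarts at every text position and therefore lacks exactly the property the threaded trie provides, namely matching without backtracking. It only illustrates the structure being extended.

```python
# Minimal trie-based multiple pattern matcher over Unicode characters.
# This is a plain restart-on-mismatch trie, NOT the paper's threaded
# hash-trie machine; shown only to illustrate the underlying structure.

class TrieNode:
    __slots__ = ("children", "output")
    def __init__(self):
        self.children = {}
        self.output = None          # pattern ending at this node, if any

def build_trie(patterns):
    root = TrieNode()
    for p in patterns:
        node = root
        for ch in p:
            node = node.children.setdefault(ch, TrieNode())
        node.output = p
    return root

def match(text, root):
    """Report (start, pattern) for every occurrence; worst case O(n * maxlen),
    whereas a threaded trie avoids re-scanning the text."""
    hits = []
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.children.get(ch)
            if node is None:
                break
            if node.output is not None:
                hits.append((start, node.output))
    return hits

if __name__ == "__main__":
    trie = build_trie(["模式", "pattern", "匹配算法"])
    print(match("多模式pattern匹配算法示例", trie))
```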
Abstract:This paper first analyzes the technical difficulties of capacity estimation and optimization theory for wireless mesh networks and summarizes the prospects in this area. Based on existing work, it briefly introduces the interference model and the scheduling model of the capacity analysis problem, and proposes an optimization model built on these two models. The paper then reviews the mathematical models used in capacity analysis, including programming models, information-theoretic models, combinatorial optimization models and stochastic models. Evaluation metrics for capacity analysis models are proposed, and the different models are evaluated against them. Finally, future work on capacity analysis and optimization theory is discussed.
ZHUGE Jian-Wei , HAN Xin-Hui , ZHOU Yong-Lin , YE Zhi-Yuan , ZOU Wei
Abstract:The botnet is a novel attack strategy evolved from traditional malware forms; it provides attackers with stealthy, flexible and efficient one-to-many command and control mechanisms, which can be used to order an army of zombies to achieve goals including information theft, launching distributed denial of service attacks, and sending spam. Botnets have entered an expansion phase and have become a serious threat to Internet security, especially in mainland China. In this paper, the evolution process, concept, functional structure and execution mechanism of botnets are presented, the command and control mechanisms and propagation models are discussed, and the latest techniques for botnet tracking, detection and prevention are reviewed. The developing trends of botnets and further research topics in this area are also analyzed.
LI Fang-Min , XU Wen-Jun , LIU Xin-Hua
Abstract:After analyzing the power control mechanisms in wireless sensor networks, this paper presents the design principles and classification criteria of power control. The fundamental mechanisms of existing representative power control protocols are then analyzed in detail, and their classifications, characteristics and performance are compared. Finally, in light of the current state of development, the restrictions on practical applications of wireless sensor networks are summarized, and the importance of establishing an adaptive power control model based on empirical studies and data analysis methods is pointed out.
LI Yu-Feng , QIU Han , LAN Ju-Long , WANG Bin-Qiang
Abstract:A brief summary of the effect of router buffers on network performance is presented. Previous buffer sizing work based on queuing theory is analyzed under two classes of traffic models: models exhibiting long-range dependence and models exhibiting short-range dependence. Another line of buffer sizing work is based on TCP models; the rule of thumb, the small buffers rule, the drop-based buffers rule and the tiny buffers rule are the four main recent results of this line. The results of the four rules and some preliminary experiments validating them are carefully summarized. Research directions and open problems are also discussed.
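For scale, the two best-known rules can be compared numerically: the rule of thumb sizes the buffer at the bandwidth-delay product B = C x RTT, while the small buffers rule divides this by the square root of the number of long-lived TCP flows, B = C x RTT / sqrt(N) (Appenzeller et al.). The link speed, RTT and flow count in the sketch are hypothetical example values.

```python
# Back-of-the-envelope comparison of two buffer sizing rules from the survey:
# rule of thumb B = C * RTT, and small buffers B = C * RTT / sqrt(N).
# Link speed, RTT and flow count are hypothetical example values.
import math

def rule_of_thumb(capacity_bps, rtt_s):
    return capacity_bps * rtt_s / 8            # bytes of buffering

def small_buffers(capacity_bps, rtt_s, n_flows):
    return rule_of_thumb(capacity_bps, rtt_s) / math.sqrt(n_flows)

if __name__ == "__main__":
    C = 10e9            # 10 Gb/s link
    RTT = 0.1           # 100 ms average round-trip time
    N = 10000           # long-lived TCP flows sharing the link
    print("rule of thumb : %.1f MB" % (rule_of_thumb(C, RTT) / 1e6))
    print("small buffers : %.2f MB" % (small_buffers(C, RTT, N) / 1e6))
    # The tiny-buffers line of work argues that a few dozen packets can
    # suffice under additional assumptions such as paced, non-bursty traffic.
```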
LI Chun-Hong , FENG Guo-Fu , GU Tie-Cheng , LU Sang-Lu , CHEN Dao-Xu
Abstract:Overlay Web caching (OCC) exploits the resources of peers to provide scalable and cost-effective Web caching service. In a typical OCC system, which is often characterized by highly heterogeneous node capacities and skewed query distributions, the resources of each node may be utilized in an unbalanced manner, i.e., some nodes are overloaded and become "hotspots". Unfortunately, existing OCC systems have no effective load balancing mechanisms to relieve these hotspots. This paper proposes a hotspot-free OCC scheme called HFOCC for multimedia content delivery service. By adaptively replicating "hot" objects to lightly loaded nodes, loads are distributed more evenly across the whole network, and the hotspots are consequently relieved. In order to utilize cache resources more effectively, HFOCC dynamically splits a node's cache space into two parts, the home cache and the replica cache, and manages them with a uniform policy. With a "soft" lifetime control mechanism, redundant object replicas are deleted adaptively, and the system performs well under dynamically changing workloads. Experimental results show that HFOCC improves resource utilization and system throughput markedly.
Abstract:Due to the absence of a centralized authority, the service reliability of a wireless ad hoc network is seriously affected by the selfish actions of rational nodes during packet forwarding. This paper proposes a repeated-game model of node behavior that takes account of selfish nodes' future payoff expectations and their long-term desire for profit. An incentive-compatible condition, under which a selfish node is deterred from cheating by the subsequent punishments and turns to cooperation, is derived analytically. The impacts on selfish nodes' behavior of their willingness for future collaboration, the parameter settings of the punishment mechanism, and the efficiency of misbehavior detection are also discussed. Simulation results show that the increase of network scale, the deterioration of nodes' collaborative patience and low misbehavior detection efficiency motivate entities toward self-interested action, but this tendency can be neutralized by a careful configuration of the punishment mechanism in the model.
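As a generic illustration of such an incentive-compatible condition (not the paper's exact derivation): if a node gains at most $g$ from a one-shot deviation, loses $l$ per period under the subsequent punishment, and discounts future payoffs by a factor $\delta \in (0,1)$, the standard repeated-game argument deters cheating whenever

\[ g \;\le\; \frac{\delta\, l}{1-\delta}, \]

so greater patience (larger $\delta$), harsher punishments (larger $l$) and more reliable misbehavior detection all enlarge the cooperative regime, matching the qualitative trends reported above.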
XU Wei-Qiang , WU Tie-Jun , WANG Ya-Ming , ZHANG Yun-Hua , CHEN Ji-Ming
Abstract:It is vital to implement multicast congestion control in networks, because multicast improves link transmission efficiency but is apt to cause network congestion. However, multicast congestion control designed for the Internet is not suitable for ad hoc networks, due to the following features: (1) multi-hop wireless transmission, which creates contention between flows in both the time domain and the spatial domain; and (2) frequent node mobility, which leads to time-varying network conditions. In this paper, the notion of a link's interference set is introduced to describe the contention relationship between flows, and a nonlinear optimization problem is formulated for multirate multicast congestion control over a small time interval during which network conditions can be regarded as time-invariant. The penalty function method and the subgradient method are jointly applied to obtain the optimal solution, yielding a distributed iterative algorithm. On the basis of this distributed algorithm, an adaptive congestion control strategy for multirate multicast sessions (AC2M2) is proposed to cope with time-varying network conditions by means of network status detection and receding-horizon optimization. Simulation results show that the proposed distributed algorithm quickly converges to globally optimal solutions; compared with the AIMD algorithm in TCP-Reno, AC2M2 adapts better to time-varying network conditions and achieves better network performance.
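A toy sketch of the optimization machinery named above: maximize the sum of logarithmic flow utilities subject to capacity constraints over interference sets, handled here with a quadratic penalty and gradient ascent in the spirit of the penalty-function approach. The flow topology, capacities and step sizes are hypothetical, and this is not the AC2M2 protocol itself.

```python
# Toy penalty-based rate allocation: maximize sum_s log(x_s) subject to
# per-interference-set capacity constraints, via gradient ascent on a
# quadratically penalized objective.  All parameters are hypothetical.
import numpy as np

def penalized_gradient(x, interference_sets, capacity, mu):
    g = 1.0 / x                                      # gradient of sum(log x_s)
    for members in interference_sets:
        overload = x[members].sum() - capacity
        if overload > 0:                             # penalty term is active
            g[members] -= mu * overload
    return g

def solve(n_flows, interference_sets, capacity=1.0, mu=50.0,
          step=0.005, iters=5000):
    x = np.full(n_flows, 0.1)
    for _ in range(iters):
        x += step * penalized_gradient(x, interference_sets, capacity, mu)
        x = np.maximum(x, 1e-4)                      # keep rates positive
    return x

if __name__ == "__main__":
    # 3 flows; flows 0 and 1 share one interference set, flows 1 and 2 another.
    sets = [np.array([0, 1]), np.array([1, 2])]
    rates = solve(3, sets)
    print(np.round(rates, 3))   # flow 1 sits in both sets, so it gets less rate
```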