ZHOU Tian-Yang , ZHU Jun-Hu , LI He-Shuai , WANG Qing-Xian
Abstract:Hardware Virtualization-Based Rootkit (HVBR) is one of the many new kinds of malware to appear in recent years. Compared with traditional rootkits, HVBR is stealthier and more difficult to detect. This paper analyzes the concealment and working mechanism of HVBR. Because HVBR counters detection by bypassing virtual memory scans, this paper proposes a detection approach based on physical memory search. The approach modifies Page Table Entries (PTEs) to traverse the physical memory, and matches the fixed characteristics of HVBR against the raw memory data to detect and locate HVBR in memory. The experimental results show that the approach is reliable and efficient.
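The matching step can be illustrated with a minimal sketch, assuming the physical memory is already available as a raw byte stream (for example, a dump file). The PTE-based traversal itself is not shown. The function name `find_signature`, the chunk size, and any signature bytes passed to it are hypothetical and are not taken from the paper.

```python
import io


def find_signature(stream, signature, chunk_size=4096):
    """Return the absolute offsets at which `signature` occurs in `stream`.

    The stream is read in fixed-size chunks. A short tail of each window is
    carried over so that a match straddling a chunk boundary is not missed.
    """
    if not signature:
        raise ValueError("signature must be non-empty")
    overlap = len(signature) - 1
    offsets = []
    tail = b""
    base = 0  # absolute offset of the first byte of `tail + chunk`
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        pos = window.find(signature)
        while pos != -1:
            offsets.append(base + pos)
            pos = window.find(signature, pos + 1)
        # The tail is shorter than the signature, so no match is counted twice.
        tail = window[-overlap:] if overlap else b""
        base += len(window) - len(tail)
    return offsets
```

Scanning in chunks keeps memory use bounded, which matters when the input is the size of physical RAM; the reported offsets can then be mapped back to physical addresses.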
Abstract:This paper focuses on the attack construction problem and reports research on a method of constructing collusion attacks against free-roaming mobile agent data integrity protection protocols. Using a definition of data integrity and a formal specification method, it proposes a new attack on the Cheng-Wei protocol. In this attack, two colluding hosts can, with some probability, truncate the data collected by a mobile agent. This kind of attack is effective against mobile agent data integrity protection protocols based on proof chains.
ZHANG Lian-Cheng , WANG Zhen-Xing , XU Jing
Abstract:Watermark carriers of existing network flow watermarking schemes are limited to packet payload, traffic rate, and packet timing. However, packet payload based flow watermarking schemes depend on specific application protocols, such as telnet and rlogin, and fail on encrypted traffic, whose payload is invisible to traffic interceptors. At the same time, traffic rate and packet timing based schemes are vulnerable to timing perturbation introduced by network transmission and by attackers. Even worse, most of them have a low watermark capacity and are vulnerable to multi-flow attacks, mean-square autocorrelation attacks, and timing analysis attacks. This paper uses packet order as a watermark carrier and proposes a novel packet reordering based flow watermarking (PROFW) scheme. To achieve robustness against packet out-of-order perturbation, error-correcting code theory is introduced into watermark encoding. Meanwhile, this paper uses a stochastic modulation approach to increase the stealthiness of the PROFW scheme by keeping the degree of packet reordering within normal levels. Empirical results demonstrate its robustness against timing and packet out-of-order perturbations, whether introduced by network transmission or deliberately by attackers. Compared with typical flow watermarking schemes, the PROFW scheme has a higher watermark capacity and is more robust against timing and packet out-of-order perturbations.
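The idea of carrying a watermark in packet order, protected by an error-correcting code, can be sketched as a toy example. It is not the PROFW encoding, which the abstract does not specify. As assumptions, packets are represented by their sequence numbers, a coded bit 1 swaps one adjacent packet pair, and a simple repetition code with majority voting stands in for the error-correcting code. The names `embed` and `extract` are hypothetical, and stochastic modulation is not modeled.

```python
def embed(packets, bits, repeat=3):
    """Embed bits into packet order: a coded 1 swaps one adjacent pair."""
    coded = [b for b in bits for _ in range(repeat)]  # repetition code
    if 2 * len(coded) > len(packets):
        raise ValueError("not enough packets for this watermark")
    out = list(packets)
    for i, bit in enumerate(coded):
        if bit:
            out[2 * i], out[2 * i + 1] = out[2 * i + 1], out[2 * i]
    return out


def extract(packets, n_bits, repeat=3):
    """Recover bits by majority vote over each group of packet pairs."""
    bits = []
    for k in range(n_bits):
        votes = 0
        for j in range(repeat):
            i = k * repeat + j
            # Packets are sequence numbers; a descending pair reads as 1.
            votes += packets[2 * i] > packets[2 * i + 1]
        bits.append(1 if 2 * votes > repeat else 0)
    return bits
```

With `repeat=3`, one reordered or restored pair per bit is tolerated, at the cost of three packet pairs per watermark bit.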
LIU Jian-Xiao , HE Ke-Qing , WANG Jian , FENG Zai-Wen , NING Da
Abstract:In the modern world of service-oriented software engineering (SOSE), services can be aggregated from the semantic interoperability level to meet users’ personal and diversified needs. First, the paper proposes a service clustering method based on service ontology. It clusters services from the function perspective to form service clusters. This can significantly reduce the overhead and enhance the efficiency of service discovery. In addition, it makes use of service capability and interaction information to organize the service clusters from the semantic interoperability level. Furthermore, it discusses the problem of sufficient and necessary capability, and interoperability type. Users can thus efficiently discover the services that meet their needs. The corresponding service clustering and discovery algorithms are also designed. Finally, the feasibility and effectiveness of the proposed methods are validated through experiments and a practical case study.
MO Tong , CHU Wei-Jie , LI Wei-Ping , WU Zhong-Hai , LIN Hui-Ping
Abstract:With the development of service computing and the Internet of Things, software systems can actively discover and provide services to customers based on context information. Compared with traditional service discovery, the service requirement is unknown in active service discovery. The system needs to analyze current demands from the customer’s context-aware information and choose the proper service to provide. With this as the focus, an active service discovery method based on context-aware events is proposed. First, a change of context-aware information is defined as a context event, and the relationship between context events is expressed by an event-driven graph. Second, an event-service FP-tree is built by mining the service log. On the basis of these two definitions, a service discovery algorithm driven by the current context-aware event is implemented. Experimental results show that, compared with broadcasting, which is the general active approach, this approach improves the precision of active service discovery.
YANG Zhi , WU Bu-Dan , CHEN Jun-Liang
Abstract:Web services are becoming the next generation of web-based applications. As the quality and quantity of services increase, how to recommend suitable services according to personalized requirements becomes an urgent question. In existing approaches to service recommendation, the result is a service list with no evaluation standard for distinguishing services of high relevance from those of low relevance. Therefore, in practice, users may obtain low-relevance services. To address this problem, this paper analyzes the membership function and proposes a recommendation measure standard. Using dynamic programming theory, it provides an ontology-based approach to service recommendation. In the recommendation result, membership is used as the measure index to separate high-relevance services from low-relevance ones. High-relevance services are recommended to the user, so the recommended services are accurate and available.
GE Liang , ZHANG Bin , LIU Ying , LI Fei
Abstract:With the widespread adoption of SOA in large-scale distributed systems, the Service-Based Software (SBS) system has attracted much attention in the field of software engineering. Capable of monitoring system status and dynamically adjusting accordingly, the Adaptive Service-Based Software (ASBS) system enhances SBS with self-adaptive ability to satisfy system requirements in aspects such as QoS. This paper defines the structural description of ASBS, proposes a performance evaluation model based on reflective Petri nets, and introduces approaches for constructing and analyzing the performance model, which also incorporates explorations in ASBS performance analysis.
WANG Yun-Tao , YU Chun , QIN Yong-Qiang , SHI Yuan-Chun
Abstract:This research designs and implements a marker recognition system that allows detection of tangible objects on a very large tabletop system. The design of the system is challenged by constraints including the minimal distinguishable size, non-linear imaging of the camera, and blind regions caused by active IR illumination. After iterating through several designs, the study proposes the uMarker, an image-coding paper marker. Compared to the widely used reacTIVision, it is 75.7% smaller in size and 16.7% higher in coding capacity. The marker recognition system recognizes uMarker through image processing and achieves good performance in capacity, recognition accuracy, and efficiency. Furthermore, it can distinguish between fingers and uMarker and remove duplicated markers. In conclusion, uMarker extends the tabletop system with the ability to recognize tangible objects, including information about location, direction, and category.
ZHUANG Lian-Sheng , GAO Hao-Yuan , LIU Chao , YU Neng-Hai
Abstract:Feature quantization is an important component of the bag-of-words model. This paper proposes a novel method called nonnegative sparse locally linear coding (NSLLC) to improve the performance of locally linear coding. The core idea of NSLLC is to use nonnegative sparse representation to select the nearest neighbors in the same subspace and then encode the local feature with respect to the local coordinate system formed by these nearest neighbors. Experimental results show that NSLLC outperforms state-of-the-art local feature coding methods and benefits image classification.
GENG Hui-Dong , YU Zhi-Wen , ZHANG Xin-Xin , XIA Yun-Yun , WANG Hai-Peng
Abstract:As people grow older, their memory declines, and they often cannot find items in their daily lives. This paper presents an object searching system for the smart home. Similar to Web search engines, it returns relevant information about the searched object to the user. The system uses UWB devices for indoor object localization and obtains the user’s current context information from various sensors in the smart home. Thus, combining this context with the user’s original search input, the system can infer the user’s real search intention and offer intelligent search services.
MA Jun , CAO Jian-Nong , MA Chao , TAO Xian-Ping , Lü Jian
Abstract:Context awareness is one of the key characteristics of ubiquitous and pervasive computing. Most current formalization work focuses on two aspects: context representation and system modeling. However, how the temporal property of context can be modeled and how context can be manipulated are not well addressed. This paper proposes a formal approach based on set theory for modeling and manipulating context, in which a context is defined as a set of context entries, and operators are introduced to specify how to manipulate contexts according to the needs of different applications. To show the usability of the proposed model, a demo implementation is also included in this paper.
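A set-based context model of this kind can be illustrated with a minimal sketch. The entry fields (`name`, `value`, `time`) and the operators `merge`, `restrict`, and `latest` are assumptions made for illustration; the abstract does not list the paper's actual entry structure or operators.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One context entry; the fields are hypothetical."""
    name: str   # kind of context, e.g. "location"
    value: str  # observed value
    time: int   # timestamp in arbitrary units


def merge(a, b):
    """Union of two contexts (each a frozenset of entries)."""
    return a | b


def restrict(ctx, start, end):
    """Keep only entries whose timestamp lies in [start, end]."""
    return frozenset(e for e in ctx if start <= e.time <= end)


def latest(ctx, name):
    """Most recent entry with the given name, or None if there is none."""
    candidates = [e for e in ctx if e.name == name]
    return max(candidates, key=lambda e: e.time, default=None)
```

Attaching a timestamp to each entry lets temporal queries such as `restrict` be expressed as ordinary set comprehensions.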
DU Yi , TIAN Feng , DAI Guo-Zhong , WANG Feng , WANG Hong-An
Abstract:In intelligent user interfaces, it is very important to help developers design the user interface of an application. Nowadays, the number of mobile applications is increasing greatly, but there is no proper user model to guide the design and development of their user interfaces. This article takes different types of functions and parameters into consideration, and proposes a user model and a user modeling method based on activity theory. In addition, this article describes the improved VSM algorithm in detail. Finally, an application and an informal experiment are designed to demonstrate the efficiency of the proposed user model. This work can be used to guide the design and development of user interfaces in mobile environments.
XU Yu-Qiong , SHEN Zong-Jia , PAN Gang , LI Shi-Jian
Abstract:The rise of the “multiple devices per user” computing model and the requirement of mobility across physical environments bring great challenges to task migration in four aspects: continuity, transparency, heterogeneity, and generality. This paper proposes the concept of “virtual user space” and establishes a task migration framework based on it, called TaskShadow-V, to address these challenges. Virtual user space guarantees the continuity of user tasks and the capability of migrating across mobile devices. Meanwhile, a context-based mechanism for intelligent migration decision-making is proposed to ensure the transparency of task migration. This mechanism employs virtual user space as the migration granularity and uses context information to make migration decisions automatically. The study conducts an experiment to verify the effectiveness of the proposed TaskShadow-V framework.
CHEN Yi-Qiang , LI Qiu-Shi , LIU Jun-Fa , HU Kun , CHEN Zhen-Yu
Abstract:Traditional context-aware systems on mobile platforms mainly focus on using various localization technologies to detect and recognize significantly meaningful places. However, they cannot intuitively describe the dynamic semantic context of the surroundings. In this paper, a novel context sensing approach is proposed to distinguish typical contexts based on dynamic Bluetooth information. The study builds a context classification model by observing the occurrence of ambient Bluetooth devices and extracting dynamic statistical features, and further applies the model to inferring semantic social context from Bluetooth traces of real-world personal lives. Evaluation results show that, based on dynamic Bluetooth information alone, the proposed feature extraction methods and a DT (Decision Tree) achieve an average accuracy of 86.8% in recognizing six representative short time-length contexts, which outperforms several traditional machine learning methods. In addition, the accuracy of long time-length context inference reaches 92% without any information other than Bluetooth.
HU Jia-Feng , JIN Bei-Hong , ZHUO Wei , CHEN Hai-Biao , ZHANG Li-Feng
Abstract:Currently, various location-based applications with bright prospects are emerging, such as dynamic location alarm services and location-based shopping promotion services. The core technique of these applications is spatial event detection. The paper adopts Pub/Sub middleware to detect spatial events and presents the basic detection method. Moreover, the paper presents an optimization strategy for detecting events that match unary location subscriptions. Specifically, the paper explores the relations between the regions covered by events and builds multilevel indexes to improve the processing of spatial events. In addition, it uses the computational capacity of client computers to calculate and maintain safe regions on the clients. In this way, events can be filtered on the clients, and the workload on the server can be reduced. The paper also conducts simulation experiments on a system that implements the proposed speed-up strategies to evaluate its performance and costs. The experimental data show that the speed-up strategies efficiently accelerate spatial event detection.
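Client-side filtering with a safe region can be sketched as follows. As assumptions not stated in the abstract, the safe region is modeled as an axis-aligned rectangle inside which no subscription result can change, and a client forwards a position to the server only when it falls outside that rectangle. The names `Rect` and `filter_updates` are hypothetical, and the computation of the safe region itself is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle standing in for a safe region."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def filter_updates(positions, safe_region):
    """Return only the positions a client would forward to the server."""
    return [(x, y) for x, y in positions if not safe_region.contains(x, y)]
```

In this sketch every position inside the rectangle is dropped locally, which is how such filtering would shift work from the server to the clients.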
GAO Xing-Yu , CAO Xiao-Lin , ZHAO Wei-Bo , ZHANG Ai-Qing , MO Ze-Yao
Abstract:Targeting large-scale computation on tens of thousands of cores, the parallel software infrastructure JASMIN has released a new version with improved enabling techniques and numerical algorithms. With backward-compatible programming interfaces, the new version enhances the scalability of programs without any effort from application users. To investigate the scalability of programs based on JASMIN, we test and analyze the performance of five complex application programs on tens of thousands of cores of the TH-1A supercomputer. These programs were developed for high-performance computation arising from inertial confinement fusion, material science, and high-power microwave. It is shown that four programs achieve a parallel efficiency of over 60% on 42,000 cores and three achieve a parallel efficiency of over 45% on 84,000 cores.
YAN Shen-Gen , ZHANG Yun-Quan , LONG Guo-Ping , LI Yan
Abstract:The reduction algorithm has a wide range of applications in areas such as scientific computing and image processing. This paper systematically studies cross-platform performance optimization of the reduction algorithm on GPUs based on the OpenCL framework. Previous research has generally focused on a single hardware architecture; this paper, based on OpenCL, studies various optimization methods, such as using vectors, handling on-chip memory bank conflicts, thread organization, and instruction selection. Taking the minMax function as an example, the research elaborates on how each optimization method improves performance and explains the reasons. The study tests the algorithm on both AMD and NVIDIA GPU platforms. The test results show that the optimized algorithm achieves good performance on both platforms. On the AMD ATI Radeon HD 5850 platform, bandwidth utilization for Int and Float data reaches up to 89%. On the NVIDIA Tesla C2050 GPU platform, the performance reaches 1.3 to 1.9 times that of the corresponding CUDA function.
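The structure of a minMax reduction can be shown with a minimal sequential sketch of the tree-shaped pattern that GPU reductions typically follow. It does not reproduce the paper's OpenCL kernels or any of the listed optimizations; the function name `minmax_reduce` is hypothetical, and the inner loop stands in for work that separate work-items would perform in parallel.

```python
def minmax_reduce(values):
    """Tree reduction returning (min, max), halving the active range each step."""
    if not values:
        raise ValueError("values must be non-empty")
    # Each slot holds a (min, max) pair so both results are reduced in one pass.
    pairs = [(v, v) for v in values]
    n = len(pairs)
    while n > 1:
        half = (n + 1) // 2
        # On a GPU, each index i would be handled by one work-item in parallel.
        for i in range(n // 2):
            lo_a, hi_a = pairs[i]
            lo_b, hi_b = pairs[i + half]
            pairs[i] = (min(lo_a, lo_b), max(hi_a, hi_b))
        n = half
    return pairs[0]
```

Each pass halves the number of active slots, so the reduction finishes in a logarithmic number of steps when the inner loop runs in parallel.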
LI Chao , ZHANG Yun-Quan , ZHENG Chang-Wen , HU Xiao-Hui
Abstract:The intensity model with blur effect is widely employed to accurately simulate the imaging process of star simulators used for attitude determination and guiding feedback. It places great demands on computing power for realistic domains, and modern Graphics Processing Units (GPUs) have proven to be powerful accelerators for this kind of computationally intensive simulation. This paper presents a parallel design and implementation of the intensity model applied to large-scale star simulators on GPUs using the compute unified device architecture (CUDA) programming model. The study analyzes the double parallel nature inherent in this model and uses the CUDA framework to efficiently exploit the potential fine-grain data parallelism. Two versions of the simulator are designed and studied: one is a sequential simulator used as the baseline, and the other is a parallel simulator using CUDA. At the parallel strategy, model, and GPU implementation levels, the study employs specific optimization strategies to improve the parallel performance. Finally, two benchmarks corresponding to the double parallelism are developed to fully evaluate the performance behavior of the simulators. The result analysis demonstrates the efficiency of the CUDA simulator and also illustrates the restrictions and bottlenecks present in it.
HU Kai , CHEN Lu-Jia , WANG Zhe , JIANG Shu
Abstract:The design of the interconnection network of large-scale parallel computer systems is significant to the execution efficiency of parallel programs. Currently, Petaflop supercomputers usually have more than ten thousand computing nodes, which poses new challenges to the performance of the interconnection network. However, most existing studies consider only simple network workload models, which differ in many ways from the workloads of real parallel applications. This paper presents complex workload models that are closer to practical network workloads. Then, based on the mathematical model of the interconnection network from an earlier study, the study uses a flit-level network simulator to support the analysis of those complex workload models. Finally, through a large number of experiments, the performance of the Torus and Fat Tree network topologies is compared under different workload models. Meanwhile, the mean message latency of the 3D FFT parallel algorithm using 2D partitioning is simulated. The results and related analysis support the design of large-scale interconnection networks, as well as the optimization of parallel programs.
WU Hong , ZHAI Yan , ZHAI Ji-Dong
Abstract:The MPI-based MPMD program model is quite complex, comprising several SPMD programs and their coupler. The MPMD model is quite popular in the climate domain, so it is of practical value for developers to understand its characteristics. This work focuses on the performance of the coupler module of the MPMD program CCSM3.0 to locate possible load-imbalance problems among the subprograms. The load balance issue of a complex MPMD program is reduced to the issues of a set of SPMD programs and their interactions, giving developers and performance analysts a clear view for optimizing the program.
LIU Yong-Peng , ZHU Hong , LU Kai , CHI Wan-Qing , LIU Yong-Yan
Abstract:Power consumption is a huge challenge for large-scale systems, and power capping is an important goal of power management. The power overspending accumulative ratio ΔP×T is introduced as the metric for power capping. For large-scale systems, the NInO model is proposed to control the power consumption of clusters. Two example algorithms, NInO-P and NInO-ΔP, are designed based on NInO. Finally, power capping with NInO is verified in experiments.
ZHANG Xian-Yi , WANG Qian , ZHANG Yun-Quan
Abstract:BLAS is a fundamental math library in scientific computing. Thus, each CPU vendor releases an optimized BLAS library for its own CPUs. The Loongson CPU series is developed by the Institute of Computing Technology, Chinese Academy of Sciences, which released the Loongson 3 CPU series in 2010. This paper introduces the open source BLAS library OpenBLAS, which is forked from the GotoBLAS 2-1.13 BSD version. The BLAS Level 3 functions of OpenBLAS are optimized for the Loongson 3A quad-core CPU. In sequential optimization, blocking, hand-coded assembly kernels, Loongson 3A special instructions, and instruction reordering are used. The performance of the BLAS Level 3 subroutines exceeded GotoBLAS and ATLAS by about 75% and 17%. Meanwhile, it exceeded GotoBLAS and ATLAS by about 103% and 36% in double precision functions. In parallel multi-threaded optimization, this study used an interleaved data buffer layout to avoid shared L2 cache conflicts among threads. OpenBLAS achieved a speedup of 3.47 on four cores. With 4 threads, the performance of the OpenBLAS BLAS Level 3 functions exceeded GotoBLAS and ATLAS by about 69% and 34%, and by about 89% and 55% in double precision functions.
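The blocking technique mentioned among the sequential optimizations can be illustrated with a minimal pure-Python sketch of a blocked matrix multiplication. It shows only the loop structure; OpenBLAS itself relies on hand-coded assembly kernels, and the function name `blocked_matmul` and the block size are hypothetical.

```python
def blocked_matmul(a, b, block=2):
    """Multiply matrices given as lists of rows, one block at a time."""
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, m, block):
            for p0 in range(0, k, block):
                # Multiply one block of A by one block of B and accumulate
                # into the matching block of C.
                for i in range(i0, min(i0 + block, n)):
                    for p in range(p0, min(p0 + block, k)):
                        a_ip = a[i][p]
                        for j in range(j0, min(j0 + block, m)):
                            c[i][j] += a_ip * b[p][j]
    return c
```

In an optimized library the block size is chosen so that the working blocks fit in cache, which is what makes the reordered loops pay off.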