NIE Shi-Qiang , WU Wei-Guo , ZHANG Xing-Jun , CAI Yi , XU Zhi-Wei
2017, 28(8):1929-1939. DOI: 10.13328/j.cnki.jos.005200 CSTR:
Abstract:Efficient and scalable object management is a major factor of the overall performance in the object-based storage system. This paper proposes an algorithm MJHAR (matrix-based jump hash algorithm for replication data) that can be applied to the heterogeneous storage system to support variable levels of the object replication. The algorithm runs very fast and moves more minimizing data when storage nodes changes. Experimental results show that the running time of MJHAR is less than that of the consistent hash by 40% and that of jump hash by 23%. The object distribution of MJHAR is more uniform than that of consistent hash algorithm.
DING Shang , TONG Xin , CHEN Yan , YE Bao-Liu
2017, 28(8):1940-1951. DOI: 10.13328/j.cnki.jos.005204 CSTR:
Abstract:In order to ensure the data reliability, fault-tolerant distributed storage systems usually apply some data redundant strategies such as the multi-replicas strategy and the erasure coding strategy. Compared with the multi-replicas strategy, the erasure coding strategy has an advantage of storing less data while costs much more network traffic when repairing a failed node. To reduce the repair network cost, the regenerating code and locally repairable codes were proposed, and simple regenerating code is a representative of the locally repairable codes. However, most of the current fault-tolerant distributed storage methods based on coding strategy assume that the storage nodes are located in the star shaped logic network, ignoring the physical network topology and link bandwidth capacity. So with applying the physical network topology into the repair process of erasure code and regenerating code, some related researches propose methods of building tree shaped repair paths to get further more efficiency for repairing a failed node. But because of the difference in encoding and repairing process, those methods are not fit for the repair of simple regenerating code. To resolve the problem, this paper introduces the link bandwidth capacity into the repair process of simple regenerating code based on the physical network topology. It builds a bandwidth-aware node repair analysis model, and proposes an algorithm to build parallel repair trees based on the optimal bottleneck path and optimal repair tree methods. Experiments are also designed to evaluate the performance of the algorithm. The result shows that compared with the method based on star shaped network topology, the proposed algorithm reduces the latency of repair process effectively.
SHE Chu-Yu , WEN Wu-Shao , XIAO Yang , LIU Yu-Bo , JIA Yin
2017, 28(8):1952-1967. DOI: 10.13328/j.cnki.jos.005199 CSTR:
Abstract:With the advent of big data era, global storage is experiencing an explosive growth. Traditional storage systems have several drawbacks in storage performance, storage capacity, data security and device cost. In order to handle large amount of data, the storage technology for cloud computing platform has undergone rapid development in the recent years, and become an important tool to deal with big data. This paper analyzes the shortcomings of metadata management of certain distributed file system and proposes an adaptive metadata load balancing mechanism. First, a real-time server performance evaluation model is introduced. Next, a period adaptive mechanism based on change of server load is built. Finally, an adaptive load balancing algorithm based on server performance is proposed. Experimental results demonstrate the practicability, availability and stability of the new mechanism.
LIU Xing , JIANG Song , WANG Yang , FAN Xiao-Peng , XU Cheng-Zhong
2017, 28(8):1968-1981. DOI: 10.13328/j.cnki.jos.005202 CSTR:
Abstract:Small and synchronous writes are pervasive in various environments and manifest in various levels of software stack, ranging from device drivers to application software. Given a block interface, small write can cause serious write amplifications, which can substantially degrade the overall I/O performance. To address this issue, this paper presents a block I/O scheduler, named Hitchhike. Hitchhike is able to identify small writes, and embed them into other data blocks through data compression. With Hitchhike, a writeback buffer for small synchronous writes can be enabled, not only removing the write amplification, but also dramatically improving the performance of small synchronous writes. Hitchhike is implemented based on the Deadline I/O schedulers in Linux 2.6.32, and evaluated by running Filebench benchmark. Testing results show that compared to traditional approaches, Hitchhike can significantly improve the performance of synchronous small writes up to 48.6%.
TANG Zhen , WU Heng , WANG Wei , WEI Jun , HUANG Tao
2017, 28(8):1982-1998. DOI: 10.13328/j.cnki.jos.005201 CSTR:
Abstract:As a new type of storage media, solid state drive (SSD) is widely used in virtualization environment. SSD is usually used as the read and write cache of the virtual machine (VM) storage to improve the disk I/O performance of the VMs. Existing SSD caching schemes mostly focus on the capacity planning of the SSD cache and use metrics such as cache hit rate to evaluate the effect of SSD cache allocation. Since they do not consider the limitations of service capabilities of SSD, which may lead to the contention of cache resource and the performance degradation and violations among VMs, they are not suitable to be used with some typical distributed applications. This article proposes a self-adaptive SSD caching system for multiobjective optimization in virtualization environment, to reduce the resource contention, and take the limitations of service capabilities of SSD into consideration. With the help of closed loop adaption, it can dynamically detect the status of VMs and applications. Moreover, it continuously detects the contention of SSD cache, generates the migration plan using clustering algorithm and decides the timing and order of the VM migrations according to the capabilities of SSD as well as the characteristics and requirements of applications. The evaluation shows that when facing the scenarios of using some typical distributed applications, the contention of SSD cache resource is reduced and the requirements of applications are considered, which lead to the improvement of performance and reliability of applications. For Hadoop applications, the execution time of jobs is reduced by 25% on average, and the throughput for I/O sensitive applications is improved by 39%. For ZooKeeper applications, the service outage caused by the single point of fault of the hypervisor can be handled at the cost of less than 5% of performance degradation.
CAO Shun-De , HUA Yu , FENG Dan , SUN Yuan-Yuan , ZUO Peng-Fei
2017, 28(8):1999-2009. DOI: 10.13328/j.cnki.jos.005203 CSTR:
Abstract:This paper presents a high-performance distributed storage solution for video surveillance data via analyzing the characteristics of video surveillance data and conventional file storage solutions. This design proposes a logical volume structure rather than file-based storage to efficiently organize unstructured video stream data. The scheme hence directly writes these stream data into raw disk devices, to address the problem of storage performance decrease caused by the random access and disk fragmentation in traditional storage systems. A two-stage index strategy is also implemented to manage metadata by the state manager and storage servers, which significantly reduces the amount of metadata managed by the state manager, eliminates the performance bottlenecks, and provides the second-level video retrieval accuracy. Moreover, the design has the salient features of fault tolerance and linear scaling abilities with the help of the flexible storage server grouping policy and the mutual backup relationship in a storage group. Experimental results show that the solution can simultaneously record 400 ways of 1080P video streams with a single low-cost PC server, and the system's average write speed is 2.5 times faster compared with the local file systems.
QIN Sheng-Chao , XU Zhi-Wu , MING Zhong
2017, 28(8):2010-2025. DOI: 10.13328/j.cnki.jos.005272 CSTR:
Abstract:Since 1960s, automated program verification has been considered as an extremely difficult topic in Computer Science, despite of the emergence of Floyd-Hoare logic. Since 1990s, with more efforts and resources devoted to this area, especially the contributions from industry communities including Microsoft Research and IBM Research Centre, there has been noticeable progress made in program verification early this century. Two typical examples include ASTRÉE, used to verify the absence of runtime errors in Airbus code, and SLAM, employed to verify protocol properties of procedural calls in device drivers by Microsoft. However, none of these tools considers heap. Specifically, ASTRÉE assumes no dynamic pointer allocation and no recursion, while SLAM assumes memory safety. Many important programs/systems that exist nowadays, such as Linux, Apache, and device drivers, all make frequent use of heap. Automated verification of heap-manipulating programs remains a very challenging topic. The emergence of separation logic in 2001-2002 has shed some light into this area. With the key idea of separation and the elegant frame rule, local reasoning can now be readily employed in program verification. Since 2004, there have been a large body of research work dedicated to automated program verification via separation logic, e.g. SpaceInvader/Abductor, Slayer, HIP/SLEEK, CSL etc. This paper offers a survey on a number of important research work along this line.
PENG Guo-Jun , LIANG Yu , ZHANG Huan-Guo , FU Jian-Ming
2017, 28(8):2026-2045. DOI: 10.13328/j.cnki.jos.005270 CSTR:
Abstract:Within the current commercial system achitecture and software ecosystem, code reuse techniques, such as ROP (return-oriented programming), are widely adopted to exploit memory vulnerabilities. Driven by the serve situation of cyberspace security, academical and industrial communities have carried out a great amount of research on binary code reuse from both defensive and offsensive perspevtives. This paper discusses the essence and basics of binary code reuse, along with an analysis of its technique roadmap and typical attack vectors. Corresponding defences and mitigations based on control flow integrity and memory randomization are analyzed as well. Dissections on CET (control flow enforcement technology) and CFG (control flow guard), two latest industrial techniques for binary code reuse mitigation, are presented. The future of binary code reuse, including protential attack vectors and possible mitigation strategies, is also discussed at the end of this paper.
GAO Wei , LI Ying-Ying , SUN Hui-Hui , LI Yan-Bing , ZHAO Rong-Cai
2017, 28(8):2046-2063. DOI: 10.13328/j.cnki.jos.005121 CSTR:
Abstract:SIMD extension is an acceleration component integrated into the general processor for developing data level parallelism in multimedia and scientific computing applications. Control dependence hinders exploiting data level parallelism in the programs. Current vectorization methods in the presence of control flow, loop-based or SLP, all need if-conversion. They do not consider the SIMD parallelism in programs, and thus result in a lower SIMD executing efficiency. Another factor that leads to low efficiency is the lack of cost model to direct SIMD code generation in the presence of control flow. To address these problems, an improved control flow SIMD vectorization method is proposed. First, loop distribution with control dependence is put up to separate the vectorizable parts and unvectorizable parts, taking data locality into account simultaneously. Second, a direct vectorization method of control flow is presented considering the reuse of data between basic blocks. Finally, cost model is used to guide the generation of select and BOSCCs to improve the efficiency of SIMD code. Experimental results show that the code performance generated by the improved methods increase by 24% compared with those of the existing control flow vectorization methods.
WU Ze-Zhi , CHEN Xing-Yuan , YANG Zhi , DU Xue-Hui
2017, 28(8):2064-2079. DOI: 10.13328/j.cnki.jos.005126 CSTR:
Abstract:Despite the demonstrated usefulness of dynamic taint tracking techniques in mobile privacy security, poor performance attained by prototypes is a big problem. A novel optimization methodology for dynamic taint tracking based on just-in-time compilation is presented. First, the taint propagation logic is separated from the program logic precisely to simplifying the complexity of the taint propagation analysis. Then, a taint propagation framework is proposed and the correctness of the taint propagation analysis is proved..Finally, redundant and inefficient taint propagation codes are transferred to efficient and equivalent codes by adopting the methods of eliminating, replacing and moving. Experimental results show that 38% of memory usage and the time of execution of taint tracking instructions are saved for every single hot trace, and on average the performance of dynamic taint tracking system is improved 6.8%.
LIU Jie , HUANG Jin , TIAN Feng , HU Wei-Ping , DAI Guo-Zhong , WANG Hong-An
2017, 28(8):2080-2095. DOI: 10.13328/j.cnki.jos.005123 CSTR:
Abstract:This article analyzes the application and existing problems of touch interaction for mobile handsets and wearable device. Based on time continuity and space continuity of interactive action, a hybrid gesture input method with characteristic and advantages of air gesture and touch gestures is proposed by combining the contact trajectory and in-air trajectory of the interactive action. Based on the concept of continuous interaction space, hybrid interaction gestures, air gestures, and surface touch gestures are unified to establish a continuous interaction space hierarchical processing model including the air layer, surface layer and hybrid layer. The definition of information data and the transformation process are then provided. The general gesture recognition framework is also constructed. Furthermore, the trajectory segmentation method and gestures classification methods are introduced in detail. Finally, an application instance is designed, and the availability of the hybrid interaction gestures and feasibility of the continuous interaction space hierarchical processing model are verified. Experimental results show that the presented hybrid gesture interaction model possesses advantages of both air gesture and touch gestures, and has good recognition efficiency and better spatial freedom.
NIU Dang-Dang , LIU Lei , LÜ Shuai
2017, 28(8):2096-2112. DOI: 10.13328/j.cnki.jos.005125 CSTR:
Abstract:HER (hyper extension rule) is an expansion of ER (extension rule). The results of computing the union and difference set of two sets of maximum terms which are extended by two clauses respectively based on the hyper extension rule can be saved as EPCCL (each pair of clauses contains complementary literals) theories. In this paper, a parallelizable knowledge compilation algorithm for EPCCL theories, IKCHER, is proposed based on computing the intersection of maximum terms sets. The focus is placed on the research of parallel computing the intersection sets of multiple EPCCL theories based on HER, resulting in the design of the corresponding algorithm, PIAE (parallel intersection of any number of EPCCL). Through using the origin CNF formulae of EPCCL theories, another efficient merging algorithm imp-PIAE is proposed. Based on above methods two parallel knowledge compilation algorithms P-IKCHER and impP-IKCHER are constructed for EPCCL theories, utilizing the PIAE algorithm and imp-PIAE algorithm to merge multiple EPCCL theories respectively. Experimental results show that IKCHER algorithm has the best compilation quality of all EPCCL compilation algorithms. P-IKCHER algorithm does not improve the compilation quality and compilation efficiency of IKCHER while impP-IKCHER algorithms can maintain the compilation quality of IKCHER, and improve the compilation efficiency of IKCHER. When the number of CPU cores is 4, the compilation efficiency of IKCHER can be improved twice as much in the best cases.
LIU Bo , JING Li-Ping , YU Jian
2017, 28(8):2113-2125. DOI: 10.13328/j.cnki.jos.005135 CSTR:
Abstract:With the development of video acquisition and transmission technologies, and the widespread applications of mobile terminal devices, more and more set-based images are available. The key issue of image set classification is how to measure the distance between two sets over the complexity of inner structure of the set. To address this problem, this paper presents a framework, called double sparse regularizations for image set distance learning (DSRID). In DSRID, the distance between two sets is calculated by the distance between two prominent sub-structures in each set, which enhances the robustness and discrimination of the measure. According to different set representations, this framework is implemented in traditional Euclidean space and two common manifolds, i.e., symmetric positive definite matrices manifold (SPD manifold) and Grassmann manifold. Extensive experiments demonstrate the effectiveness of the proposed method on set-based face recognition, action recognition and object categorization.
KANG Yan-Li , LI Feng , WANG Lei
2017, 28(8):2126-2147. DOI: 10.13328/j.cnki.jos.005107 CSTR:
Abstract:Analytical query is an important way to get value from big data in data warehouse. With the growth of data, the same query needs to be executed periodically, which inevitably introduces redundant calculation on historical data. One type of incremental optimization technology reduces redundant calculation by reusing intermediate results of historical data. However it has following problems:1) it isn't transparent for user; 2) choice of historical result storing/reusing position is not intelligent; and 3) optimization gains is limited. This article designs an incremental optimization method, which is guided by the semantic rules. This method focuses on both user transparency and optimization gains, and extends grammar to support incremental description. Historical result storing/reusing location is firstly chosen by operators' operational semantics and output semantics. Positions are then adjusted according to cost model and physical task's division positions. At last, optimized tasks-DAG is generated with the ability to run in a distributed computing framework (such as MapReduce) periodically. This paper implements a prototype, called HiveInc, based on Apache Hive. Experimental results on TPC-H show that, compared to non-optimization, HiveInc can obtain average 2.93 speed-up and highest 5.78 speed-up. Compared to classical optimization techniques, IncMR and DryadInc, speed-up of 1.69 and 1.61 can be obtained respectively.
YU Fei , LI Zhi-Jun , CHE Nan , JIANG Shou-Xu
2017, 28(8):2148-2160. DOI: 10.13328/j.cnki.jos.005118 CSTR:
Abstract:Along with the development of online social networks, friend recommendation becomes the favor of the major social networks. It can help people to meet new friends for expanding the scale of social network, which in turn allows people to receive more information from their friends. Therefore, friend recommendation should be focused on expanding the scale of social network and obtaining information form recommended friends. However, existing friend recommendation methods barely consider the people information need, and they are mainly based on the simple and limited user profiles, and are agnostic to users' offline behaviors in the real world. Because human activity in the physical world has a spatial locality, the recommended friends through the existing recommendation methods are limited in geographic space which the target user know. As a result, the recommendation cannot provide more new information on geography to meet the target's need on information. This paper first proposes a new friend recommendation method based on the offline check-in behaviors in the real world instead of the online user profiles, and mines check-in behavior similarity between users in the real world. The essential goal of friend recommendation is to provide users with more new information. In order to meet the requirements of user getting more geographical location information, the recommendation systems can recommend the strangers in broader check-in geography distribution for the target users. Meanwhile, when the recommended friends and the target users have the similar check-in behaviors, it is more probable for the users to accept recommended strangers. Kernel density estimation (KDE) is used to estimate each user's check-in behavior probability distribution and the time entropy to filtering some noise that have side effects on overall check-in behavior similarity, then the recommended strangers who can bring a wider range of new strangers geographic information for the target users are selected. Lastly, a large-scale user check-in data-set of Foursquare is used to validate recommendation precision and the degree of information expanding of this approach. The experimental results show that the proposed approach outperforms the existing friend recommendation methods on the aspect of the information expanding degree while maintaining the recommendation precision of the state-of-the-art stranger recommendation methods.
ZHANG Ping , WANG Li-Wei , PENG Zhi-Yong , YUE Kun , HUANG Hao
2017, 28(8):2161-2174. DOI: 10.13328/j.cnki.jos.005120 CSTR:
Abstract:Influence maximization aims at finding a set of influential individuals (i.e. users, blog etc.) in a social network. Most of the existing work focused on the influence of individuals under the hypothesis that the influence relationship between the individuals is known in advance. Nonetheless, it is often the case that groups (i.e. area, crowd etc.) are only natural targets of initial convincing attempts in many real-world scenarios, such as billboards, television marketing and plague prevention. In this paper, the problem of locating the most influential groups in a network is addressed. (1) Based on the discovery of the group associations, GIC (group independent cascade) model is proposed to simulate the influence propagation process at the group granularity. (2) A greedy algorithm called CGIM (cascade group influence maximization) is introduced to determine the top-k influential groups under GIC model. Experimental results on both synthetic and real datasets verify the effectiveness and efficiency of the presented method.
JIANG Tao , LI Zhan-Huai , SHANG Xue-Qun , CHEN Bo-Lin , LI Wei-Bang , YIN Zhi-Lei
2017, 28(8):2175-2195. DOI: 10.13328/j.cnki.jos.005124 CSTR:
Abstract:The advances of microarray technology have made large amount of gene expression data available from a variety of different experimental conditions. Analyzing the microarray data plays a key role in understanding gene functions, gene regulation and cellular process. Order-Preserving Submatrix (OPSM) is an important model in microarray data analysis, which captures the identical tendency of gene expressions across a subset of conditions. In the process of analyzing mechanism of gene expression, OPSM search undoubtedly saves the time and effort of biologists. However, OPSM retrieval mainly depends on keyword search, resulting a weak control on the obtained clusters. Typically, the analyst can determine the ad-hoc parameters which are far from the declarative specification of desired properties on operation and concept. Motivated by obtaining much more accurate query relevancy, this paper proposes two types of OPSM indexing and constrained query methods based on signature and Trie. Extensive experiments conducted on real datasets demonstrate the proposed methods have better behaviors than the state-of-the-art methods on efficiency and effectiveness.
LU Fei-Fei , XIE Xiang-Hui , GUO De-Ke , ZHU Gui-Ming
2017, 28(8):2196-2213. DOI: 10.13328/j.cnki.jos.005119 CSTR:
Abstract:The modular data center networks comprise inter-and intra-module networks. The structure and routing of inter-module networks take charge of organizing modules, and communicating among servers in different modules. Thus, it is the most important issue to design an inter-module structure with high bandwidth, high fault-tolerance, and flexible scalability to build the mega-modular data center. In this paper, an inter-module network structure called MDKautz is proposed for building the mega-modular data center. MDKautz uses Kautz as the basic interconnection topology among modules, and builds the mega-modular data center network with high bandwidth, high fault tolerance and good scalability without the need of any additional high end switches. The method for how to constructing, routing and expanding in MDKautz are analyzed. Mathematical analysis and experiment results show that MDKautz has a good topology characteristic and communication performance, and can effectively support the typical applications with high bandwidth and high fault tolerance in data center.
LU Ji-Yuan , LIU Yu-Xi , HOU Fang , HUANG Cheng-Hui , CHAO Hong-Yang
2017, 28(8):2214-2226. DOI: 10.13328/j.cnki.jos.005122 CSTR:
Abstract:Separately optimizing different video coding technologies simplifies the implementation of coding algorithms, but it results in slowdown by not taking their internal relation into consideration. An integrated algorithm is proposed to jointly improve the speed of fractional pixel interpolation (FPI) and fractional pixel motion estimation (FPME) for HEVC. Different fractional pixels are distinguished by their interpolation filters to construct a FPME algorithm with the knowledge of their interpolation cost. The FPME algorithm produces a cost effective order for refinement search which checks fractional search positions (SPs) by not only maximizing the expectation for the R-D gains but also minimizing the computational cost at the same time. On the other hand, an on-demand FPI algorithm for HEVC is also proposed to save redundant interpolation as well as reduce duplicate calculation. Experimental results show that the introduced algorithm significantly improves the overall speed of FPME and FPI with negligible coding lost.