• Volume 29, Issue 12, 2018: Table of Contents
    • O(1.890^n) Exact Algorithm for a Class of Separable SAT Problems

      2018, 29(12):3595-3603. DOI: 10.13328/j.cnki.jos.005378


      Abstract: The Boolean satisfiability problem (SAT) asks whether there exists a truth assignment that satisfies a given Boolean formula. It was the first problem proven NP-complete, and therefore admits no polynomial-time algorithm unless P=NP. Nevertheless, such problems arise in many practical applications, which need effective algorithms with reduced time complexity. To date, research has concentrated on SAT with clause length at most k (k-SAT). From global search to local search, a large number of effective algorithms, both randomized and deterministic, have been developed; the best known bounds for k-SAT are below O((2-2/k)^n), and for k=3 the best algorithm runs in O(1.308^n) time. However, there is little literature on SAT problems more general than k-SAT. This paper studies a class of separable satisfiability problems (SSAT), in particular 3-regular separable satisfiability (3-RSSAT), in which the formula can be separated into several subformulas according to certain rules. The paper proves that 3-RSSAT is NP-complete, since any SAT problem can be polynomially reduced to it. To decide the 3-regular separability of a general SAT problem, an algorithm with time complexity no more than O(1.890^n) is given. Then, using results from research on optimal matrix multiplication, an O(1.890^n) exact algorithm is constructed for the 3-RSSAT problem; it is a WELL algorithm whose complexity is independent of clause length.
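
      As a point of reference for the exponential bounds quoted above, the brute-force 2^n baseline that any O(c^n) result with c < 2 improves on can be sketched as follows (an illustrative enumeration, not the paper's algorithm; clauses use the usual signed-integer literal convention):

```python
from itertools import product

def is_satisfiable(clauses, n_vars):
    """Exhaustively test all 2^n truth assignments of a CNF formula.

    `clauses` is a list of clauses; each clause is a list of non-zero
    integers, where literal k means variable k and -k its negation
    (variables are numbered 1..n_vars).
    """
    for bits in product([False, True], repeat=n_vars):
        assign = lambda lit: bits[abs(lit) - 1] if lit > 0 else not bits[abs(lit) - 1]
        if all(any(assign(lit) for lit in clause) for clause in clauses):
            return True
    return False

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(is_satisfiable([[1, 2], [-1, 3], [-2, -3]], 3))  # True
```

      The exponential bases in the abstract measure exactly how much of this 2^n search space an algorithm manages to prune.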

    • Research on Large Integer Multiplication Schönhage-Strassen Algorithm's Multi-Core Parallelization

      2018, 29(12):3604-3613. DOI: 10.13328/j.cnki.jos.005308


      Abstract: The Schönhage-Strassen algorithm (SSA), based on the number-theoretic transform, is one of the fastest large-integer multiplication algorithms in practical use today. This paper first introduces the principle of SSA in detail. Then, fine-grained parallel optimization is applied to SSA on a multi-core platform. The parallel SSA is implemented on top of GMP, the open-source large-integer arithmetic library, and its correctness and performance are validated on the Intel x86 platform: with 8 threads, the maximum speedup reaches 6.59 and the average speedup is 6.41. The scalability of the parallel SSA is also tested on the Inspur TS850, and experimental results show good scalability, with a maximum speedup of 21.42.
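
      The core idea behind SSA, that multiplying large integers amounts to convolving their digit blocks and then propagating carries, can be illustrated with a naive convolution (a toy sketch only: real SSA evaluates the convolution with a number-theoretic transform in O(n log n log log n), and GMP's implementation is far more involved):

```python
def to_digits(x, base):
    """Little-endian digit blocks of a non-negative integer."""
    d = []
    while x:
        d.append(x % base)
        x //= base
    return d or [0]

def multiply_by_convolution(a, b, base=10):
    """Schoolbook multiplication expressed as an acyclic convolution
    of digit blocks followed by carry propagation."""
    da, db = to_digits(a, base), to_digits(b, base)
    conv = [0] * (len(da) + len(db) - 1)
    for i, x in enumerate(da):
        for j, y in enumerate(db):
            conv[i + j] += x * y          # convolution, no carries yet
    result, carry = 0, 0
    for k, c in enumerate(conv):
        carry += c
        result += (carry % base) * base ** k
        carry //= base
    return result + carry * base ** len(conv)

print(multiply_by_convolution(12345, 6789))  # 83810205
```

      SSA's speed comes entirely from replacing the O(n^2) double loop above with a fast transform; the surrounding block-split and carry steps are unchanged.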

    • SLA-Aware Fault-Tolerant Approach for Transactional Composite Service

      2018, 29(12):3614-3634. DOI: 10.13328/j.cnki.jos.005313


      Abstract: To address the facts that the fault-tolerant logic of composite services is typically entangled with their execution logic and that service level agreement (SLA) violations occur frequently, this article proposes an SLA-aware fault-tolerant approach for transactional composite services. First, a finite-state machine is adopted to model the execution process of the composite service and monitor its execution status. Second, monitoring automata are employed to track SLA attributes during execution so as to avoid SLA violations. Third, an improved differential evolution algorithm is used to quickly determine the optimal recovery plan for the compensation process. Finally, a process is given to illustrate that, since the approach is isolated from the execution logic of the composite service, it is easy to develop, maintain, and update. Experimental results on real data sets show that the proposed approach outperforms alternatives in both fault-handling time and composition optimization, and that it copes with faults of different scales.
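
      A finite-state machine for monitoring execution status, as described above, might look like the following sketch (the states and events here are hypothetical; the paper's FSM and its SLA monitoring automata are richer):

```python
# Hypothetical life cycle of one service inside a transactional
# composition: a transition table keyed by (state, event).
TRANSITIONS = {
    ("ready",   "invoke"):     "running",
    ("running", "succeed"):    "completed",
    ("running", "fail"):       "faulted",
    ("faulted", "retry"):      "running",
    ("faulted", "compensate"): "compensated",
}

def run(events, state="ready"):
    """Replay an event trace and return the final monitored state,
    rejecting transitions the model does not allow."""
    for ev in events:
        if (state, ev) not in TRANSITIONS:
            raise ValueError(f"illegal event {ev!r} in state {state!r}")
        state = TRANSITIONS[(state, ev)]
    return state

print(run(["invoke", "fail", "retry", "succeed"]))  # completed
```

      Keeping the table separate from the code that fires the events is a small-scale version of the separation of fault-tolerant logic from execution logic that the paper argues for.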

    • Research on Reconfigurable Security System Modeling and Configuration Generation

      2018, 29(12):3635-3647. DOI: 10.13328/j.cnki.jos.005324


      Abstract: Reconfigurable security systems, with their high flexibility, adaptability, and scalability, are a hot topic in security research. Current research on reconfiguration mechanisms is mainly based on static reconfiguration; as an active security defense, however, such systems should support dynamic, automatic reconfiguration. To model and describe dynamically and automatically reconfigurable security systems, this paper proposes a new model, SSPE, based on intuitionistic logic, and presents its syntax and inference rules. Transformation rules from the specifications of reconfigurable security components to SSPE logic expressions are obtained through a mapping relationship. Finally, the paper describes the reasoning and generation process of a security system reconfiguration based on the IPSec protocol. This modeling and expression method based on intuitionistic logic provides new ideas and methods for research on reconfigurable security systems.

    • Towards Crowd Worker Selection for Crowdsourced Testing Task

      2018, 29(12):3648-3664. DOI: 10.13328/j.cnki.jos.005329


      Abstract: Crowdsourced testing is an emerging trend in software testing that relies on crowd workers to accomplish test tasks. Who performs a test task is therefore extremely important for detecting bugs and covering the key points of test requirements. There are many candidate crowd workers with varying testing experience, and for lack of cooperation they may produce duplicate test reports for the same task. Because crowd workers participate in test tasks freely, high-quality testing, in terms of bug detection and coverage of the key points of test requirements, is not guaranteed. Selecting an appropriate subset of workers to perform a test task has thus become an important problem. In this paper, three motivating studies are first conducted to investigate which aspects of workers matter for detecting bugs and covering the key points of test requirements. Three aspects are identified: initiative, relevance, and diversity. A novel approach is then proposed for selecting workers that considers all three aspects. The approach is evaluated on 46 real test tasks from Baidu CrowdTest, and the experimental results show its effectiveness.
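
      A greedy selection over aspects like those identified above could be sketched as follows (the scoring function, weights, and worker data are illustrative assumptions, not the paper's actual model):

```python
def select_workers(workers, k, alpha=0.5):
    """Greedy worker selection: each step picks the worker maximizing
    a hypothetical score = alpha * relevance + (1 - alpha) * novelty,
    where novelty is the fraction of the worker's capability tags not
    yet covered by the chosen set (a stand-in for diversity).
    `workers` maps name -> (relevance in [0, 1], set of capability tags)."""
    chosen, covered = [], set()
    while len(chosen) < min(k, len(workers)):
        def score(name):
            rel, caps = workers[name]
            novelty = len(caps - covered) / max(len(caps), 1)
            return alpha * rel + (1 - alpha) * novelty
        best = max((w for w in workers if w not in chosen), key=score)
        chosen.append(best)
        covered |= workers[best][1]
    return chosen

# Hypothetical candidate pool: (relevance, capability tags).
workers = {
    "w1": (0.9, {"android", "ui"}),
    "w2": (0.8, {"android", "network"}),
    "w3": (0.5, {"ios", "ui"}),
}
print(select_workers(workers, 2))  # ['w1', 'w2']
```

      The interplay is visible even at this scale: once w1's tags are covered, the remaining slots favor workers who add something new rather than the next-most-relevant duplicate.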

    • Tabu Search in Covering Array Generation

      2018, 29(12):3665-3691. DOI: 10.13328/j.cnki.jos.005369


      Abstract: Combinatorial testing can effectively detect faults caused by interactions among the parameters of the system under test. Over its 30 years of development, covering array generation has been one of the key research areas, with more than 200 relevant research articles. Existing tabu search algorithms are effective at generating covering arrays and have some advantage in array size, but there is still much room for improving solution quality and computation speed. Moreover, the practical applicability of the existing algorithms is poor, because they can neither handle constraints nor generate variable-strength covering arrays. To solve these problems, this paper proposes a new tabu search algorithm with three improvements. 1) The process of parameter tuning is divided into two stages, pair-wise and climbing, to ensure that the optimal configuration is found with a minimum number of configurations, further reducing the size of the covering arrays. 2) The algorithm is parallelized to improve speed. 3) Handling of constraints and variable strength is added so that the algorithm adapts to various test scenarios. Experimental results show that the proposed algorithm has an advantage in the size of various covering arrays, including fixed-strength covering arrays, variable-strength covering arrays, and covering arrays with constraints. Meanwhile, parallelization increases the average speed of the algorithm by about 2.6 times.
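
      A minimal tabu-search skeleton for covering array generation, where the fitness is the number of uncovered value pairs at strength 2, might look like this; the move rule, tabu tenure, and acceptance criterion are simplified assumptions, not the paper's tuned two-stage algorithm:

```python
import itertools
import random

def uncovered_pairs(array, v):
    """Number of (column pair, value pair) combinations still missing
    from a candidate strength-2 covering array over alphabet {0..v-1}."""
    k = len(array[0])
    missing = 0
    for c1, c2 in itertools.combinations(range(k), 2):
        seen = {(row[c1], row[c2]) for row in array}
        missing += v * v - len(seen)
    return missing

def tabu_search(array, v, iterations=3000, tenure=8, seed=0):
    """Toy tabu search: a move rewrites one cell, and a recently
    rewritten cell is tabu for `tenure` iterations.  Worsening moves
    are undone, so the current array always matches `best`."""
    rng = random.Random(seed)
    array = [row[:] for row in array]
    tabu = {}
    best = uncovered_pairs(array, v)
    for it in range(iterations):
        if best == 0:
            break                        # full pair coverage reached
        r = rng.randrange(len(array))
        c = rng.randrange(len(array[0]))
        if tabu.get((r, c), -1) >= it:   # cell is still tabu: skip move
            continue
        old = array[r][c]
        array[r][c] = rng.choice([x for x in range(v) if x != old])
        cost = uncovered_pairs(array, v)
        if cost <= best:                 # accept improving/sideways moves
            best = cost
            tabu[(r, c)] = it + tenure
        else:
            array[r][c] = old            # undo worsening moves
    return array, best

start = [[0, 0, 0] for _ in range(4)]    # 4 rows, 3 columns, binary values
solution, cost = tabu_search(start, v=2)
```

      The tabu list is what distinguishes this from plain hill climbing: freezing recently changed cells forces the search to explore instead of oscillating on the same cell.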

    • Method for User Interface Development with Perceptual Control Theory Based Task Analysis

      2018, 29(12):3692-3715. DOI: 10.13328/j.cnki.jos.005432


      Abstract: Nowadays, a number of methods for model-based user interface development (MBUID) have been applied to deal with the diversity of users, devices, environments, and development platforms in a pervasive computing environment. In general, these methods attempt to specify a user interface once, on an abstract level, and run it anywhere by means of model transformation. Due to the limitations of the task models used in current MBUID methods, however, it remains an open question whether the generated user interfaces can meet usability requirements across divergent application contexts. In this paper, a task-model-based user interface development framework is proposed for modeling and implementing effective, efficient, and satisfying user interfaces. To cope with usability requirements, a novel perceptual control theory based task analysis (PCTBTA) method is presented for specifying user tasks in a divergent environment: context information is introduced into the task analysis process, and the interaction content is reflected at a higher level of abstraction, providing the task space for usability design. For model transformation, a method is provided for converting the PCTBTA task model into a variety of interface models. Finally, a case study illustrates the feasibility of the proposed method, and its effectiveness is demonstrated by comparison with other methods in terms of usability and performance.

    • Time-Aware Web Service QoS Monitoring Approach Under Dynamic Environments

      2018, 29(12):3716-3732. DOI: 10.13328/j.cnki.jos.005488


      Abstract: Quality of service (QoS) is an important criterion for measuring the quality of Web services and an important factor when users choose among them. This paper proposes IgS-wBSRM, a dynamically weighted Web service QoS monitoring method based on information gain and a sliding window mechanism. IgS-wBSRM initializes the weights of the environmental factors with a certain amount of initial training samples, using the theory of information entropy and gain to characterize the chaotic state of the samples. It then reads the sample data stream in sequence, computes the information gain of each impact factor combination as each sample unit arrives, and updates the initial weights with TF-IDF in the dynamic environment. In this way, IgS-wBSRM corrects both the unbalanced classification between classes and the offline-constant weight problem of the traditional monitoring approach wBSRM. Moreover, considering the timeliness of the training samples, IgS-wBSRM applies a sliding window mechanism when updating the weight of each impact factor combination, eliminating the impact that accumulated historical data has on the recent service running state. Experimental results on a real-world Web service QoS data set demonstrate that with the sliding window mechanism, IgS-wBSRM effectively discards expired historical information; the dynamic weighting method combining the sliding window with information gain monitors QoS more accurately, and the overall monitoring effect is markedly better than that of existing QoS monitoring approaches.
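
      The interplay of information gain and a sliding window can be sketched as follows (a simplified illustration with a single impact factor; the TF-IDF weighting and the rest of the wBSRM machinery are omitted):

```python
import math
from collections import Counter, deque

def entropy(labels):
    """Shannon entropy of a non-empty label sequence, in bits."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(samples):
    """Information gain of an impact factor over monitoring outcomes;
    each sample is a (factor_value, outcome) pair."""
    outcomes = [o for _, o in samples]
    gain = entropy(outcomes)
    for v in {f for f, _ in samples}:
        subset = [o for f, o in samples if f == v]
        gain -= len(subset) / len(samples) * entropy(subset)
    return gain

class SlidingWindowMonitor:
    """Keep only the most recent `size` samples, so stale history
    cannot dominate the factor weight."""
    def __init__(self, size):
        self.window = deque(maxlen=size)
    def observe(self, factor, outcome):
        self.window.append((factor, outcome))
    def weight(self):
        return info_gain(list(self.window))
```

      With a window of 4, four samples in which the factor perfectly predicts the outcome give a weight of 1.0 bit; four subsequent samples in which it predicts nothing push the old evidence out of the window and drive the weight to 0.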

    • Method of Minimality-Checking of Candidate Solution for Minimal Hitting Set Algorithm

      2018, 29(12):3733-3746. DOI: 10.13328/j.cnki.jos.005311


      Abstract: The minimal hitting set (MHS) problem is an important problem with wide applications in artificial intelligence. When computing all minimal hitting sets, how to check the minimality of candidate hitting sets is a key issue. Most exhaustive MHS algorithms ensure the minimality of their results by subset checking, whose effectiveness depends mainly on the number of minimal hitting sets: the more MHSs there are, the less effective the method is. Consequently, these algorithms perform poorly on large instances. To overcome this problem, this paper proposes independent coverage checking (ICC) and develops it into an incremental method named IICC, whose effectiveness is independent of the number of minimal hitting sets. Moreover, when computing all MHSs with an incremental MHS algorithm using IICC, the minimality of candidate solutions can be tested incrementally, pruning many unnecessary non-minimal hitting sets. IICC is used to optimize the Boolean algorithm, an incremental MHS algorithm that uses subset checking. The results show that IICC significantly outperforms subset checking in terms of running time.
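
      The contrast between subset checking and a coverage-style minimality test can be sketched as follows (the coverage check below captures the spirit of ICC, not its exact formulation: a hitting set is minimal exactly when every element covers some set that no other chosen element covers):

```python
def hits_all(candidate, sets):
    """True if `candidate` intersects every set in the collection."""
    return all(candidate & s for s in sets)

def is_minimal_by_subsets(candidate, known_mhs):
    """Classic subset checking: its cost grows with the number of
    already known minimal hitting sets."""
    return not any(m < candidate for m in known_mhs)

def is_minimal_by_coverage(candidate, sets):
    """Coverage-style check: `candidate` is a minimal hitting set iff
    it hits all sets and each of its elements is the unique hitter of
    at least one set; cost does not depend on how many MHSs exist."""
    return hits_all(candidate, sets) and all(
        any(s & candidate == {e} for s in sets) for e in candidate)

sets = [{1, 2}, {2, 3}, {1, 3}]
print(is_minimal_by_coverage({1, 2}, sets))     # True
print(is_minimal_by_coverage({1, 2, 3}, sets))  # False: every element redundant somewhere
```

      The second check interrogates only the candidate and the input sets, which is why a coverage-based criterion scales to instances where the number of MHSs explodes.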

    • User Rating Prediction Based on Trust-Driven Probabilistic Matrix Factorization

      2018, 29(12):3747-3763. DOI: 10.13328/j.cnki.jos.005322


      Abstract: The development of the Internet has brought convenience to the public, but it also burdens users with making choices among enormous amounts of data, so recommender systems based on user understanding are urgently needed. Different from traditional techniques that usually focus on individual users, social-based recommender systems perform better by integrating social influence modeling to achieve more accurate user profiling. However, current works usually model influence in a simplistic way, largely ignoring its intrinsic mechanism. To solve this problem, this paper studies the social influence among users, which affects both ratings and user attributes, and proposes a novel trust-driven probabilistic matrix factorization (TPMF) algorithm that merges these two mechanisms. Furthermore, since different users should have personalized parameters, the study clusters users according to rating correlation and maps them to corresponding weights, thereby achieving personalized selection of users' model parameters. Comprehensive experiments on open data sets validate that TPMF and its derived algorithms predict users' ratings effectively compared with several state-of-the-art baselines, demonstrating the capability of the presented influence mechanism and technical framework.
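
      The PMF baseline that TPMF builds on can be sketched with plain gradient descent (the hyperparameters and the tiny rating matrix are illustrative; TPMF's trust-driven influence and personalized-parameter terms are not modeled here):

```python
import numpy as np

def pmf_train(R, mask, d=4, steps=2000, lr=0.02, reg=0.02, seed=0):
    """Plain probabilistic matrix factorization fitted by full-batch
    gradient descent on the observed entries.
    R: rating matrix; mask: 1.0 where a rating is observed."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = 0.1 * rng.standard_normal((n, d))   # user latent factors
    V = 0.1 * rng.standard_normal((m, d))   # item latent factors
    for _ in range(steps):
        E = mask * (R - U @ V.T)            # error on observed entries only
        U += lr * (E @ V - reg * U)
        V += lr * (E.T @ U - reg * V)
    return U, V

# Toy 3x3 rating matrix; 0 marks an unobserved entry.
R = np.array([[5.0, 4.0, 0.0],
              [4.0, 0.0, 3.0],
              [0.0, 2.0, 1.0]])
mask = (R > 0).astype(float)
U, V = pmf_train(R, mask)
```

      The product U @ V.T fills in the unobserved entries, which is the rating prediction step; TPMF additionally couples each user's factors to those of trusted users.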

    • Batch Edge-Removal Clustering Based on Random kNN Graph

      2018, 29(12):3764-3785. DOI: 10.13328/j.cnki.jos.005327


      Abstract: By generalizing batch edge-removal clustering algorithms, the clustering problem can be separated into two subproblems: deciding which edges to remove and constructing the adjacency graph. This paper first proposes an edge-removal criterion based on the shift of data objects under local Gaussian smoothing. It then studies the properties an adjacency graph should have to be suitable for clustering, and suggests a random kNN (RkNN) graph obtained by introducing a random factor into the kNN graph. A proof is given that the RkNN graph enhances the local connectivity of the graph and makes the clustering result less dependent on the removal of particular edges. The resulting algorithm, RkNNClus, is simple and efficient and does not require the number of clusters to be specified. Experiments on synthetic and real datasets demonstrate the effectiveness of the method.
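
      One plausible construction of a random kNN graph can be sketched as follows (the paper's exact randomization rule may differ; keeping each kNN edge independently with probability p is an illustrative assumption):

```python
import math
import random

def rknn_graph(points, k, p=0.8, seed=0):
    """Sketch of a random kNN graph: for each point, rank the others
    by Euclidean distance and keep each of the k nearest neighbours
    with probability p.  Edges are undirected (i, j) with i < j."""
    rng = random.Random(seed)
    edges = set()
    for i, a in enumerate(points):
        ranked = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(a, points[j]))
        for j in ranked[:k]:
            if rng.random() < p:
                edges.add((min(i, j), max(i, j)))
    return edges

# Two well-separated pairs: with p = 1.0 this degenerates to the
# plain 1NN graph, linking each point to its nearest neighbour.
print(rknn_graph([(0, 0), (0, 1), (5, 5), (5, 6)], k=1, p=1.0))
```

      With p < 1, different runs keep different edge subsets, which is the randomization whose connectivity benefit the paper analyzes.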

    • Manifold Dimensional Reduction Algorithm Based on Tangent Space Discriminant Learning

      2018, 29(12):3786-3798. DOI: 10.13328/j.cnki.jos.005392


      Abstract: Several effective dimensionality reduction algorithms based on image sets have been developed. Their core is to perform a geometry-aware dimensionality reduction from the original manifold to a lower-dimensional, more discriminative manifold. Projection metric learning (PML) is such an algorithm based on the Grassmann manifold. Built on the projection metric and the RCG algorithm, PML has achieved good results on several benchmark datasets, but on some complicated face datasets, such as YTC, it obtains only 66.69% classification accuracy, and the RCG algorithm has poor time efficiency. For these reasons, a dimensionality reduction algorithm based on tangent space discriminant learning is presented. First, a perturbation is added to the projection matrix of PML to make it an SPD matrix. Second, LEM is adopted to map elements of the SPD manifold to a tangent space, and an iterative optimization algorithm based on eigendecomposition is applied to find the discriminant function and obtain the transformation matrix. Experimental results on several standard datasets show the superiority of the proposed algorithm over other state-of-the-art algorithms.

    • Parallel Pattern Matching Algorithm with Sparse Gap Constraint

      2018, 29(12):3799-3819. DOI: 10.13328/j.cnki.jos.005326


      Abstract: Pattern matching with wildcards is a classic problem, and matching with variable gap constraints has been a popular direction in this field in recent years. To meet the high-accuracy requirements of some query applications, this paper proposes an algorithm, SGPM-SAI, that obtains the complete solution of pattern matching under sparse gap constraints. SGPM-SAI first builds an index structure called W-SAM (wildcard suffix automaton) for the preprocessed text, then obtains the EndPos set of each pattern segment by searching for the string in the W-SAM, and finally derives the complete solution of pattern matching by intersecting the EndPos sets. Experimental results show that, excluding preprocessing time, SGPM-SAI performs at least 3 to 5 times better than competitive algorithms such as KMP, BM, AC, and suffix arrays. Compared with the latest version of the SAIL algorithm (SAIL-Gen), SGPM-SAI is significantly better under sparse gap constraints. In addition, this paper introduces parallel processing methods for SGPM-SAI to effectively utilize the massive parallel processing units of modern processors. Experimental results show that the parallel SGPM-SAI achieves significant acceleration and good parallel scalability, indicating that the presented method can take full advantage of the high parallel computation capability of modern many-core processors.
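
      The EndPos-intersection idea can be sketched as follows (naive string search stands in for the W-SAM index, and the final backward pass that prunes earlier segments without a full extension is omitted):

```python
def end_positions(text, segment):
    """All positions where `segment` ends in `text` (naive search;
    SGPM-SAI obtains these sets from a wildcard suffix automaton)."""
    n = len(segment)
    return {i + n for i in range(len(text) - n + 1)
            if text[i:i + n] == segment}

def match_with_gaps(text, segments, gaps):
    """Stage-by-stage matching of segments separated by gap
    constraints: gaps[i] = (lo, hi) wildcards allowed between
    segment i and segment i+1.  Returns the feasible end-position
    set of each segment, filtered against the previous stage."""
    sols = [end_positions(text, segments[0])]
    for seg, (lo, hi) in zip(segments[1:], gaps):
        prev = sols[-1]
        feasible = {e for e in end_positions(text, seg)
                    if any(lo <= e - len(seg) - p <= hi for p in prev)}
        sols.append(feasible)
    return sols

# "ab", then 0-2 wildcards, then "cd": only the first "ab" extends.
print(match_with_gaps("abXcdYab", ["ab", "cd"], [(0, 2)]))
```

      Representing each stage as a position set is what lets the matching work be shared across all occurrences of a segment instead of being redone per candidate occurrence.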

    • Direct Anonymous Authentication Protocol for Roaming Services Based on Fuzzy Identity

      2018, 29(12):3820-3836. DOI: 10.13328/j.cnki.jos.005302


      Abstract: To provide secure roaming services for mobile users in global mobility networks, many anonymous authentication protocols have been proposed in recent years, but most focus only on authentication and fail to satisfy many practical security requirements. To achieve anonymity, traditional anonymous roaming protocols rely on a temporary identity instead of the real identity; however, these schemes incur storage, communication, and computation overheads due to the update operations. To overcome these shortcomings, this paper proposes a fuzzy direct anonymous roaming mechanism for global mobility networks, in which a roaming user can complete legitimate identity authentication through a single round of message exchange with the foreign agent (FA). The mechanism not only achieves legitimate authentication of an anonymous identity through fuzzy identity, but also avoids the update operations required to maintain the "one at a time" property of temporary identities during roaming. Additionally, a security proof shows that the mechanism is provably secure in the CK security model. Comparative analysis shows that the proposal offers stronger security and anonymity with lower storage, communication, and computation overheads. Compared with traditional anonymous roaming mechanisms, the proposed mechanism is better suited to global mobility networks.

    • Database Authentication Watermarking Algorithm in Order Preserving Encrypted Domain

      2018, 29(12):3837-3852. DOI: 10.13328/j.cnki.jos.005303


      Abstract: Digital watermarking in the encrypted domain is a promising technology for privacy protection (via encryption) and integrity authentication (via watermarking) in cloud computing environments. Based on an order-preserving encryption scheme (OPES), the discrete cosine transform (DCT), cryptographic hashing, and watermarking, this paper proposes a new database authentication watermarking algorithm in the encrypted domain. First, the data in a database are encrypted with the OPES for privacy protection. Then, the encrypted data are divided into groups for DCT operations, and watermark bits generated by hashing the AC coefficients are embedded into the DC coefficients to authenticate the encrypted data. The receiver can determine whether the data have been tampered with by matching the hash of the AC coefficients against the watermark bits extracted from the DC coefficients. Thanks to the order-preserving property of the OPES, the watermark embedding process in the encrypted domain is lossless with respect to the plaintext data, so a legitimate user at the receiver side can recover the original database by directly decrypting the watermarked ciphertext data. Experimental results show that the algorithm can efficiently detect different tampering operations while protecting the privacy of the data content through encryption.

    • Method for BGP-LDoS Attack Detection of Inter Domain Routing System Based on the Theory of Catastrophe Equilibrium State

      2018, 29(12):3853-3867. DOI: 10.13328/j.cnki.jos.005310


      Abstract: The inter-domain routing system is a key infrastructure of the Internet. A large-scale low-rate denial of service attack against BGP sessions (BGP-LDoS) can trigger a wide range of cascading failures and paralyze the inter-domain routing system as a whole. Unfortunately, existing protection mechanisms and detection methods are not effective against this type of threat, which originates from the system's data plane. To tackle this issue, this paper analyzes the catastrophe process of the inter-domain routing system's state under a BGP-LDoS attack, and proposes a BGP-LDoS detection method based on the equilibrium state of catastrophe theory (ESCT). Flow periodicity characteristics, routing session characteristics, and system packet forwarding are chosen as detection features. Based on these features, the catastrophe model is selected and the state and control variables are determined. Using collected historical data as training samples, the catastrophe function is trained to establish the equilibrium surfaces of the normal and abnormal states. The trained cusp catastrophe model then monitors the running state of the system, and the attack is detected by using the bifurcation set function to judge whether the system will jump from the normal state to failure. Experimental results show that this method achieves good detection capability while monitoring only a few links and nodes, and it can provide a reliable reference for network administrators to detect and respond to attacks in advance.

    • Energy Starvation Avoidance Mobile Charging for Wireless Rechargeable Sensor Networks

      2018, 29(12):3868-3885. DOI: 10.13328/j.cnki.jos.005315


      Abstract: In wireless rechargeable sensor networks (WRSN), it is very challenging to schedule the mobile charger (MC) so that sensor nodes are replenished in time to avoid energy starvation during charging, while also reducing the MC's charging cost and the average charging delay. Most existing WRSN mobile energy replenishment schemes either cannot adapt to the dynamic and diverse energy consumption of sensor nodes in real environments, or neglect the timeliness and fairness of charging responses, which may lead to node failures caused by energy starvation and to low charging performance. The starvation issue worsens when there are many request nodes in the network. This paper explores the energy starvation issue in mobile charging for WRSN and proposes an energy starvation avoidance online charging scheme (ESAOC). To avoid node energy starvation, ESAOC first calculates the current energy consumption rate of each node from both its historical statistics and its real-time energy consumption. Then, for each request node, ESAOC calculates the maximum tolerable charging delay and the shortest waiting time for charging under the assumption that only that node would be selected as the next charging node. By comparing these two values, it always chooses as charging candidates the nodes that cause the fewest other request nodes to suffer energy starvation. Simulation results demonstrate that ESAOC effectively solves the energy starvation problem with lower charging latency and charging cost in comparison with other existing online charging schemes.

    • Robust Delay-Aware Routing Protocol

      2018, 29(12):3886-3903. DOI: 10.13328/j.cnki.jos.005328


      Abstract: Wireless sensor networks are widely used in many aspects of human social life, such as disaster warning and industrial control. Many such applications must meet not only strict real-time requirements on data but also robustness requirements, and the main challenge is to balance performance and robustness. To solve this problem, this article proposes a low-latency, highly robust routing protocol (RDR). RDR finds paths with shorter delay and stronger resistance to failure through its path detection mechanism, reducing the propagation delay, and it uses surrounding nodes to assist in forwarding so as to cope with unstable links and enhance robustness. As a result, nodes in the network maintain good delay performance in both good and bad network environments, and they can communicate with each other rather than being limited to one-way data collection. A large number of simulation experiments on delay and robustness are carried out on the OMNeT++ platform. Theoretical analysis and simulation experiments show that RDR guarantees low end-to-end delay and strong robustness.

    • Efficient Implementation Method for LFA

      2018, 29(12):3904-3920. DOI: 10.13328/j.cnki.jos.005426


      Abstract: Many related studies have shown that network failures occur inevitably and frequently on the Internet. When a failure occurs, the currently deployed intra-domain routing protocols need to re-converge, and during re-convergence packets may be lost due to inconsistent routing information, greatly reducing Internet routing availability. LFA (loop-free alternates) was proposed to cope with all single-failure scenarios, but the existing LFA implementation algorithms are time-consuming and require a large amount of router CPU resources. This paper proves that when a single fault occurs in a network, the backup next hop only needs to be computed for one specific node: all the other nodes affected by the fault share the same backup next hop as that node. Based on this property, the paper proposes two routing protection algorithms to protect networks with symmetric and asymmetric link weights, respectively. The results show that these approaches reduce computation overhead by more than 90% compared with LFA, incur a path stretch of less than 15%, and provide a protection ratio comparable to LFA's.
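
      The basic loop-free alternate condition that such computations rest on can be sketched as follows (a textbook RFC 5286 inequality check on a toy graph, not the paper's reduced-computation algorithm):

```python
import heapq

def dijkstra(graph, src):
    """Shortest-path distances from src; graph[u] = {v: weight}."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def lfa_candidates(graph, s, d):
    """Neighbours N of S satisfying the basic loop-free alternate
    inequality dist(N, D) < dist(N, S) + dist(S, D) (RFC 5286):
    forwarding to such an N can never loop back through S."""
    ds = dijkstra(graph, s)
    result = []
    for n in graph[s]:
        dn = dijkstra(graph, n)
        if dn[d] < dn[s] + ds[d]:
            result.append(n)
    return result

# Toy topology: S reaches D via A (cost 2) or via B (cost 3).
graph = {"S": {"A": 1, "B": 1},
         "A": {"S": 1, "D": 1},
         "B": {"S": 1, "D": 2},
         "D": {"A": 1, "B": 2}}
print(sorted(lfa_candidates(graph, "S", "D")))  # ['A', 'B']
```

      Evaluating this inequality requires a shortest-path tree rooted at every neighbour, which is exactly the per-router cost the paper's single-node result avoids recomputing for every affected destination.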

    • General SpMV Implementation in Many-Core Domestic Sunway 26010 Processor

      2018, 29(12):3921-3932. DOI: 10.13328/j.cnki.jos.005309


      Abstract: Sunway TaihuLight, the fastest supercomputer in the world, with a peak performance exceeding 100 PFlops, has been released. It uses heterogeneous many-core processors that differ from the existing pure-CPU, CPU-MIC, and CPU-GPU architectures. Each processor has 4 core groups (CGs), each consisting of one management processing element (MPE) and one computing processing element (CPE) cluster of 64 CPEs. The peak performance of a single processor is 3 TFlops/s, and the memory bandwidth is 130 GB/s. Sparse matrix-vector multiplication (SpMV) is a very important kernel in scientific and engineering computing; it is bandwidth-limited and subject to indirect memory accesses, so implementing an efficient SpMV kernel on the Sunway processor is a big challenge. This paper proposes a general heterogeneous many-core SpMV algorithm for the traditional CSR sparse matrix storage format, which partitions the task and the LDM space in detail, uses a cache mechanism with dynamic and static buffers to improve the hit rate of vector x, and applies dynamic-static task scheduling to achieve load balancing. In addition, several key factors affecting SpMV performance are analyzed, and adaptive optimization is carried out to further enhance performance. Finally, 16 matrices from the Matrix Market collection are used for testing. The experimental results show that the algorithm achieves a maximum bandwidth utilization of 86% and an average bandwidth utilization of 47%. Compared with the implementation on the management processing element, the speedup reaches up to 10x, with an average of 6.51x.
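
      The CSR storage layout and the indirect access pattern that makes SpMV bandwidth-bound can be sketched as follows (a scalar reference implementation; the paper's version partitions rows across the 64 CPEs and buffers x in LDM):

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for a sparse matrix in CSR format: row i's nonzeros
    are values[row_ptr[i]:row_ptr[i+1]], lying in the columns
    col_idx[row_ptr[i]:row_ptr[i+1]].  The indirect read x[col_idx[k]]
    is the irregular memory access that limits SpMV performance."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

# A = [[10, 0, 2],
#      [ 0, 3, 0],
#      [ 1, 0, 4]] stored in CSR form.
values  = [10.0, 2.0, 3.0, 1.0, 4.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [12.0, 3.0, 5.0]
```

      Because col_idx can point anywhere in x, caching x effectively, as the dynamic and static buffers above aim to do, is the key to approaching the 130 GB/s memory bandwidth.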

Contact Information
  • Journal of Software
  • Sponsor: Institute of Software, CAS, China
  • Postal code: 100190
  • Phone: 010-62562563
  • Email: jos@iscas.ac.cn
  • Website: https://www.jos.org.cn
  • ISSN 1000-9825
  • CN 11-2560/TP
  • Domestic price: 70 RMB
Copyright: Institute of Software, Chinese Academy of Sciences. Beijing ICP No. 05046678-4
Address: 4# South Fourth Street, Zhong Guan Cun, Beijing, Postal Code: 100190
Phone: 010-62562563  Fax: 010-62562533  Email: jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063