Abstract: Biological gene sequencing is one of the most common high-performance computing tasks in bioinformatics analysis. This paper aims to identify the main workload characteristics of a biological gene sequence trace (BGST) and to construct a general model for analyzing the biological gene sequence (BGS), which can be used for high-performance computing scheduling and performance optimization involving BGS. The study mainly analyzes the job arrival, runtime, and parallelism characteristics of BGST. Based on this analysis, it constructs several local models using exponential, Gamma, and Gaussian distributions and linear regression, and then combines the local models into a final model. Experimental results obtained with two general evaluation methods show that the new model follows a distribution trend consistent with that of BGST, which demonstrates the good versatility of the model.