Abstract: In an integrated service platform composed of multiple industry cloud service platforms, the underlying cloud workflow model repository grows as the number of cloud service platforms and their tenants increases. When the repository becomes very large, existing retrieval methods for large-scale process model repositories cannot meet the need for efficient retrieval, so a more efficient parallel retrieval method is required. To address this issue, this paper adopts two data partitioning modes, equipartition and clustering-based partitioning, to divide a large-scale cloud workflow model repository into smaller model sets. Combined with the improved process retrieval algorithm proposed in the authors' previous work, a series of data-partitioning-based parallel process retrieval approaches is put forward to accelerate large-scale process retrieval. These approaches comprise four algorithms: static and dynamic parallel retrieval, each applied to model sets produced either by uniform partitioning or by automatic clustering. Finally, experiments on a large-scale simulated process model library and a real cloud workflow model repository are conducted to evaluate the efficiency of the four parallel retrieval algorithms.
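The equipartition idea described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names (`equipartition`, `search_chunk`, `parallel_retrieve`), the representation of a model as a `(name, set of activity labels)` pair, and the Jaccard similarity are all assumptions introduced here, with Jaccard standing in for the improved retrieval algorithm from the authors' previous work. Each worker scans one fixed partition, which is one plausible reading of "static" parallel retrieval.

```python
from concurrent.futures import ThreadPoolExecutor


def equipartition(models, k):
    """Split models into k contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(models), k)
    chunks, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        chunks.append(models[start:start + size])
        start += size
    return chunks


def jaccard(a, b):
    """Placeholder similarity between two sets of activity labels (assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def search_chunk(chunk, query, threshold):
    """Sequentially scan one partition and keep models similar enough to the query."""
    hits = []
    for name, labels in chunk:
        score = jaccard(labels, query)
        if score >= threshold:
            hits.append((name, score))
    return hits


def parallel_retrieve(models, query, k=4, threshold=0.5):
    """Search each partition in its own worker, then merge and rank the hits."""
    chunks = equipartition(models, k)
    with ThreadPoolExecutor(max_workers=k) as pool:
        partial = list(pool.map(lambda c: search_chunk(c, query, threshold), chunks))
    hits = [hit for part in partial for hit in part]
    return sorted(hits, key=lambda h: h[1], reverse=True)
```

Threads keep the sketch short. For a CPU-bound similarity measure in Python, a process pool would be the more usual choice, which would require passing a module-level function instead of the lambda.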