Abstract: SPMD translation compiles programs written in one SPMD threaded programming model for multiple devices. Existing research assumes that threads are independent of one another except where they communicate through explicit synchronization. However, data dependences between threads, such as implicit synchronizations, create correctness pitfalls in SPMD translation. To deal with implicit synchronizations, this paper systematically analyzes them in the fine-grained SPMD programming model CUDA and reveals the correctness pitfalls in existing SPMD translation from CUDA to multi-core. It then proposes a method for detecting implicit synchronizations based on dependence analysis. Building on this detection, an optimized algorithm is designed that treats explicit and implicit synchronizations together by loop reordering. Experimental results show that, compared with existing SPMD translation, the detection method and the optimized algorithm handle the various kinds of implicit synchronization in fine-grained SPMD translation correctly and quickly at small cost, which helps the compiler produce correct and efficient code.
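To make the pitfall concrete, the following minimal C sketch mimics a CUDA-to-multi-core translation of a kernel in which each thread writes its own shared-memory slot and then reads its neighbour's slot without an explicit barrier, relying on the threads of a warp running in lockstep. The kernel, the size `N`, and the function names `naive_translation` and `fissioned_translation` are hypothetical and chosen only for illustration. Splitting the thread loop at the dependence point is used here as a simple stand-in for the loop-reordering treatment, whose actual details are not given in this abstract.

```c
#define N 8  /* hypothetical number of logical threads (one warp-sized group) */

/* Naive translation: a single thread loop runs the whole kernel body for
 * one logical thread before moving on to the next. The read of s[tid + 1]
 * happens before thread tid + 1 has written it, so a stale value is used. */
void naive_translation(const int *in, int *out) {
    int s[N + 1] = {0};                  /* stands in for shared memory */
    for (int tid = 0; tid < N; ++tid) {
        s[tid] = in[tid];                /* each thread writes its slot */
        out[tid] = s[tid] + s[tid + 1];  /* neighbour slot not yet written */
    }
}

/* Treated translation: the thread loop is split at the point where the
 * cross-thread dependence (the implicit synchronization) occurs, so every
 * write completes before any thread reads a neighbour's slot. */
void fissioned_translation(const int *in, int *out) {
    int s[N + 1] = {0};
    for (int tid = 0; tid < N; ++tid) {
        s[tid] = in[tid];                /* phase 1: all writes */
    }
    for (int tid = 0; tid < N; ++tid) {
        out[tid] = s[tid] + s[tid + 1];  /* phase 2: all reads */
    }
}
```

With the input 1, 2, ..., 8, the naive version returns each element unchanged because every neighbour slot still holds zero when it is read, whereas the split version returns the intended neighbour sums. A dependence analysis of the kind described above would have to flag the write to `s[tid]` followed by the read of `s[tid + 1]` as a cross-thread dependence in order to decide where the loop must be split.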