Abstract: Graph analysis databases such as MapD employ emerging manycore architectures, namely GPU and Xeon Phi processors, to support high-performance analytical processing, where the join operation remains a performance bottleneck for complex data schemas. In recent years, as heterogeneous processors have become mainstream high-performance computing platforms, research on in-memory join performance has extended its focus from multicore processors to the emerging manycore platforms. However, those efforts have not uncovered the inner relationships among join algorithm performance, join dataset size, and hardware architecture, and therefore cannot provide sufficient join-selection strategies for databases on future heterogeneous processor platforms. This paper targets in-memory join optimization techniques on multicore, Xeon Phi, and GPU platforms. By optimizing the hash table design, this work uses vector mapping instead of hash mapping to eliminate the performance effects of hashing overhead, so that the influence of hardware factors on in-memory join performance can be measured, including multicore cache size, Xeon Phi cache size, the Xeon Phi simultaneous multithreading mechanism, and the GPU SIMT (single instruction, multiple threads) mechanism. The experimental results show that caching and simultaneous massive threading are the key factors for improving in-memory join performance: the caching mechanism performs well for cache-fit joins, the simultaneous massive-threading mechanism of the GPU does well for big-table joins, and Xeon Phi achieves the highest performance for L2-cache-fit joins. The experimental results also reveal the relationship between in-memory join performance and heterogeneous processor hardware features, and provide optimization policies for the query optimizers of in-memory databases on future heterogeneous processor platforms.
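
To make the vector-mapping idea concrete, the following is a minimal sketch, not the paper's implementation, that contrasts a conventional hash-mapped probe with a vector-mapped probe. It assumes dense integer join keys drawn from [0, max_key); all type and function names are illustrative only.

```cpp
// Sketch only: contrasts hash mapping with vector mapping for a join probe,
// assuming dense integer keys in [0, max_key). Names are hypothetical.
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Tuple { uint32_t key; uint32_t payload; };

// Hash mapping: every probe pays for hash computation and collision handling.
std::size_t hash_join_probe(const std::vector<Tuple>& build,
                            const std::vector<Tuple>& probe) {
    std::unordered_map<uint32_t, uint32_t> table;
    table.reserve(build.size());
    for (const Tuple& t : build) table.emplace(t.key, t.payload);

    std::size_t matches = 0;
    for (const Tuple& t : probe)
        matches += table.count(t.key);          // hash + compare per probe tuple
    return matches;
}

// Vector mapping: the key itself is the slot index, so each probe is a
// single direct array access with no hash function involved.
std::size_t vector_join_probe(const std::vector<Tuple>& build,
                              const std::vector<Tuple>& probe,
                              uint32_t max_key) {
    std::vector<uint8_t> present(max_key, 0);   // dense key -> occupancy flag
    for (const Tuple& t : build) present[t.key] = 1;

    std::size_t matches = 0;
    for (const Tuple& t : probe)
        matches += present[t.key];              // direct index, no hashing
    return matches;
}
```

Because the vector-mapped probe reduces to one direct array access per tuple, its cost is dominated by cache behavior and available thread parallelism rather than by hash computation and collision handling, which is what allows the hardware factors listed above to be measured in isolation.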