Abstract:Existing malware similarity measurement methods cannot accommodate code obfuscation technology and lack the ability to model the complex relationships between malware. This study proposes a malware similarity measurement method called API relation graph enhanced multiple heterogeneous proxembed (RG-MHPE) based on multiplex heterogeneous graph to solve the above problems. This method first uses the dynamic and static feature of malware to construct the multiplex heterogeneous graph and then proposes an enhanced proximity embedding method based on relational paths to solve the problem that proximity embedding cannot be applied to the similarity measurement of the multiplex heterogeneous graph. In addition, this study extracts knowledge from API documents on the MSDN website, builds an API relation graph, learns the similarity between Windows APIs, and effectively slows down the aging speed of similarity measurement models. Finally, the experimental results show that RG-MHPE has the best performance in similarity measurement performance and model anti-aging ability.