Abstract: In recent years, the eMule network, a peer-to-peer (P2P) file-sharing network, has become increasingly popular. Along with its popularity, the demand to accurately determine peers in eMule has also grown, for two reasons: it is a critical step in accurately locating the sources of files in P2P file-sharing networks, and the wanton spread of vulgar content makes it necessary to censor eMule. This demand raises the problem of finding the optimal peer identifier in the eMule network. However, since the Kad ID (the widely used identifier in the eMule network) can be freely changed by eMule users, there exists Kad ID aliasing, in which a single peer corresponds to multiple Kad IDs; conversely, there also exists Kad ID repetition, in which multiple peers correspond to a single Kad ID. Therefore, it is difficult to accurately determine a peer by its Kad ID. This paper attempts to solve this problem. First, the stability factor (SF) of a peer identifier is defined to evaluate candidate identifiers. Then, a crawler named Rainbow is designed and implemented to collect peer information on the relationships among the candidate identifiers in the real eMule network. Rainbow is proved to converge and has low time and space complexity. Experimental results show that {userID} is the optimal peer identifier in the peer identifier set 2^{Kad ID, userID, IP} - {∅}, as {userID} has the largest SF value. Next, to quantify the extent of Kad ID aliasing, the relationship between {userID} and {Kad ID} is discussed. Lastly, the effectiveness of applying the optimal peer identifier is analyzed. Results show that peers are determined more accurately when {userID} is used as the peer identifier. Overall, the identification of the optimal peer identifier provides a basis for future research on the eMule network, and Rainbow serves as a useful tool for measuring the real eMule network.
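As an illustration of the candidate set and of the two phenomena named above, the following minimal sketch enumerates the non-empty subsets of {Kad ID, userID, IP} and counts Kad ID aliasing and repetition over a handful of observations. The records, the helper names `candidate_identifiers` and `fan_out`, and the use of userID as a stand-in for the peer are assumptions made for illustration only; the sketch does not reproduce Rainbow or the SF definition, neither of which is specified in this abstract.

```python
from collections import defaultdict
from itertools import combinations

# Attribute names taken from the abstract; everything else is hypothetical.
FIELDS = ("Kad ID", "userID", "IP")


def candidate_identifiers(fields=FIELDS):
    """Return every non-empty subset of the attribute set (power set minus the empty set)."""
    return [
        combo
        for size in range(1, len(fields) + 1)
        for combo in combinations(fields, size)
    ]


def fan_out(records, key, value):
    """Map each distinct `key` value to the set of `value` values observed with it."""
    seen = defaultdict(set)
    for rec in records:
        seen[rec[key]].add(rec[value])
    return dict(seen)


# Made-up observations, not measured data.
records = [
    {"Kad ID": "k1", "userID": "u1", "IP": "10.0.0.1"},
    {"Kad ID": "k2", "userID": "u1", "IP": "10.0.0.1"},
    {"Kad ID": "k3", "userID": "u2", "IP": "10.0.0.2"},
    {"Kad ID": "k3", "userID": "u3", "IP": "10.0.0.3"},
]

# Aliasing: one peer (approximated here by userID) seen with several Kad IDs.
aliasing = {
    user: kads
    for user, kads in fan_out(records, "userID", "Kad ID").items()
    if len(kads) > 1
}

# Repetition: one Kad ID seen with several peers.
repetition = {
    kad: users
    for kad, users in fan_out(records, "Kad ID", "userID").items()
    if len(users) > 1
}

print(len(candidate_identifiers()), aliasing, repetition)
```

With three attributes the candidate set has 2^3 - 1 = 7 members, which is why an exhaustive comparison of all candidates is feasible.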