Abstract: A novel method is proposed for discovering relation information among entities buried in different nested structures of XML documents. Given the entity types specified by a user, the method identifies the relations among them and extracts the relation instances together with their occurrence patterns in the XML documents. The method proceeds in three steps. First, it identifies and collects the XML fragments that contain all of the entity types given by the user. Second, it computes the similarity between fragments based on the semantics of their tags and on their structures, and clusters the fragments using an adaptively selected similarity threshold, so that fragments containing the same relation fall into the same cluster. Finally, it extracts the relation instances and their occurrence patterns from each cluster. Experimental results show that the method correctly identifies and extracts relation information among the given entity types from all kinds of XML documents with meaningful tags.
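
The three steps can be illustrated with a minimal sketch, which is not the method itself. It rests on several assumptions: fragment similarity is approximated by the Jaccard overlap of tag paths, which captures structure only and ignores tag semantics; the threshold is a fixed constant rather than adaptively selected; and clustering is a simple greedy pass. All function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def tag_paths(elem, prefix=""):
    """Set of tag paths from the fragment root down to every node."""
    path = f"{prefix}/{elem.tag}"
    paths = {path}
    for child in elem:
        paths |= tag_paths(child, path)
    return paths

def collect_fragments(root, entity_types):
    """Step 1: collect the smallest subtrees containing every entity tag."""
    wanted = set(entity_types)
    found = []

    def visit(elem):
        if not wanted <= {e.tag for e in elem.iter()}:
            return False
        child_hit = False
        for child in elem:  # prefer smaller fragments nested inside this one
            if visit(child):
                child_hit = True
        if not child_hit:
            found.append(elem)
        return True

    visit(root)
    return found

def jaccard(a, b):
    """Assumed stand-in similarity: overlap of two tag-path sets."""
    return len(a & b) / len(a | b)

def cluster_fragments(fragments, threshold):
    """Step 2: greedy clustering with a fixed (not adaptive) threshold."""
    clusters = []
    for frag in fragments:
        paths = tag_paths(frag)
        for members in clusters:
            if any(jaccard(paths, tag_paths(m)) >= threshold for m in members):
                members.append(frag)
                break
        else:
            clusters.append([frag])
    return clusters

def extract_instances(members, entity_types):
    """Step 3a: one tuple of entity values per fragment in a cluster."""
    return [
        tuple((frag.findtext(f".//{t}") or "").strip() for t in entity_types)
        for frag in members
    ]

def extract_pattern(members):
    """Step 3b: tag paths shared by every fragment in a cluster."""
    return sorted(set.intersection(*(tag_paths(m) for m in members)))

# Hypothetical sample: the author/title relation appears in two nestings.
SAMPLE = """
<lib>
  <book><title>A</title><author>X</author></book>
  <book><title>B</title><author>Y</author></book>
  <writer><name><author>Z</author></name><works><title>C</title></works></writer>
</lib>
"""

entity_types = ["author", "title"]
fragments = collect_fragments(ET.fromstring(SAMPLE), entity_types)
clusters = cluster_fragments(fragments, threshold=0.5)
for members in clusters:
    print(extract_pattern(members), extract_instances(members, entity_types))
```

In this sample the two `book` fragments share every tag path and land in one cluster, while the `writer` fragment, which nests the same entity types differently, forms its own cluster. A similarity measure that also compared tag meanings, as the abstract describes, could relate tags that differ in name but not in sense, which this sketch cannot do.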