Abstract: Based on an analysis of traditional entity identification methods, this paper presents a deep Web entity identification mechanism based on semantics and statistical analysis (SS-EIM), which comprises a text matching model, a semantic analysis model, and a group statistics model. SS-EIM adopts a three-phase gradual refinement strategy consisting of initial text matching, representation relationship abstraction, and group statistics analysis. Using text characteristics, semantic information, and constraints, the identification result is revised continuously to improve accuracy. A self-adaptive knowledge maintenance strategy keeps the content of the representation relationship knowledge database more complete and effective. Experiments demonstrate the feasibility and effectiveness of the key techniques of SS-EIM.
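
To make the three-phase refinement concrete, the following is a minimal illustrative sketch in Python, not the SS-EIM implementation itself. Everything in it is an assumption introduced for illustration: the names `KNOWLEDGE`, `normalize`, `ratio`, and `same_entity`, the thresholds `low` and `high`, the use of `difflib.SequenceMatcher` as the text matcher, and in particular the third phase, where a simple attribute-agreement ratio stands in for the group statistics analysis, which the abstract does not specify.

```python
from difflib import SequenceMatcher

# Hypothetical representation-relationship knowledge: variant -> canonical form.
KNOWLEDGE = {"intl": "international", "conf": "conference", "univ": "university"}


def normalize(text, knowledge):
    """Lowercase, drop periods, and replace known variants with canonical forms."""
    tokens = text.lower().replace(".", " ").split()
    return " ".join(knowledge.get(tok, tok) for tok in tokens)


def ratio(a, b):
    """Character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def same_entity(rec_a, rec_b, knowledge, key="name", low=0.3, high=0.9):
    """Decide whether two records (dicts of attribute -> string) denote one entity."""
    # Phase 1 (assumed form): initial text matching on the key attribute.
    score = ratio(rec_a[key].lower(), rec_b[key].lower())
    if score >= high:
        return True
    if score < low:
        return False

    # Phase 2 (assumed form): re-score after applying representation knowledge.
    score = ratio(normalize(rec_a[key], knowledge), normalize(rec_b[key], knowledge))
    if score >= high:
        return True

    # Phase 3 (stand-in for group statistics): share of the other common
    # attributes whose normalized values agree.
    shared = [k for k in rec_a if k in rec_b and k != key]
    if not shared:
        return False
    agree = sum(
        normalize(rec_a[k], knowledge) == normalize(rec_b[k], knowledge)
        for k in shared
    )
    return agree / len(shared) > 0.5
```

In this sketch, a pair such as "Intl. Conf. on Data Engineering" and "International Conference on Data Engineering" is undecided after plain text matching and is resolved only once the abbreviations are expanded in the second phase. Pairs that remain undecided fall through to the third phase, which uses the remaining attributes.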