Abstract: Building on traditional information extraction methods, this paper defines a formal model of entity activity based on case grammar and presents a method based on a support vector machine and extended conditional random fields to extract Web entity activities accurately. First, to train the machine learning models automatically, the study puts forward a heuristic method that transforms semantic role labeling training data into training data for entity activity extraction. Next, the study trains a support vector machine classifier and the extended conditional random fields on this training data. Third, using the classifier, the study identifies the sentences that contain Web entity activities. The paper also proposes extended conditional random fields that model frequency and relationship features. Traditional conditional random fields cannot model these features, whereas the new model can label entity activity information in natural language sentences more accurately. Finally, the experimental results show that the method is effective across multiple domains and can be applied to Web entity activity extraction.