In order to integrate and query irregular and dynamic information on WEB in a database-like fashion, the authors use object exchange model (OEM) to construct information model of WEB in this paper. To express each component of pages as an OEM object, the authors design an algorithm which extracts semi-structured data from HTML pages, and the testing results are given. This method can extract structured and semi-structured data. It has better applicability than other existing methods.