Abstract:With the increasing of information on Internet, Web mining has been the focus of data mining. Web classification predicts the labels of Web documents by learning lots of training examples with labels. It is very expensive to get these examples by manual. Web clustering groups the similar Web documents by a certain of metric of similarity. But the classical algorithms of clustering are aimless in searching the solution space and absent of semantic characters. In this paper, a semi-supervised learning strategy consists of tow stages is put forward.The fist atage,labels the documents the documents that include latent class variables by using Bayes latent semantic model.The second stage,based on the results from the first stage,labels the documents excluding latent class variables with the Naive Bayes models.Experimental results show that this algorithm has good precision and recall rate.