Abstract: Most Web bibliographies cannot meet the retrieval needs of researchers at different academic levels. This paper analyzes the cause of the problem and proposes constructing an auxiliary retrieval structure for Web bibliographies that helps users find more suitable bibliographies. Based on this idea, an algorithm for mining the longest sequential frequent phrases is designed to extract features from the bibliographies, and an extended feature hierarchical tree, which describes the relationships among features, among bibliographies, and between features and bibliographies, is presented together with its construction. Experiments show that the new method outperforms the popular TFIDF method in feature extraction. Theoretical analysis shows that the extended feature hierarchical tree has a convergent structure, reveals the relationships between phrases and bibliographies, and provides better retrieval assistance.
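
To make the notion of a "longest sequential frequent phrase" concrete, the following is a minimal illustrative sketch and not the algorithm designed in this paper. It assumes that a phrase is a contiguous word sequence, that support is the number of bibliography titles containing the phrase, and that titles are tokenised by lowercasing and splitting on whitespace. The function names and the `min_support` parameter are hypothetical.

```python
from collections import defaultdict


def frequent_phrases(titles, min_support):
    """Return {phrase: set of title indices} for every contiguous word
    sequence that occurs in at least `min_support` titles (assumed definition)."""
    tokenised = [t.lower().split() for t in titles]
    frequent = {}
    n = 1
    while True:
        occurrences = defaultdict(set)
        for idx, words in enumerate(tokenised):
            for i in range(len(words) - n + 1):
                occurrences[tuple(words[i:i + n])].add(idx)
        level = {p: ids for p, ids in occurrences.items()
                 if len(ids) >= min_support}
        # If no phrase of length n is frequent, no longer phrase can be.
        if not level:
            break
        frequent.update(level)
        n += 1
    return frequent


def is_subphrase(short, long):
    """True if `short` occurs as a contiguous run inside `long`."""
    k = len(short)
    return any(long[i:i + k] == short for i in range(len(long) - k + 1))


def longest_frequent_phrases(titles, min_support=2):
    """Keep only frequent phrases not contained in a longer frequent phrase."""
    frequent = frequent_phrases(titles, min_support)
    maximal = [
        p for p in frequent
        if not any(len(q) > len(p) and is_subphrase(p, q) for q in frequent)
    ]
    return sorted(" ".join(p) for p in maximal)


# Made-up example titles, for illustration only.
titles = [
    "Mining frequent phrases from Web bibliographies",
    "Web bibliographies retrieval with frequent phrases",
    "Feature hierarchical tree for retrieval",
]
print(longest_frequent_phrases(titles, min_support=2))
# ['frequent phrases', 'retrieval', 'web bibliographies']
```

In this sketch the single words "frequent", "phrases", "web" and "bibliographies" are discarded because each is contained in a longer frequent phrase, which is the sense in which only the longest phrases are retained as features.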