2003, 14(9):1578-1585.
Abstract: In this paper, the authors revisit the behaviour of HITS from a different point of view: a similarity-based analysis model is proposed to observe the distillation procedure. By defining a generalized similarity, an algorithm is presented that improves the quality of distillation using only hyperlinks. A topic exploration function is also integrated into the algorithm framework, enabling end users to search less popular topics when a query involves multiple topics. The experimental results reveal two benefits of the new algorithm: improved distillation quality without using any content information from pages, and an additional ability to explore the topics that emerge in the query results.
2003, 14(10):1768-1780.
Abstract: The World Wide Web (WWW) has grown into a large hyperlinked corpus with more than 800 million pages and 5 600 million hyperlinks. Moreover, no global ‘planning’ can be imposed on the creation of such a corpus, which poses challenges for many fields of Web research. On the other hand, hyperlinked Web pages in a networked environment can be a very rich information source for daily or business use, provided people have effective means of understanding the Web. Linkage analysis is playing an increasingly significant role in many of these fields. This paper presents recent advances in the research and application of linkage analysis of the World Wide Web. In particular, it surveys results on linkage analysis and its applications to Web searching, Web community discovery, and Web modeling.