Abstract:Tag cloud has been a popular facility used by social networks for online resource summarization and navigation. Tag selection, which aims to select a limited number of representative tags from a large set of tags, is the core task for creating tag clouds. Diversity of tag selection result is an important factor that affects user satisfaction. Information coverage and tag dissimilarity are two major perspectives for introducing diversity in tag selection. To improve information coverage and tag dissimilarity of tag selection result, this paper proposes three new tag selection approaches. In each approach, an objective function is defined to quantify both information coverage and tag dissimilarity of tags, and an approximate algorithm is designed to solve the corresponding maximization problem. Further the approximate ratio for each approximate algorithm is analyzed. The proposed and existing approaches are compared using tagging datasets extracted from the websites of CiteULike and Last.fm. The experimental results show that the new approaches perform better in terms of both information coverage and tag dissimilarity.