Exploiting Social Media Information for Relational User Attribute Inference
Author:
Affiliation:

    Abstract:

    Inferring user attributes is important for user profiling, retrieval, and personalization. Most existing work infers each user attribute independently and ignores the relations between attributes. In this work, a new method is proposed that infers user attributes via hypergraph learning. In the hypergraph, each vertex represents a social media user, and the hyperedges capture both the similarity relations among user-generated content and the relations between attributes. User attribute inference is formulated as a regularized label propagation problem on the constructed hypergraph, which allows a user's various attributes to be inferred jointly. Extensive experiments on a dataset collected from Google+, with full attribute annotations, demonstrate the effectiveness of the proposed approach for user attribute inference.
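The abstract does not spell out the propagation step, so the following is only a minimal sketch of regularized label propagation on a hypergraph, assuming the standard normalized hypergraph Laplacian formulation of Zhou, Huang, and Schölkopf rather than the paper's exact objective. The function name `hypergraph_label_propagation`, the parameter `alpha`, and the toy incidence matrix are illustrative assumptions, and the example handles a single attribute with two classes instead of several related attributes.

```python
import numpy as np


def hypergraph_label_propagation(H, Y, w=None, alpha=0.9):
    """Propagate labels over a hypergraph (hypothetical helper, not from the paper).

    H     : (n_vertices, n_edges) 0/1 incidence matrix; every vertex must
            belong to at least one hyperedge.
    Y     : (n_vertices, n_classes) initial labels, all-zero rows for
            unlabeled users.
    w     : optional hyperedge weights (defaults to 1 for each hyperedge).
    alpha : trade-off in (0, 1) between smoothness and fitting Y.
    """
    H = np.asarray(H, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n_vertices, n_edges = H.shape
    w = np.ones(n_edges) if w is None else np.asarray(w, dtype=float)

    d_v = H @ w            # weighted vertex degrees
    d_e = H.sum(axis=0)    # hyperedge degrees (number of member vertices)
    dv_inv_sqrt = 1.0 / np.sqrt(d_v)

    # Theta = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}
    left = dv_inv_sqrt[:, None] * H * (w / d_e)
    theta = left @ (H.T * dv_inv_sqrt[None, :])

    # Closed-form solution, up to a positive scale factor:
    # F = (I - alpha * Theta)^{-1} Y
    return np.linalg.solve(np.eye(n_vertices) - alpha * theta, Y)


# Toy data (assumed): six users, three hyperedges.
# e0 = {0, 1, 2}, e1 = {3, 4, 5}, e2 = {2, 3}
H = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 1],
    [0, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
])
# Only user 0 (class 0) and user 5 (class 1) are labeled.
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[5, 1] = 1.0

F = hypergraph_label_propagation(H, Y)
predicted = F.argmax(axis=1)
print(predicted)  # users 0-2 lean toward class 0, users 3-5 toward class 1
```

In this sketch each hyperedge groups users that share some property, so a labeled user's score spreads to everyone in the same hyperedge and, more weakly, across hyperedges that overlap.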

Get Citation

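Xiang Liancheng, Fang Quan, Sang Jitao, Xu Changsheng, Lu Dongyuan. Exploiting social media information for relational user attribute inference. Journal of Software, 2015, 26(S2): 145-154.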

History
  • Received: June 20, 2014
  • Revised: August 20, 2014
  • Online: January 11, 2016