Framework for Domain-Oriented Academic Literatures Retrieval

doi:10.3724/SP.J.1001.2013.04267


Journal of Software, Volume 24, Issue 4, 2013, pp. 798-809. DOI: 10.3724/SP.J.1001.2013.04267

Authors:

QIU Jiang-Tao
School of Economic Information Engineering, Southwestern University of Finance and Economics, Chengdu 610074, China

TANG Chang-Jie
School of Computer Science, Sichuan University, Chengdu 610065, China

LI Qing
School of Economic Information Engineering, Southwestern University of Finance and Economics, Chengdu 610074, China
Abstract:

A literature retrieval system that returns papers related to the domain of a user's query and ranks them by importance can help users quickly become familiar with an academic domain. This paper develops a framework for domain-oriented literature retrieval that combines link analysis and content analysis to find and rank the important papers in an academic domain. The framework defines a score function that evaluates both the importance of a paper and its relevance to the domain. The study first proposes a community-core discovery algorithm, which finds a collection of papers in the citation network that are related to the domain of the query and calculates an importance score for each of them. To assign a domain-relevance score to the remaining papers, a supervised non-negative matrix factorization method is also developed, which uses the identified domain-related papers as prior knowledge. Experiments conducted on synthetic and real datasets demonstrate the feasibility and applicability of the framework.
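As a rough illustration of the idea of scoring papers by both importance and domain relevance, the sketch below ranks papers in a toy citation network. It is not the community-core discovery algorithm or the supervised non-negative matrix factorization method of the paper, neither of which is specified in this abstract. It assumes a generic random walk with restart for the importance score and a simple weighted sum for the combined score. The function names, the `restart` and `alpha` parameters, and the toy data are all hypothetical.

```python
import numpy as np

def random_walk_scores(adj, seeds, restart=0.15, iters=100):
    """Random walk with restart on a citation graph (illustrative only).

    adj[i, j] = 1 means paper i cites paper j; `seeds` are the indices of
    papers assumed to match the query.
    """
    n = adj.shape[0]
    out_deg = adj.sum(axis=1, keepdims=True)
    # Row-normalize; a paper that cites nothing jumps uniformly to any paper.
    trans = np.where(out_deg > 0, adj / np.maximum(out_deg, 1), 1.0 / n)
    restart_vec = np.zeros(n)
    restart_vec[seeds] = 1.0 / len(seeds)
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        scores = restart * restart_vec + (1 - restart) * (scores @ trans)
    return scores

def combined_score(importance, relevance, alpha=0.5):
    """Weighted sum of max-normalized importance and relevance (assumed form)."""
    imp = importance / importance.max()
    rel = relevance / relevance.max()
    return alpha * imp + (1 - alpha) * rel

# Toy network: papers 0, 1 and 3 all cite paper 2; paper 2 cites nothing.
adj = np.array([
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
], dtype=float)
importance = random_walk_scores(adj, seeds=[0])
relevance = np.array([0.1, 0.9, 0.5, 0.2])  # made-up domain-relevance scores
ranking = np.argsort(-combined_score(importance, relevance))
print(ranking)
```

In this sketch `alpha` trades importance against relevance. How the framework actually combines the two quantities is defined in the full paper rather than in the abstract.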

Key words: non-negative matrix factorization; random walk; literature retrieval; citation network; link analysis


Citation:

QIU Jiang-Tao, TANG Chang-Jie, LI Qing. Framework for Domain-Oriented Academic Literatures Retrieval. Journal of Software, 2013, 24(4): 798-809.


History

Received: January 5, 2012
Revised: March 19, 2012
Online: March 26, 2013
