Undirected Graph Model for Expert Evidence Document Recognition

MAO Cun-Li, YU Zheng-Tao, WU Ze-Jian, GUO Jian-Yi, XIAN Yan-Tuan

School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China

Journal of Software, 2013, 24(11): 2734-2746. doi:10.3724/SP.J.1001.2013.04480

Fund Project: National Natural Science Foundation of China (61175068); Scientific Research Foundation for Returned Overseas Scholars, Ministry of Education; Major Special Project of the Scientific Research Fund of the Yunnan Provincial Department of Education; Open Fund of the Yunnan Key Laboratory of Software Engineering (2011SE14)

Abstract:

Expert evidence document recognition is a key step in expert search. This paper proposes an undirected graph model for expert evidence document recognition that combines the independent page features of candidate expert documents with the correlations among pages. First, the method analyzes independent page features such as words, URL links, and expert metadata in the various kinds of expert evidence documents, together with correlations such as links and content among candidate expert evidence documents. Next, the independent page features and the correlations among pages are integrated into an undirected graph to construct the recognition model. Finally, the feature weights of the model are learned with gradient descent, and expert evidence documents are recognized with Gibbs sampling. Comparison experiments verify the effectiveness of the proposed method, and the results show that it performs well.

Key words: expert evidence document; expert search; independent page feature; expert metadata; undirected graph model
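To make the pipeline in the abstract concrete, the following is a minimal sketch of Gibbs sampling over a pairwise binary undirected model, where each candidate page gets a label (1 for expert evidence document, 0 otherwise). It is not the paper's implementation: the scoring function, the three feature names, the edge list, and the weights `w` and `v` are all assumptions chosen for illustration, and the weights are fixed by hand here rather than learned by gradient descent.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gibbs_marginals(node_feats, edges, w, v, n_sweeps=2000, burn_in=200, seed=0):
    """Estimate P(y_i = 1) for every page by Gibbs sampling.

    Assumed unnormalized log-score of a labeling y (illustrative only):
        sum_i y_i * dot(w, x_i) + v * sum_{(i, j) in edges} [y_i == y_j]
    """
    rng = random.Random(seed)
    n = len(node_feats)

    # Adjacency lists built from the undirected edge list.
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Independent page score for each node: dot(w, x_i).
    unary = [sum(wk * xk for wk, xk in zip(w, x)) for x in node_feats]

    y = [rng.randint(0, 1) for _ in range(n)]  # random initial labeling
    counts = [0] * n

    for sweep in range(n_sweeps):
        for i in range(n):
            n1 = sum(y[j] for j in neighbors[i])  # neighbors labeled 1
            n0 = len(neighbors[i]) - n1           # neighbors labeled 0
            # Log-odds of y_i = 1 given all other labels.
            p1 = sigmoid(unary[i] + v * (n1 - n0))
            y[i] = 1 if rng.random() < p1 else 0
        if sweep >= burn_in:
            for i in range(n):
                counts[i] += y[i]

    kept = n_sweeps - burn_in
    return [c / kept for c in counts]


# Hypothetical binary features per page: [name_in_title, homepage_like_url, bias].
pages = [[1, 1, 1], [1, 0, 1], [0, 0, 1], [0, 0, 1]]
edges = [(0, 1), (2, 3)]   # hypothetical link/content relations between pages
w = [2.0, 1.5, -2.0]       # hand-picked feature weights (assumption)
v = 1.0                    # hand-picked agreement weight for related pages

marginals = gibbs_marginals(pages, edges, w, v)
labels = [1 if p > 0.5 else 0 for p in marginals]
print(labels)
```

In this sketch, the edge term rewards related pages for sharing a label, which is one simple way to let relations between pages influence the decision for each page. A page is labeled as an evidence document when its estimated marginal exceeds 0.5.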


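Cite this article:

Mao CL, Yu ZT, Wu ZJ, Guo JY, Xian YT. Undirected graph model for expert evidence document recognition. Journal of Software, 2013, 24(11): 2734-2746.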


History

Received: 2013-05-06
Revised: 2013-08-02
Published online: 2013-11-01
