Search of Genes with Similar Phenotype Based on Disease Information Network

doi:10.13328/j.cnki.jos.005445


Journal of Software, 2018, Volume 29, Issue 3, pp. 721-733

Authors:
HOU Yong-Xu, College of Computer Science, Sichuan University, Chengdu 610065, China
DUAN Lei, College of Computer Science, Sichuan University, Chengdu 610065, China; West China School of Public Health, Sichuan University, Chengdu 610041, China
LI Ling, College of Life Sciences, Sichuan University, Chengdu 610041, China
LU Li, College of Computer Science, Sichuan University, Chengdu 610065, China
TANG Chang-Jie, College of Computer Science, Sichuan University, Chengdu 610065, China

                    
CLC Number: TP311
Fund Project: National Natural Science Foundation of China (61572332, 81473446); China Postdoctoral Science Foundation (2016T90850); Fundamental Research Funds for the Central Universities (2016SCU04A22)


Abstract:

The results of the Human Genome Project have promoted the development of bioinformatics. Searching for disease genes that are functionally correlated, also called similar phenotype genes, on the basis of disease phenome similarity has become an emerging research topic because of its research value and wide range of applications. However, in the biomedical field, no previous work has applied computational methods to search for similar phenotype genes via a network consisting of "gene-disease-phenotype" relations. To fill this gap, this study builds a disease information network containing three heterogeneous node types (gene, disease, and phenotype) from an open disease database. In addition, an algorithm called gSim-Miner is designed to search for similar phenotype genes via the disease information network. Pruning strategies based on the characteristics of disease phenotype data are proposed to improve the efficiency of gSim-Miner. Experiments on real-world data sets demonstrate that the disease information network is feasible and that gSim-Miner is effective, efficient, and extensible.

Key words: phenotype similarity; search of similar genes; disease information network; gSim-Miner
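
The abstract does not give the details of gSim-Miner, so the following is only a minimal sketch of the general idea of searching for genes with similar phenotypes over a "gene-disease-phenotype" network. It assumes a PathSim-style measure along the symmetric path gene-disease-phenotype-disease-gene, which is a swapped-in technique and not the authors' algorithm, and it omits any pruning. The toy data and the names GENE_DISEASE, DISEASE_PHENOTYPE, phenotype_profile, similarity, and top_k_similar are all hypothetical.

```python
from collections import Counter

# Hypothetical toy network; identifiers are illustrative, not from any database.
GENE_DISEASE = {
    "geneA": ["disease1", "disease2"],
    "geneB": ["disease2"],
    "geneC": ["disease3"],
}
DISEASE_PHENOTYPE = {
    "disease1": ["phen1", "phen2"],
    "disease2": ["phen2", "phen3"],
    "disease3": ["phen4"],
}


def phenotype_profile(gene):
    """Count gene -> disease -> phenotype paths ending at each phenotype."""
    profile = Counter()
    for disease in GENE_DISEASE.get(gene, []):
        for phenotype in DISEASE_PHENOTYPE.get(disease, []):
            profile[phenotype] += 1
    return profile


def path_count(p, q):
    """Number of gene-disease-phenotype-disease-gene paths between two profiles."""
    return sum(count * q[phenotype] for phenotype, count in p.items())


def similarity(gene_x, gene_y):
    """PathSim-style score in [0, 1]; an assumed measure, not gSim-Miner's."""
    px, py = phenotype_profile(gene_x), phenotype_profile(gene_y)
    denom = path_count(px, px) + path_count(py, py)
    return 2 * path_count(px, py) / denom if denom else 0.0


def top_k_similar(query, k=2):
    """Rank all other genes by similarity to the query gene (brute force)."""
    scores = [(g, similarity(query, g)) for g in GENE_DISEASE if g != query]
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores[:k]


print(top_k_similar("geneA"))
```

In this toy network, geneA and geneB share disease2 and therefore two phenotypes, so geneB ranks above geneC, which shares none. The brute-force ranking compares the query with every other gene; pruning strategies such as those the abstract mentions would aim to avoid that full scan.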


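Citation:

HOU Yong-Xu, DUAN Lei, LI Ling, LU Li, TANG Chang-Jie. Search of Genes with Similar Phenotype Based on Disease Information Network. Journal of Software, 2018, 29(3): 721-733.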


History

Received: July 31, 2017
Revised: September 5, 2017
Online: December 5, 2017
