A Microarray Cluster Algorithm Based on Dominant Set Segmentation



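Authors:
Teng Li (滕莉), Department of Computer Science and Engineering, Fudan University, Shanghai 200433
Fu Xuping (付旭平), Institute of Genetics, School of Life Sciences, Fudan University, Shanghai 200433
Li Hongyu (李宏宇), Department of Computer Science and Engineering, Fudan University, Shanghai 200433
Li Yao (李瑶), Institute of Genetics, School of Life Sciences, Fudan University, Shanghai 200433
Chen Wenbin (陈文斌), Department of Mathematics, Fudan University, Shanghai 200433
Li Rongyu (李荣宇), Shanghai BioStar Genechip Inc. (上海博星基因芯片有限责任公司), Shanghai 200092
Shen Yifan (沈一帆), Department of Computer Science and Engineering, Fudan University, Shanghai 200433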

                    
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60473104 (国家自然科学基金)



Abstract:

Clustering algorithms are widely used in microarray data analysis to find genes or samples with similar expression. Most existing algorithms require some parameters to be set by hand, yet choosing them without prior domain knowledge is very difficult. To address this problem, an iterative clustering algorithm is proposed. First, the genes are re-sorted with the dominant set method so that highly similar genes are placed together in a specific region. Because cluster boundaries are usually hard to determine, a criterion is presented that partitions one cluster from the sorted data, based on the property that the distances between elements inside a cluster are much smaller than the distances between elements outside it. The cluster found is then removed from the current data set, and the process is repeated on the remaining data until the proposed stopping conditions are satisfied. The new algorithm is analyzed from several aspects and tested on published yeast cell-cycle microarray data. The results indicate that the method is practical and efficient and that it resists noise well.

Key words: microarray; dominant set; clustering; coexpressed; sorting
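
The iterative scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the sorting procedure in detail, the partition criterion, or the stopping conditions. The sketch therefore substitutes a standard dominant-set extraction by replicator dynamics, ranks items by the resulting weights, cuts at a hypothetical weight threshold (`support_tol`), and stops on hypothetical `min_size` and `min_cohesion` limits. The Gaussian similarity and its `sigma` parameter are likewise assumptions, as are all function names.

```python
import numpy as np


def similarity_matrix(X, sigma=1.0):
    # Assumed similarity: Gaussian kernel on squared Euclidean distance,
    # with a zero diagonal (no self-similarity).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    return A


def replicator_dynamics(A, iters=1000, tol=1e-10):
    # Discrete replicator dynamics started from uniform weights.
    # Items that keep a non-negligible weight form the extracted group.
    n = A.shape[0]
    x = np.full(n, 1.0 / n)
    for _ in range(iters):
        Ax = A @ x
        denom = x @ Ax
        if denom <= 0.0:
            break  # no similarity left among the remaining items
        x_new = x * Ax / denom
        converged = np.abs(x_new - x).sum() < tol
        x = x_new
        if converged:
            break
    return x


def peel_clusters(X, sigma=1.0, min_size=2, min_cohesion=0.1, support_tol=1e-5):
    # Extract one cluster, remove it, and repeat on the remaining rows.
    remaining = np.arange(len(X))
    clusters = []
    while len(remaining) >= min_size:
        A = similarity_matrix(X[remaining], sigma)
        x = replicator_dynamics(A)
        if x @ A @ x < min_cohesion:  # hypothetical stopping rule
            break
        order = np.argsort(-x)  # highest-weight items first
        members = order[x[order] > support_tol]  # hypothetical cut rule
        if len(members) < min_size:
            break
        clusters.append(sorted(remaining[members].tolist()))
        remaining = np.delete(remaining, members)
    return clusters, remaining.tolist()
```

Calling `peel_clusters` on an expression matrix with one row per gene returns the clusters as lists of row indices, plus the indices left unassigned. Because the dynamics start from uniform weights, two groups that are exactly symmetric in size and tightness can remain tied, so this sketch is best read as an outline of the extract-remove-repeat loop rather than a robust clustering tool.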


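Cite this article

Teng Li, Fu Xuping, Li Hongyu, Li Yao, Chen Wenbin, Li Rongyu, Shen Yifan. A microarray cluster algorithm based on dominant set segmentation. Journal of Software (软件学报), 2005, 16(9): 1591-1598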


History

Received: 2004-05-31
Last revised: 2005-02-04
