A Probabilistic Model of Local Sequence Alignment That
Simplifies Statistical Significance Estimation
Sean R. Eddy*
Howard Hughes Medical Institute, Janelia Farm Research Campus, Ashburn, Virginia, United States of America
Abstract
Sequence database searches require accurate estimation of the statistical significance of scores. Optimal local sequence
alignment scores follow Gumbel distributions, but determining an important parameter of the distribution (λ) requires
time-consuming computational simulation. Moreover, optimal alignment scores are less powerful than probabilistic scores that
integrate over alignment uncertainty ("Forward" scores), but the expected distribution of Forward scores remains unknown.
Here, I conjecture that both expected score distributions have simple, predictable forms when full probabilistic modeling
methods are used. For a probabilistic model of local sequence alignment, optimal alignment bit scores ("Viterbi" scores) are
Gumbel-distributed with constant λ = log 2, and the high scoring tail of Forward scores is exponential with the same
constant λ. Simulation studies support these conjectures over a wide range of profile/sequence comparisons, using 9,318
profile-hidden Markov models from the Pfam database. This enables efficient and accurate determination of expectation
values (E-values) for both Viterbi and Forward scores for probabilistic local alignments.
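
As an illustration of how a fixed λ = log 2 could be used, the following minimal Python sketch converts a bit score into a P-value and an E-value. It is not the author's implementation. The location parameters `mu` (Gumbel) and `tau` (exponential tail), the function names, and the choice to compute the E-value as the P-value multiplied by the number of comparisons are assumptions made for this example; the abstract states only the value of λ and the distribution families.

```python
import math

# Conjectured slope for bit scores: lambda = log 2 (natural log of 2).
LAMBDA = math.log(2)


def gumbel_pvalue(score, mu, lam=LAMBDA):
    """P(S > score) for a Gumbel distribution with location mu and slope lam.

    mu is a hypothetical location parameter that is assumed to be known.
    """
    # Survival function 1 - exp(-exp(-lam * (score - mu))).
    # -expm1(-y) equals 1 - exp(-y) and stays accurate when y is tiny.
    return -math.expm1(-math.exp(-lam * (score - mu)))


def exponential_tail_pvalue(score, tau, lam=LAMBDA):
    """P(S > score) for an exponential high-scoring tail with offset tau.

    tau is a hypothetical offset; the result is capped at 1 below the tail.
    """
    return min(1.0, math.exp(-lam * (score - tau)))


def evalue(pvalue, n_comparisons):
    """Expected number of scores at least this high in n_comparisons trials."""
    return pvalue * n_comparisons


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    p = exponential_tail_pvalue(score=25.0, tau=5.0)
    print(p, evalue(p, n_comparisons=1_000_000))
```

With λ = log 2, each additional bit of score halves the exponential tail probability, and for scores far above `mu` the Gumbel tail approaches the same exponential form.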
Citation: Eddy SR (2008) A Probabilistic Model of Local Sequence Alignment That Simplifies Statistical Significance Estimation. PLoS Comput Biol 4(5): e1000069.
doi:10.1371/journal.pcbi.1000069
Editor: Burkhard Rost, Columbia University, United States of America
Received December 5, 2007; Accepted March 26, 2008; Published May 30, 2008
Copyright: 2008 Sean Eddy. This is an open-access article distributed under the terms of the Creative Commo