Algorithms for Molecular Biology Hidden Markov Models Shamir (English e-book).pdf

Algorithms for Molecular Biology, Fall Semester, 2001
Lecture 5: December 13, 2001
Lecturer: Ron Shamir
Scribe: Roi Yehoshua and Oren Danewitz¹

¹ This scribe is partially based on Ophir Gvirtzer's and Zohar Ganon's scribe from Fall Semester 2000.

5.1 Hidden Markov Models

5.1.1 Preface: CpG islands

CpG is a pair of nucleotides, C followed by G, appearing consecutively in this order along one DNA strand. Due to biochemical considerations, CpG is relatively rare in most DNA sequences [4]. However, in particular short subsequences, several hundred nucleotides long, the pair CpG is more frequent. These subsequences, called CpG islands, are known to appear in biologically more significant parts of the genome, such as around the promoters or 'start' regions of many genes. The ability to identify CpG islands in the DNA will therefore help us spot the more significant regions of interest along the genome.

We will consider two problems involving CpG islands. First, given a short genome sequence, decide whether it comes from a CpG island. Second, given a long DNA sequence, locate all the CpG islands in it.

5.1.2 Reminder: Markov chains

Definition. A Markov chain is a triplet $(Q, \{p(x_1 = s)\}, A)$, where:
• $Q$ is a finite set of states. Each state corresponds to a symbol in the alphabet $\Sigma$.
• $p$ is the set of initial state probabilities.
• $A$ is the set of state transition probabilities, denoted by $a_{st}$ for each $s, t \in Q$.

For each $s, t \in Q$ the transition probability is:

$$a_{st} \equiv P(x_i = t \mid x_{i-1} = s) \tag{5.1}$$
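The Markov chain definition above can be made concrete with a small sketch. Because each symbol depends only on its predecessor, the probability of a whole sequence factors as $P(x) = p(x_1) \prod_{i \ge 2} a_{x_{i-1} x_i}$. The code below evaluates that product in log space. The names `INITIAL`, `TRANSITIONS`, and `sequence_log_probability` are hypothetical, and the numeric values are toy parameters invented for illustration; they are not estimates from the lecture or from real genomic data.

```python
import math

# Alphabet / state set Q: one state per nucleotide.
STATES = "ACGT"

# Initial state probabilities p(x_1 = s).
# Assumption: uniform, chosen only for illustration.
INITIAL = {s: 0.25 for s in STATES}

# Transition probabilities a_st = P(x_i = t | x_{i-1} = s).
# Assumption: toy values, not taken from the lecture. Each row sums to 1.
TRANSITIONS = {
    "A": {"A": 0.4, "C": 0.2, "G": 0.2, "T": 0.2},
    "C": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "G": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "T": {"A": 0.2, "C": 0.2, "G": 0.2, "T": 0.4},
}


def sequence_log_probability(seq, initial, transitions):
    """Return log P(x) = log p(x_1) + sum over i >= 2 of log a_{x_{i-1} x_i}."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    # Start with the log-probability of the first symbol.
    log_p = math.log(initial[seq[0]])
    # Add one transition term per adjacent pair of symbols.
    for prev, cur in zip(seq, seq[1:]):
        log_p += math.log(transitions[prev][cur])
    return log_p
```

Under these toy parameters, the sequence "CG" has probability 0.25 × 0.3 = 0.075, which is what `math.exp(sequence_log_probability("CG", INITIAL, TRANSITIONS))` evaluates to, up to floating-point rounding. Working with logarithms avoids numerical underflow when the sequence is long.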




