An Approach to Protein Name Extraction using Heuristics and a Dictionary
Kazuhiro Seki
Laboratory of Applied Informatics Research, Indiana University, 1320 East Tenth Street, LI 011,
Bloomington, Indiana 47405-3907. Email: kseki@
Javed Mostafa
Laboratory of Applied Informatics Research, Indiana University, 1320 East Tenth Street, LI 011,
Bloomington, Indiana 47405-3907. Email: jm@
This paper proposes a method for protein name extraction from
biological texts. Our method exploits hand-crafted rules based on
heuristics and a set of protein names (a dictionary). In contrast to
previously proposed methods, our approach avoids the use of natural
language processing tools such as part-of-speech taggers and
syntactic parsers so as to improve processing speed. We implemented
a prototype system for protein name extraction based on our method
and conducted evaluation experiments. The results showed that our
system produces results comparable to the state-of-the-art protein
name extraction system on multiple corpora.
Introduction
The ever-growing volume of digitized text has created a demand
for automated techniques to extract novel information. The
Message Understanding Conferences (MUCs) (Grishman and Sundheim,
1996) represent one of the major attempts to develop information
extraction (IE) techniques for general texts (newswire articles);
participants independently implement IE systems and compare their
performance on a common test set.
IE is also crucial and urgent in the field of molecular biology,
where there is a demand for automatically discovering molecular
pathways and interactions in the literature, a task that is
labor-intensive and time-consuming even for human experts.
Therefore, much research has explored IE techniques for biological
texts (Friedman et al., 2001; Ng and Wong, 1999; Proux et al., 1998;
Sekimizu et al., 1998; Thomas et al., 2000).
Our ultimate goal is to realize an automated system
to discover information