Research on a Text-Based Web Image Search Engine
Abstract (translated from the Chinese original)
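This work addresses the application setting of Web image search engines. With the goal of building a Web image search engine, it studies how to extract image-related information from HTML documents so that image retrieval can be implemented efficiently and accurately. On the basis of experiments and analysis on real data, it proposes several key techniques.
The proposed methods analyze and verify the information related to images in order to improve the [...] of images; summarize, through statistics, some latent regularities exhibited by HTML files; and examine how the LSI algorithm can be applied to an image search engine to integrate textual and content information, with a simple experiment used to validate the effect. A Web image search engine is also designed.
Abstract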
In this thesis, we present a design for a large-scale Web image search engine that relies mainly on text-based techniques.
We introduce and study a series of techniques related to Web image search engines, such as crawling, relevance ranking (VSM and LSI), information extraction, and indexing. These techniques are used in our system design.
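As an illustration of the vector space model (VSM) ranking mentioned above, the following minimal Python sketch weights terms by TF-IDF and orders documents by cosine similarity to a query. The function names (`idf_weights`, `to_vector`, `cosine`, `rank`) and the smoothed IDF formula are assumptions made for this example, not the formulation used in the thesis.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Compute a smoothed IDF weight for every term in the collection.

    `docs` is a list of token lists. The smoothing (+1 terms) is an
    illustrative choice that keeps every weight positive.
    """
    n_docs = len(docs)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}


def to_vector(tokens, idf):
    """Turn a token list into a sparse TF-IDF vector (term -> weight)."""
    tf = Counter(tokens)
    # Terms that never occur in the collection carry no weight and are dropped.
    return {t: c * idf[t] for t, c in tf.items() if t in idf}


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_tokens, docs):
    """Return document indices ordered by descending similarity to the query."""
    idf = idf_weights(docs)
    q_vec = to_vector(query_tokens, idf)
    scored = [(cosine(q_vec, to_vector(doc, idf)), i) for i, doc in enumerate(docs)]
    # Sort by score (highest first); ties are broken by document index.
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

In an image search setting, each "document" would be the bag of words collected for one image (for example its alt text, surrounding text, and page title), so that ranking the text ranks the images.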
We concentrate on how to extract information relevant to images from HTML documents more effectively and precisely. Based on experiments and analysis on real data, we propose the following key techniques for designing the system:
We carefully analyze the structure of HTML components, including the img tag, the a tag, the title of the web page, the anchor text of the web page, the URL of the image, the meta tag, the table tag, and the text surrounding the img tag, and we summarize nine extraction patterns for fetching information relevant to images. We also study three extraction methods: the DOM-based method, the string-based method, and the wrapper-based method.
We propose methods to filter out useless images according to file size, image width and height, and the number of times an image is referenced by img tags.
Through statistics over a large number of HTML documents, we identify some latent regularities, such as the differences between JPG and GIF images, between a tags and img tags, and among images with different reference counts.
We briefly study how LSI can be applied to integrate high-level and low-level information about images.
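To make the extraction and filtering ideas above more concrete, here is a minimal Python sketch built on the standard library's `html.parser`. It is not one of the thesis's nine extraction patterns. The class `ImageInfoParser`, the function `filter_images`, and the thresholds `min_side` and `max_refs` are hypothetical names and values chosen for illustration. The sketch also assumes that heavily referenced images are decorations, which the abstract does not state, and it skips the file-size criterion because that requires fetching each image.

```python
from collections import Counter
from html.parser import HTMLParser


def _to_int(value):
    """Parse a width/height attribute; None if missing or not a plain integer."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


class ImageInfoParser(HTMLParser):
    """Collect the page title and, for each <img>, a few related text fields."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []           # one dict per <img> tag, in document order
        self._in_title = False
        self._anchor_href = None   # href of the enclosing <a>, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            self._anchor_href = attrs.get("href")
        elif tag == "img" and attrs.get("src"):
            self.images.append({
                "src": attrs["src"],
                "alt": attrs.get("alt") or "",
                "width": _to_int(attrs.get("width")),
                "height": _to_int(attrs.get("height")),
                "link": self._anchor_href,
            })

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a":
            self._anchor_href = None

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def filter_images(images, min_side=60, max_refs=3):
    """Drop images that look like icons, spacers, or site-wide decorations.

    Both thresholds are illustrative assumptions, not values from the thesis.
    Images with unknown dimensions are kept.
    """
    ref_count = Counter(img["src"] for img in images)
    kept = []
    for img in images:
        w, h = img["width"], img["height"]
        too_small = (w is not None and w < min_side) or (h is not None and h < min_side)
        too_common = ref_count[img["src"]] > max_refs
        if not too_small and not too_common:
            kept.append(img)
    return kept
```

A typical use would be to call `feed()` on each crawled page, pool the resulting `images` lists across pages so that the reference count reflects the whole collection, and then index only what `filter_images` keeps.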
We design and implement a text-based Web image search engine. We introduce the overall architecture of the system and the relationships among its components, and we describe the function and implementation of some components in detail. Finally, we give a simple evaluation of search quality and performance.