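June 2008

摘要 (Abstract)

To cope with the rapid growth of information on the Web, and to retrieve useful information from it quickly and conveniently, search engines have gradually become part of everyday life. The "Zhuzhu" search engine system was developed against this background.

This thesis first gives a systematic introduction to the concept, history, and classification of search engines, so that readers gain a basic understanding of search engine technology. It then describes the Zhuzhu search engine system in detail.

Zhuzhu is a Web-based search engine oriented toward notebook computer brands. The front end follows the MVC pattern, Spring serves as the middle layer, and JDBC serves as the back end. The system is divided into three sub-modules.

The crawl module fetches the massive number of pages on the Web into the system. It is implemented with Heritrix.

The processing module parses the pages, extracts their useful content, and builds a lexicon for them. Because notebook computer brand names do not appear in existing lexicons, a dedicated lexicon file is created. The information files produced by parsing are then segmented into words and indexed, and the index is stored in the database. Indexing is implemented with the Lucene API, and page parsing with the HTMLParser API.

The user module is the system's user interface, through which users interact with the system. After a user enters the brand information to be retrieved on the query page, the system returns the required result set within an acceptable time. User requests are handled through DWR, which wraps AJAX, and retrieval is implemented with the Lucene API.

Keywords: search engine; Lucene; Heritrix

Abstract

To cope with the rapid growth of information on the network, and to let users obtain information from it quickly and easily, search engines have gradually come into people's lives. The Zhuzhu search engine system was built under these conditions.

This paper first systematically introduces the concept of search engines, their development history, and their categories, so that readers can understand search engine technology. It then describes the Zhuzhu search engine system in detail.

Zhuzhu is a Web-based search engine oriented toward notebook computer brands. The front end follows the MVC pattern, Spring serves as the middle layer, and JDBC serves as the back end. The system is divided into three sub-modules. The crawl module fetches the massive number of pages on the Web into the system; it is implemented by running Heritrix.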
The processing module parses the pages, extracts their useful content, and builds a lexicon for them. Because notebook computer brand names do not exist in the available lexicon, a dedicated lexicon file is created. The information files generated by parsing are then segmented into words and indexed, and the index is stored in the database. Indexing is implemented with the Lucene API, and page parsing with the HTMLParser API.
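
The role of the dedicated brand lexicon can be illustrated with a minimal sketch. It uses only the Java standard library and is not the Lucene API or the thesis's own code. The class name `BrandIndexSketch`, the sample lexicon entries, and the choice of forward maximum matching as the segmentation strategy are all assumptions made for illustration; the abstract does not state which segmentation algorithm the system uses. The sketch segments text against a lexicon and records which documents contain each term in an in-memory inverted index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: lexicon-based segmentation plus a tiny in-memory inverted index. */
class BrandIndexSketch {
    // Illustrative lexicon; the thesis does not list its dictionary entries.
    private final Set<String> lexicon;
    private final int maxWordLength;
    // term -> sorted ids of the documents that contain it
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();

    BrandIndexSketch(Set<String> lexicon) {
        this.lexicon = new HashSet<>(lexicon);
        int max = 1;
        for (String word : lexicon) {
            max = Math.max(max, word.length());
        }
        this.maxWordLength = max;
    }

    /** Forward maximum matching: take the longest lexicon word at each position, else one character. */
    List<String> segment(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            if (Character.isWhitespace(text.charAt(i))) {
                i++;
                continue;
            }
            int end = Math.min(text.length(), i + maxWordLength);
            String match = null;
            for (int j = end; j > i; j--) {
                String candidate = text.substring(i, j);
                if (lexicon.contains(candidate)) {
                    match = candidate;
                    break;
                }
            }
            if (match == null) {
                match = text.substring(i, i + 1); // unknown character becomes its own token
            }
            tokens.add(match);
            i += match.length();
        }
        return tokens;
    }

    /** Segments the text and records docId under every resulting term. */
    void addDocument(int docId, String text) {
        for (String token : segment(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Returns the ids of documents containing the exact term (empty if none). */
    Set<Integer> search(String term) {
        return postings.getOrDefault(term, new TreeSet<>());
    }

    public static void main(String[] args) {
        String lenovo = "\u8054\u60f3";         // "Lenovo" in Chinese
        String dell = "\u6234\u5c14";           // "Dell" in Chinese
        String notebook = "\u7b14\u8bb0\u672c"; // "notebook" in Chinese
        BrandIndexSketch index =
                new BrandIndexSketch(new HashSet<>(Arrays.asList(lenovo, dell, notebook)));
        index.addDocument(1, lenovo + notebook);
        index.addDocument(2, dell + notebook);
        System.out.println(index.search(notebook)); // ids of both documents
        System.out.println(index.search(lenovo));   // id of the first document only
    }
}
```

Without the brand entries in the lexicon, a brand name such as 联想 would fall through to single-character tokens and could not be looked up as one term, which is the problem the dedicated lexicon file addresses. In the actual system, segmentation and indexing are delegated to Lucene rather than written by hand like this.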


