Design And Implementation Of Topic Information System Based On Web

Posted on:2012-04-29

Degree:Master

Type:Thesis

Country:China

Candidate:X Q Jia

Full Text:PDF

GTID:2248330395956624

Subject:Software engineering

Abstract/Summary:

Quickly and accurately finding the information a user needs on the Web has become a serious problem. To address it, topic-specific Web mining has emerged in the information field. Its basic idea can be summarized as follows: according to topics defined by the user, a topic crawler traverses the network and collects the pages related to those topics; the collected pages are then intelligently analyzed; and finally the results are presented in a friendly way to meet the retrieval requirements of a specific topic.

This thesis analyzes the research content and current research problems of topic Web mining, and focuses on three issues. First, a topic crawler algorithm is proposed; the main work is to strengthen its anti-spam capability and to improve the accuracy with which the crawler judges topic relevance. Second, the pages collected by the improved topic crawler algorithm are analyzed and filtered. To facilitate research, the text filtering problem is transformed into a text classification problem. Because the Vector Space Model ignores the contextual information of the text, a feature selection algorithm based on community discovery is proposed to compensate for the text structure information that the Vector Space Model loses. Experimental results show that the classification method is effective and feasible in terms of precision and recall. Third, to achieve automatic acquisition of topic information, a Web-based topic information collection system model is given on the basis of the preceding algorithms.
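The abstract does not give the formulas used for topic relevance, so the following is only a minimal sketch of the standard Vector Space Model idea it refers to: texts become term-frequency vectors and are compared by cosine similarity. The function names, the tokenizer, and the threshold of 0.2 are illustrative assumptions, not the thesis's algorithm.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Map a text to a bag-of-words term-frequency vector (the VSM representation)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_on_topic(page_text, topic_text, threshold=0.2):
    """Hypothetical relevance test: keep a page whose similarity to the
    topic description meets an arbitrarily chosen threshold."""
    score = cosine_similarity(term_vector(page_text), term_vector(topic_text))
    return score >= threshold


# Example inputs (made up for illustration).
topic = "topic crawler web mining text classification"
page_a = "A topic crawler collects web pages for text mining"
page_b = "Recipe for tomato soup with basil"

print(is_on_topic(page_a, topic))
print(is_on_topic(page_b, topic))
```

Because the vectors only count terms, "topic crawler" and "crawler topic" receive the same representation. This is the loss of context that the abstract says the community-discovery-based feature selection is meant to compensate for.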

Keywords/Search Tags:

Vector Space Model, Topical Crawlers, Community Discovery, Similarity, Text Classification

Related items

1	Text Classification Based On Word Vector And Topic Vector
2	Research On Text Classification Of Web Data Mining
3	Semantic Similarity Calculation Text Field Vector Space Model
4	Text Similarity Computing Theory And Applied Research
5	Research On Rough Set Theory In Knowledge Discovery
6	Research And Implementation Of Text Classification System Based On VSM
7	The Research And Implement Of Automatic Text Classification System Which Is Based On Vector Space Model
8	Research And Implementation Of Text Similarity Algorithm Based On Semantic Fusion
9	Study Of Text Classification Model Based On Key Vector
10	Research On Topical Spider