Research And Implement Of Web Information Intelligence Collection And Classification

Posted on:2015-06-03

Degree:Master

Type:Thesis

Country:China

Candidate:F Liu

Full Text:PDF

GTID:2298330452994257

Subject:Computer technology

Abstract/Summary:

With the rapid development of science and technology, we have entered the digital information age. The Internet, as the world's largest repository of information, has become the main means of accessing information. Because information resources on the network are massive, dynamic, heterogeneous, and semi-structured, and lack unified organization and management, finding the required information quickly and accurately among these resources has become a major problem that Internet users urgently need solved. Web-based information collection and classification has therefore become an active research topic.

The goal of traditional Web information collection is to gather as many pages as possible, or even all the resources on the Web. In this process, little attention is paid to the order of acquisition or to the topics of the pages already collected. The collected page content is therefore disorganized, and collection consumes a great deal of system and network resources. An effective collection method is required to reduce disorderly and repeated page collection. The collected information must also be organized so that users can accurately locate what they need, and relying on manual classification for this is unrealistic. Automatic web page classification is therefore an effective means of organizing and managing information, and it is an important part of this thesis.

This thesis first introduces the background of the topic, the significance of the research, and the current state of research. It then describes the main techniques and the theory of the related algorithms, and designs an intelligent web page information collection and classification system. The system consists of two parts: information collection and information classification. The information collection part is based mainly on a web crawler that uses a breadth-first strategy, together with a topic-based web page information extraction method and rule templates, and converts free or semi-structured data into structured data. The information classification part, according to users' needs, uses the SVM algorithm combined with word segmentation and feature extraction to classify information, providing a full range of information services for users.
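
The collection part described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the in-memory `PAGES` dictionary, the `example.test` URLs, the `TEMPLATE` rules, and the function names `extract` and `crawl_bfs` are all hypothetical, and topic filtering is omitted. It only shows a breadth-first frontier that skips URLs already seen, plus a regular-expression rule template that turns semi-structured HTML into structured records.

```python
import re
from collections import deque

# Hypothetical in-memory "web" (URL -> HTML); a real crawler would fetch over HTTP.
PAGES = {
    "http://example.test/": (
        '<a href="http://example.test/a">A</a> '
        '<a href="http://example.test/b">B</a>'
    ),
    "http://example.test/a": (
        '<h1 class="title">Post A</h1><span class="date">2015-06-01</span> '
        '<a href="http://example.test/b">B</a>'
    ),
    "http://example.test/b": (
        '<h1 class="title">Post B</h1><span class="date">2015-06-02</span> '
        '<a href="http://example.test/">home</a>'
    ),
}

LINK_RE = re.compile(r'href="([^"]+)"')

# Hypothetical rule template: field name -> regex with one capture group.
TEMPLATE = {
    "title": re.compile(r'<h1 class="title">(.*?)</h1>'),
    "date": re.compile(r'<span class="date">(.*?)</span>'),
}


def extract(html, template):
    """Apply every rule; return a record, or None if any field is missing."""
    record = {}
    for field, rule in template.items():
        match = rule.search(html)
        if match is None:
            return None
        record[field] = match.group(1)
    return record


def crawl_bfs(seed, fetch, template, max_pages=100):
    """Visit pages in breadth-first order, never enqueueing a URL twice."""
    queue = deque([seed])
    seen = {seed}  # prevents repeated collection of the same page
    order, records = [], []
    while queue and len(order) < max_pages:
        url = queue.popleft()  # FIFO queue gives breadth-first order
        html = fetch(url)
        if html is None:
            continue
        order.append(url)
        record = extract(html, template)
        if record is not None:
            records.append({"url": url, **record})
        for link in LINK_RE.findall(html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order, records


order, records = crawl_bfs("http://example.test/", PAGES.get, TEMPLATE)
```

Here `order` lists the pages in the sequence they were visited and `records` holds one dictionary per page that matched every rule in the template. The `seen` set is what keeps pages that link back to each other from being collected more than once.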

Keywords/Search Tags:

Information collection, information extraction, information processing, SVM classification algorithm, information classification

Related items

1	Chinese BBS Information Extraction And Classification
2	Design And Implementation Of Chinese Webpage Automatic Collection And Classification
3	User Web Information Collection And Analysis System Based On The Smart Router
4	Research Of Network Information Collection And Intelligent Processing Technology
5	Research Of Web Chinese Information Intelligent Extraction And Classification
6	Research And Implement Of Web Information Intelligence Collection And Personalized Service System
7	Web Information Retrieval System Based On Classification Semantics
8	The Study And Implementation Of Web Information Extraction Mechanism Based On Classification Semantics
9	Research On The Author Style Classification And Recognition Technology Of Web Information
10	Research Of Chinese Text Classification Algorithms Based On VSM