The Design And Implementation Of Chinese Webpage Classification And Storage System

Posted on: 2008-04-26
Degree: Master
Type: Thesis
Country: China
Candidate: C L Yu
GTID: 2178360245496736
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet technology and the rapid growth of network resources, providing users with efficient and accurate service requires that the complex resources on the network be organized and classified in a reasonable way. Network information resources are characterized by massive volume, dynamism, heterogeneity, and semi-structure. As a result, they lack unified organization and management and appear chaotic, which makes Web retrieval difficult. Webpage classification technology can effectively organize and manage network resources and improve the efficiency of information retrieval, and it has become one of the hot spots of network retrieval research.

Because network resources are updated frequently, users may be unable to find information that has since been replaced. At present, China has not yet begun to recognize the importance of permanently preserving network resources. Permanently preserving website content from different periods keeps useful network resources from being lost to updates, protects those resources, and lets users search website information from any period.

This thesis uses webpage extraction, word segmentation, feature extraction, and classification as the methods for classifying webpages and storing webpage information incrementally. Through a comprehensive analysis of Chinese webpage structure and of permanent storage, it constructs a Chinese webpage classification and storage system. The system can accurately classify resources gathered from the network and store them incrementally, which makes user queries easier and at the same time saves storage space.

The thesis introduces the methods for extracting and segmenting Chinese webpage information and for extracting features. It analyzes the structure and characteristics of Chinese webpages and proposes the design and implementation of the Chinese webpage classification and storage system. The test results meet the system design requirements, and the system performs well in application.
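The abstract does not say how the incremental storage is implemented. As an illustration only, the following minimal Python sketch shows one common way to store page snapshots incrementally: a new version is recorded only when the page content has changed since the previous crawl. The class name `IncrementalStore`, its methods, and the use of SHA-256 content hashes are all assumptions made for this sketch, not details taken from the thesis.

```python
import hashlib


class IncrementalStore:
    """Hypothetical sketch: keep a version history per URL and store a
    new snapshot only when the page content differs from the last one."""

    def __init__(self):
        self.versions = {}  # url -> list of (timestamp, digest), in crawl order
        self.blobs = {}     # digest -> page text, each distinct text stored once

    def save(self, url, text, timestamp):
        """Record a crawl result. Returns True if a new version was stored.

        Assumes calls for a given URL arrive in increasing timestamp order.
        """
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        history = self.versions.setdefault(url, [])
        if history and history[-1][1] == digest:
            return False  # unchanged since the last crawl, nothing stored
        self.blobs.setdefault(digest, text)
        history.append((timestamp, digest))
        return True

    def load(self, url, timestamp):
        """Return the page text in effect at `timestamp`, or None."""
        result = None
        for ts, digest in self.versions.get(url, []):
            if ts <= timestamp:
                result = self.blobs[digest]
        return result
```

Under these assumptions, crawling an unchanged page repeatedly costs no extra storage, while earlier versions of a changed page remain retrievable by date.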
Keywords/Search Tags: webpage classification, information extraction, word segmentation, feature extraction, incremental storage