Study On Methods Of Network Information Processing Based On Web Mining

Posted on:2010-12-04

Degree:Master

Type:Thesis

Country:China

Candidate:P Liu

Full Text:PDF

GTID:2178360278457000

Subject:Management Science

Abstract/Summary:

As networks penetrate almost every field of society, information of all kinds is shared online to an unprecedented extent, and the volume of network data is growing exponentially. The Internet has become a low-cost, resource-rich data source that conceals many forms of information. However, existing search engines cannot analyze the content of Web pages to help users obtain information and locate data. They return a large number of pages that merely contain the words and phrases of the topic, and the rest is left to the user. This thesis proposes a method for network information processing based on Web mining. The method is as follows: (1) Study of an object-oriented corpus base and eigenvector (feature vector) sets. Constructing the corpus base shortens the vector representation process for Web text, reduces the dimensionality of the eigenvectors, and improves the execution efficiency of classification and clustering. In addition, the eigenvectors simplify the storage pattern, and classification becomes more precise. (2) An index-based MLDB is constructed to unify the storage standard for network data, which facilitates mining analysis, extraction of sensitive information elements, and location of data sources. (3) Study of the mining module, which comprises two parts. The first is a classifier built on the network information base; it introduces the χ² statistic algorithm and points out the contribution of feature words to categories. The second is a clustering algorithm for Web data sets that introduces the TF-IDF algorithm. Finally, the clustering results are fed into the classifier to produce the final categorization of the Web data sets, which supports information extraction...
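
The two statistics named in part (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy documents, the labels, the function names `tf_idf` and `chi_square`, and the choice of the standard textbook formulas (relative term frequency times log inverse document frequency, and the 2x2 contingency-table χ² statistic) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: (count / document length) * ln(N / document frequency).
    """
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def chi_square(docs, labels, term, category):
    """Chi-square statistic of a term with respect to a category.

    Built from a 2x2 contingency table:
      a = in category, has term      b = other category, has term
      c = in category, lacks term    d = other category, lacks term
    """
    a = b = c = d = 0
    for doc, label in zip(docs, labels):
        has_term = term in doc
        in_cat = label == category
        if has_term and in_cat:
            a += 1
        elif has_term:
            b += 1
        elif in_cat:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    # A zero margin means the term or category carries no information.
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


# Hypothetical toy corpus: four tokenized pages in two categories.
docs = [
    ["stock", "market", "rise"],
    ["stock", "fund", "market"],
    ["match", "goal", "team"],
    ["team", "win", "match"],
]
labels = ["finance", "finance", "sports", "sports"]

weights = tf_idf(docs)
stock_score = chi_square(docs, labels, "stock", "finance")
rise_score = chi_square(docs, labels, "rise", "finance")
```

In this toy corpus, "stock" appears in every finance page and in no sports page, so its χ² score for the finance category is higher than that of "rise", which appears in only one finance page. Within the first page, TF-IDF instead weights "rise" above "stock", because "rise" is rarer across the corpus. A χ² score of this kind could rank feature words for the classifier, while the TF-IDF vectors could serve as input to the clustering step.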

Keywords/Search Tags:

Web mining, Corpus base, Eigenvector, MLDB, Clustering

Related items

1	Search Of WEB Mining And Realization Of Intelligent WEB Mining System Based On MLDB
2	Hits Algorithm On Web Data Mining Research
3	Friends And Relatives Call Circle Mining Based On Spectral Clustering
4	Selection And Integration Algorithm Of Eigenvector In Spectral Clustering
5	Research On Multiway P-spectral Clustering Algorithm Based On Self-adaptive Neighborhood
6	Research On HITS Algorithm Of Web Structure Mining
7	Research On Syntactic Knowledge Mining And Extraction Based On English-Chinese Parallel Corpus
8	Research And Improvement On Automatic Construction System For Text Categorization Corpus
9	Research On Active Learning Based Automatic Corpus Annotation
10	Construction Of Chinese Email Corpus