Intelligent Service Oriented Study And Application On Web Content Computing

Posted on:2007-10-06

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y H Zhang

Full Text:PDF

GTID:1118360185951378

Subject:Pattern Recognition and Intelligent Systems

Abstract/Summary:

The Web is now the most important channel through which people acquire information and knowledge. However, its sheer size, diversity, dynamic nature and semi-structured form make its data difficult for machines to process. This has drawn many researchers to study how to retrieve information of interest from the enormous number of Web pages, how to convert that information into knowledge, and how to obtain personalized services from the Web. Current research on Web data can be roughly divided into three fields: web content mining, web usage mining and web structure mining. Web content data is the main carrier of Internet information; it comprises content data, markup (tags or tokens) and hyperlinks. Web content computing focuses on the content data of web pages, and its active topics include information extraction (IE), information retrieval (IR) and intelligent web services. Building on a survey of web content computing, this dissertation focuses on the following issues:

1. An approach named Incremental FP-Growth (iFP-Growth) is proposed, which can be applied in a dynamic environment to mine association rules. The data in web pages is semi-structured, irregular and dynamic, which makes computing on and mining web content difficult and complex. After surveying the relevant theories and approaches, we propose the iFP-Growth algorithm for mining association rules from web content data. In an application to the Chinese car market, our experiments show the efficiency of association rule mining for analysing car consumption preferences across car types, models and prices.

2. A model for text classification based on sentence correlation (TCSC) is proposed. To address the problems of word segmentation and word ambiguity (polysemy) that arise in information retrieval research on classifying and clustering Chinese web document sets, we present a method that, with the help of a corpus, uses Chinese sentences to represent the characteristics of a Chinese text document. The model incrementally updates the category corpus with the training documents, then computes the sentence correlation matrix from position weights and corpus item weights in order to classify documents. It avoids the word segmentation problem in Chinese documents and reduces the effect of ambiguous words in the classification phase.
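
The sentence-level classification idea described above can be illustrated with a small Python sketch. This is not the TCSC model itself, whose weighting formulas are not given in the abstract. Everything here is an assumption made for illustration: the class name SentenceCorpusClassifier, the use of character bigrams as corpus items (so that no word segmentation is needed), the position weight 1 / (1 + position), and the sentence-by-category score table that stands in for the correlation matrix.

```python
import re
from collections import defaultdict


def split_sentences(text):
    """Split on common Chinese and Western sentence terminators."""
    return [s for s in re.split(r"[。！？!?.]+", text) if s.strip()]


def char_bigrams(sentence):
    """Character bigrams of a sentence; no word segmentation is involved."""
    s = "".join(sentence.split())
    return {s[i:i + 2] for i in range(len(s) - 1)}


class SentenceCorpusClassifier:
    """Toy sentence-based classifier (hypothetical, not the thesis's TCSC)."""

    def __init__(self):
        # category -> bigram -> count; the count plays the role of an
        # assumed "corpus item weight".
        self.corpus = defaultdict(lambda: defaultdict(int))

    def update(self, category, text):
        """Incrementally add one training document to a category corpus."""
        for sentence in split_sentences(text):
            for gram in char_bigrams(sentence):
                self.corpus[category][gram] += 1

    def sentence_scores(self, text):
        """One row per sentence, one column (dict key) per category."""
        rows = []
        for pos, sentence in enumerate(split_sentences(text)):
            position_weight = 1.0 / (1 + pos)  # assumed: earlier sentences weigh more
            grams = char_bigrams(sentence)
            row = {}
            for category, items in self.corpus.items():
                total = sum(items.values()) or 1
                overlap = sum(items.get(g, 0) for g in grams)
                row[category] = position_weight * overlap / total
            rows.append(row)
        return rows

    def classify(self, text):
        """Return the category with the largest summed score, or None."""
        totals = defaultdict(float)
        for row in self.sentence_scores(text):
            for category, value in row.items():
                totals[category] += value
        return max(totals, key=totals.get) if totals else None
```

A usage example under the same assumptions: after clf.update("cars", "这款汽车价格便宜。汽车发动机很好。") and clf.update("sports", "足球比赛很精彩。球队赢得比赛。"), calling clf.classify("汽车价格上涨。") selects "cars", because only the car corpus shares bigrams with the query sentence. Calling update again with new documents extends a category corpus without retraining from scratch, which mirrors the incremental update mentioned in the abstract.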

Keywords/Search Tags:

Web Content Computing, Web Mining, Web Information Extraction, Web Text Classification, Web Intelligent Service

Related items

1	Research On The Key Techniques Of Web Information Intelligent Acquisition
2	Research And Implementation Of Web Content Monitoring
3	Study Of Critical Analysis And Intelligent Retrieval Of Emotional Music
4	Medical Advertisements Monitoring System Based On Web Content Mining
5	Design And Implementation Of Content Farm Filtering System Based On Text Analysis Techniques
6	Information Filtering Systems Based On Web Text Content And Design
7	Research On Key Problems In WEB Text Mining
8	Data Mining Research In Web Information Retrieval And Classification
9	Studies On Key Techniques Of Text Classification And Mining For Specific Domains
10	Study On Key Techniques Of Web Mining For Intelligent Information Retrieval