Web Text Mining Research Based On Subject-oriented Search Engine

Posted on: 2007-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: F Guo
GTID: 2178360182993932
Subject: Computer software and theory
Abstract/Summary:
With the development of network information technology and the gradual spread of Internet applications, the WWW has become a huge space for storing and publishing information. However, because Web data are unstructured and unindexed, it is hard to make full use of this abundant information. Discovering topics of interest in such a large volume of information requires research on Web text mining, which has become a promising direction in data mining. Based on the Gansu Province Natural Science Fund project "Profession-Oriented Application Search Engine", we developed a search engine oriented to Chinese personal names and studied classification and clustering techniques for Web text search engines. In the course of this research, we studied earlier work in depth, which forms the foundation of our own. Drawing on information retrieval (IR), information extraction, and data mining, we proposed compensated information extraction text classification (CIETC) and completed the Web text mining process of a search engine oriented to personal names. Compared with Vivisimo, the classification and clustering results are very good, because the application targets a specific professional domain. Practice has shown the method to be effective and practical.
Keywords/Search Tags: search engine, information extraction, text classification, natural language meaning, data mining