Search Engines, Web-based Resources

Posted on: 2008-09-28 | Degree: Master | Type: Thesis
Country: China | Candidate: J Zhang | Full Text: PDF
GTID: 2208360212499928 | Subject: Software engineering
With the rapid development of computer, multimedia, and modern communication technology, school libraries built primarily on printed literature are transforming into digital libraries built on electronic and virtual information. A digital library is an emerging kind of large-scale distributed information system that holds a large volume of information resources. Helping users find the resources they want is the function of a resource search engine.
Network information retrieval is a new mode of retrieving information resources. Its distinguishing feature is that the network environment makes information resources distributed, digital, and multimedia, which has changed every element of the retrieval process both quantitatively and qualitatively. Improving the performance of network information retrieval has long attracted attention in information science, computer science, and artificial intelligence. The user is both the starting point and the end point of an information retrieval system, so grasping user needs comprehensively and accurately is key to improving retrieval quality.
This thesis discusses the application of data mining to acquiring user needs and proposes a search engine model based on Web text collection resources. By mining and analyzing the collection of Web texts related to the user's interests, the model uncovers latent information needs that the user has not expressed, and it tracks changes in the user's interests by letting the user interactively revise the mining results.
After researching and analyzing the key technologies involved in the model, namely feature extraction, automatic word segmentation, machine learning, and automatic classification, the thesis presents the overall design of the model on this basis. It then describes how several of the techniques are implemented: the PCCS partial clustering and classification technique and the user interest representation technique. Finally, the thesis compares the performance of the PCCS algorithm with that of other algorithms and suggests directions for further improvement.
Keywords/Search Tags: information retrieval, network information retrieval, data mining, feature extraction, machine learning, automatic word segmentation, content retrieval
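
The abstract describes mining a user's interests from a collection of Web texts and tracking interest changes through the user's revisions, but it gives no algorithmic detail. The Python sketch below illustrates that general idea only. It is not the thesis's PCCS algorithm: it substitutes a plain term-frequency profile with a Rocchio-style feedback update, and every function name and parameter (`build_profile`, `revise_profile`, `beta`, `gamma`) is a hypothetical choice made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer; a stand-in for automatic word segmentation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_vector(text):
    """Feature extraction: raw term frequencies of one document."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_profile(viewed_docs):
    """Average the term vectors of the texts in the user's collection."""
    totals = Counter()
    for doc in viewed_docs:
        totals.update(term_vector(doc))
    n = max(len(viewed_docs), 1)
    return {t: w / n for t, w in totals.items()}


def revise_profile(profile, liked, disliked, beta=0.75, gamma=0.25):
    """Rocchio-style update (assumed here, not taken from the thesis):
    move the profile toward liked texts and away from disliked ones."""
    updated = dict(profile)
    for doc in liked:
        for t, w in term_vector(doc).items():
            updated[t] = updated.get(t, 0.0) + beta * w / len(liked)
    for doc in disliked:
        for t, w in term_vector(doc).items():
            updated[t] = updated.get(t, 0.0) - gamma * w / len(disliked)
    # Keep only terms that still carry positive weight.
    return {t: w for t, w in updated.items() if w > 0.0}


def rank(profile, candidates):
    """Order candidate texts by similarity to the interest profile."""
    return sorted(candidates,
                  key=lambda d: cosine(profile, term_vector(d)),
                  reverse=True)


profile = build_profile(["digital library search engine",
                         "library catalog search"])
ranked = rank(profile, ["football match results",
                        "search the digital library"])
revised = revise_profile(profile,
                         liked=["football scores"],
                         disliked=["library catalog"])
```

In this sketch, `rank` places texts that share terms with the user's collection first, and `revise_profile` shifts term weights after feedback, which is one simple way to follow a change in interests.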
