Research On Key Techniques Of Intelligent Meta-search Engine

Posted on: 2010-07-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H M Li
Full Text: PDF
GTID: 1118360302969352
Subject: Computer system architecture
Abstract/Summary:
Because existing search engines suffer from low recall and precision, they cannot meet users' need for fast, accurate information. A meta-search engine provides unified access to multiple existing search engines, which can increase search coverage of the Web and improve retrieval effectiveness. The key issues facing meta-search engines are improving their intelligence, designing advanced user interfaces based on a structured organization of the results, and developing personalized search.

This dissertation summarizes the state of the art of intelligent meta-search engines. It then designs an intelligent meta-search engine model based on Multi-Agent technology and studies several related key techniques. Data mining can extract hidden knowledge, so applying Web data mining to the search engine offers a novel way to exploit Web information. Agent technology is developing rapidly and can be applied to personalized intelligent information retrieval. Combining data mining and intelligent agent technology with the meta-search engine can improve the engine's intelligence. The contributions of this dissertation are as follows.

1. An architecture model of the intelligent meta-search engine based on Multi-Agent technology is designed, which incorporates the strengths of clustering search engines and personalized information retrieval. Mobile agents and stationary agents cooperate to make the system more efficient, and a parallel reduction algorithm is executed in the stationary agents to merge the search results, which avoids a bottleneck at the merging agent. In addition, an algorithm for building and updating the user profile is proposed, and a retrieval method that combines personalized search and clustering is presented.
This retrieval method can provide personalized service for users and improve retrieval precision; it can also organize the search results in a structured way, which makes them easier for users to browse.

2. A selection strategy for the underlying search engines, based on a virtual language model, is proposed. Search engine databases are associated with concepts, and the resource descriptions of the databases are acquired through static learning. When a query is received, it is mapped to several related concepts, and a personalized selection strategy for the underlying search engines is then applied, based on the virtual language model and the user model. The algorithm can remedy the problem caused by short queries in Web information retrieval and speed up database selection. The experimental results show that the proposed algorithm is more effective than CORI.

3. A result merging method based on Group Decision Making is presented to reduce the inequality among search engines. First, the relevance scores of documents are normalized by combining text analysis with an existing rank-based method. Second, an improved shadow document method is proposed to estimate the scores of non-relevant documents. Finally, the relevance of each search engine to the user's preferences is taken into account, and a merging method based on Group Decision Making is adopted to sort the search results, so that results more relevant to the user's intention are ranked higher in the merged list. The experimental results show that the proposed algorithm performs better than the individual search engines and is more effective than Round-robin, CombSum and CombMNZ.

4. A new clustering algorithm for Web search results based on conceptual grouping is proposed. The conceptual grouping method is improved so that it overcomes its previous limitations on queries.
The semantic relationships among index terms are mined, and the terms are grouped by semantic coherence to form candidate clusters related to the query topic; the search results are then assigned to the relevant clusters. The cluster labels are selected by calculating the importance of terms in the search results and in the clusters, so the labels are effective at identifying the topics. In the new algorithm, the term extraction method ensures that the cluster labels are unambiguous, and the conceptual grouping technique can discover the topics of the search results. The algorithm meets the demands of online, semantic, and overlapping clustering. Moreover, its computational complexity is lower. The experimental results show that the proposed algorithm outperforms the K-means algorithm. Compared with the Chinese clustering search engine bbmao, its cluster quality and cluster labels are similar; however, the selected labels are easier to understand.
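
For orientation, the three baseline merging methods named in contribution 3 (Round-robin, CombSum and CombMNZ) can be sketched as follows. This is a minimal illustration of the standard baselines only, not the dissertation's Group Decision Making method. The function names, the min-max normalization step, and the choice to count every engine that returned a document as a "hit" in CombMNZ are assumptions made for this sketch.

```python
from collections import defaultdict
from itertools import zip_longest


def min_max_normalize(scores):
    """Rescale one engine's raw scores into [0, 1] (assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(engine_scores):
    """CombSum: add up each document's normalized scores across engines."""
    fused = defaultdict(float)
    for scores in engine_scores:
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += s
    return dict(fused)


def comb_mnz(engine_scores):
    """CombMNZ: CombSum times the number of engines that returned the document."""
    sums = comb_sum(engine_scores)
    hits = defaultdict(int)
    for scores in engine_scores:
        for doc in scores:
            hits[doc] += 1
    return {doc: sums[doc] * hits[doc] for doc in sums}


def round_robin(ranked_lists):
    """Round-robin: take one result from each engine in turn, skipping duplicates."""
    merged, seen = [], set()
    for tier in zip_longest(*ranked_lists):
        for doc in tier:
            if doc is not None and doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


def rank(fused):
    """Order documents by fused score, highest first; ties broken by document id."""
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


if __name__ == "__main__":
    # Hypothetical scores from two underlying engines.
    engine_a = {"d1": 10.0, "d2": 5.0, "d3": 0.0}
    engine_b = {"d2": 0.9, "d4": 0.1}
    print(rank(comb_sum([engine_a, engine_b])))
    print(rank(comb_mnz([engine_a, engine_b])))
    print(round_robin([["d1", "d2", "d3"], ["d2", "d4"]]))
```

CombSum and CombMNZ both reward documents returned by several engines, with CombMNZ weighting that agreement more heavily, while Round-robin ignores scores entirely and interleaves by rank.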
Keywords/Search Tags: Web Information Retrieval, Meta-search Engine, User Profile, Clustering Search Engine, Intelligent Agent