Research On Vertical Retrieval System Of Nuclear Energy Based On User Behavior Analysis

Posted on: 2016-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: J Xu
Full Text: PDF
GTID: 2308330473957077
Subject: Software engineering
Abstract/Summary:
China’s www.nuclear.net.cn provides timely and comprehensive information. Extracting useful nuclear power information from scattered, noisy big data poses a great challenge to human information-processing ability. Because web pages are complex, general-purpose search engines cannot meet users’ exact requirements. To obtain more accurate and detailed professional information on nuclear power, and to supply the China nuclear network portal with timely, accurate, and comprehensive nuclear energy data, this dissertation studies how to acquire and classify massive amounts of nuclear data efficiently and how to retrieve that information.

First, the dissertation studies a topic crawler for the nuclear energy field, realizing the capture and de-noising of mainstream nuclear data. Second, the large volume of collected data is classified according to the channels of China’s nuclear network, using SVM and IKAnalyzer. Finally, we improve the ranking algorithm of the Lucene framework and propose a new retrieval ranking algorithm based on user behavior analysis, which can greatly improve the efficiency of search engine queries and provide more effective and accurate information for China’s nuclear network.

The similarity scoring algorithm is a key step in full-text information retrieval and allows the returned results to be displayed efficiently. After studying Lucene’s internal similarity scoring algorithm, this dissertation implements a vertical search system for nuclear energy based on an improved similarity scoring algorithm. The system analyzes the user’s most recent search and click behavior to obtain and rank a set of user-preference keywords.

The experimental studies in this dissertation show that the system can obtain nuclear energy information efficiently, ensure the accuracy of information classification, and effectively save human resources.
Because user search behavior reflects the user’s search purpose and search habits, results of higher interest to the user are listed at the top of the matching results, which conforms to the user’s search intent. Indexing the crawled information and verifying retrieval performance shows that our method outputs more accurate search results that accord better with user search behavior.
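The idea of ranking by user interest can be illustrated with a minimal sketch. This is not the dissertation's algorithm or Lucene's scoring API: the function names, the click weight, the boost factor, and the whitespace tokenization (standing in for a real analyzer such as IKAnalyzer) are all assumptions made for illustration. It derives weighted preference keywords from recent searches and clicks, then scales each result's base similarity score by the weight of the preference keywords it contains.

```python
from collections import Counter


def preference_keywords(recent_queries, clicked_titles, top_n=5, click_weight=2.0):
    """Build a keyword -> weight map from recent searches and clicks.

    Hypothetical weighting: each query term counts 1.0 and each term in a
    clicked title counts `click_weight`. Whitespace splitting is a stand-in
    for a real tokenizer.
    """
    counts = Counter()
    for query in recent_queries:
        for term in query.lower().split():
            counts[term] += 1.0
    for title in clicked_titles:
        for term in title.lower().split():
            counts[term] += click_weight
    # Highest weight first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_n])


def rerank(results, prefs, boost=0.1):
    """Re-sort (title, base_score) pairs by a preference-adjusted score.

    The adjusted score is base_score * (1 + boost * sum of the weights of
    preference keywords found in the title). `boost` is an assumed tuning knob.
    """
    def adjusted(item):
        title, base_score = item
        terms = set(title.lower().split())
        bonus = sum(weight for term, weight in prefs.items() if term in terms)
        return base_score * (1.0 + boost * bonus)

    return sorted(results, key=adjusted, reverse=True)


if __name__ == "__main__":
    prefs = preference_keywords(["reactor safety"], ["reactor cooling systems"])
    results = [("uranium market report", 1.0), ("reactor safety review", 0.9)]
    for title, score in rerank(results, prefs):
        print(title, score)
```

With no behavior data the order falls back to the base similarity score, so the adjustment only moves results when the user's recent activity supports it.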
Keywords/Search Tags: User Behavior Analysis, Similarity Scoring Algorithm, Support Vector Machine, Information Retrieval System