
Research And Application Of Intelligent Retrieval On Weibo Oriented Medical Information

Posted on: 2015-03-10  Degree: Master  Type: Thesis
Country: China  Candidate: X W Liu  Full Text: PDF
GTID: 2298330467963472  Subject: Biomedical engineering
Abstract/Summary:
With the widespread use of the Internet, the volume of data on the web is growing exponentially. How to find target information in data at this scale, and how to process it automatically, has become a focus of current research. The rate at which information is generated has risen sharply, and new media such as microblogs and WeChat are representative of this trend because they produce huge amounts of fragmented information. Combining traditional domain knowledge with intelligent techniques is a new research direction, and it has led to intelligent search engines built on massive data. The design goal of an intelligent search engine is to retrieve, from network resources, the information most valuable to the user according to the user’s request.

This thesis mainly describes the collection and storage of microblog information from the web, the theory and algorithms of intelligent information retrieval, and clustering algorithms and their applications in data mining. First, the Sina Weibo API is used to crawl relevant content according to predefined fields. The information is then stored in bulk in MongoDB, a non-relational database. Lucene 4.0 is used to build index files from the information stored in MongoDB, and the index files are saved locally. The open source framework Carrot2 is then connected to these index files. Carrot2 is a library that makes it possible to build a clustering-based search engine: given search keywords, the search engine returns a set of relevant results, which is then processed with the Lingo clustering algorithm. Finally, the clustering results are shown to the user as a tree diagram. The implemented modules respond to a shift in the biomedical model, in which awareness is moving from a disease-centered view to a patient-centered or human-centered view, and to people’s increasingly high demand for health.
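The index-then-search step described above can be illustrated with a minimal sketch. It is not the thesis implementation, which relies on Lucene 4.0 and MongoDB; it is a toy in-memory inverted index written with the Python standard library. The sample posts, the English text, and the names `POSTS`, `tokenize`, `build_index`, and `search` are all assumptions made for illustration. Real Weibo content would need a Chinese tokenizer.

```python
import re
from collections import defaultdict

# Hypothetical sample posts standing in for crawled microblog records.
POSTS = {
    1: "Flu vaccine clinics open this week",
    2: "Tips for managing diabetes with diet",
    3: "Flu symptoms and when to see a doctor",
    4: "Diabetes screening offered at local hospital",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(posts):
    """Map each token to the set of post ids that contain it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for token in tokenize(text):
            index[token].add(post_id)
    return index


def search(index, query):
    """Return the sorted ids of posts that contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # Intersect the posting sets; an unknown token yields an empty set.
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)


if __name__ == "__main__":
    idx = build_index(POSTS)
    print(search(idx, "flu"))
    print(search(idx, "diabetes diet"))
```

A production index would also store term positions and relevance scores, which this sketch leaves out to keep the keyword lookup itself visible.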
In the end, this thesis establishes an intelligent search system for medical and health information on microblogs. The final part of the thesis optimizes the clustering results generated by the system. The main work is correcting the dictionary files and stop word files referenced by the Lingo clustering algorithm in the system, which yields an improved clustering result diagram. To support future applications of the intelligent search engine implemented here, the system is analyzed and several suggestions for improvement are put forward, with the aim of making it more robust and improving its performance and functionality.
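Why the stop word file matters for cluster quality can be seen in a small sketch. This is not the Lingo algorithm, which derives cluster labels from frequent phrases and a matrix factorization; it is a simplified stand-in that labels each result with its most widely shared term. The sample results, the stop word set, and the function names are hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def group_by_label(docs, stop_words):
    """Group each doc under its most widely shared non-stop-word term."""
    token_sets = [set(tokenize(d)) - stop_words for d in docs]
    # Document frequency: in how many docs each remaining term appears.
    doc_freq = Counter(t for tokens in token_sets for t in tokens)
    groups = defaultdict(list)
    for doc, tokens in zip(docs, token_sets):
        if not tokens:
            groups["(other)"].append(doc)
            continue
        # Highest document frequency wins; ties break alphabetically.
        label = min(tokens, key=lambda t: (-doc_freq[t], t))
        groups[label].append(doc)
    return dict(groups)


# Hypothetical search results, in English for simplicity.
RESULTS = [
    "the flu vaccine is available",
    "the flu season is here",
    "diet tips for the diabetes patient",
    "diabetes screening is free",
]

if __name__ == "__main__":
    # With no stop words, function words become the labels.
    print(sorted(group_by_label(RESULTS, set())))
    # With a tuned stop word set, topical terms become the labels.
    print(sorted(group_by_label(RESULTS, {"the", "is", "for"})))
```

With an empty stop word set the groups are named after function words such as "is" and "the"; once those words are filtered out, the same results fall under "flu" and "diabetes", which is the kind of improvement that editing the stop word file aims for.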
Keywords/Search Tags: Microblogging, MongoDB, intelligent information retrieval, Carrot2, text clustering, Lingo algorithm