Similarity Research On Medical Literatures

Posted on: 2010-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: G G Zhao
Full Text: PDF
GTID: 2144360275465354
Subject: Computer applications
Abstract/Summary:
The rapid development of science and technology has benefited almost every field of scientific research, and with it has come an enormous volume of information. Database technology was developed to address the problem of storing massive amounts of data. As a result, attention has shifted to how data in different formats can be organized into one unified database so that retrieval becomes easier. Various academic databases exist at home and abroad, such as MEDLINE, SWlC, and CNKI. Although database resources are rich and retrieval methods are varied, retrieval results still fall short of users' expectations because they contain many irrelevant items. Consequently, the most pressing problem in the retrieval field is how to improve retrieval efficiency, accuracy, and relevance. We design a domain-independent storage approach for various kinds of medical document information and use the resulting database to unify the different data formats. Using this database and a suffix tree VSM, we build a content-based vector space model (VSM) for every document, calculate the similarity between documents with the vector cosine, and finally establish the Related Articles Database (RAD) for the Chinese biomedical engineering literature. Based on the RAD, we design a medical document information retrieval system. This thesis is divided into five chapters. The first chapter reviews research at home and abroad on related-articles database retrieval, the storage methods of current search engines, and storage methods for medical documents. The second presents the design, workflow, and functions of our system. The third describes the heterogeneous data storage approach in detail. The fourth explains how the RAD is built and evaluates the current computation method and the correctness of retrieval results based on the database. The final chapter summarizes our work.
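The similarity step described in the abstract (one vector per document, compared by vector cosine, with the most similar documents stored as related articles) can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes plain whitespace-separated word counts as vector features instead of suffix tree phrases, and the names `term_vector`, `cosine_similarity`, `related_articles`, and the sample documents are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a term-frequency vector for one document.

    Assumption: plain word counts stand in for the suffix tree
    phrase features mentioned in the abstract.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def related_articles(docs, top_k=2):
    """Map each document id to its top_k most similar other documents.

    The result plays the role of a tiny in-memory "related articles"
    table; a real system would persist it in a database.
    """
    vectors = {doc_id: term_vector(text) for doc_id, text in docs.items()}
    related = {}
    for doc_id, vec in vectors.items():
        scored = [
            (other, cosine_similarity(vec, vectors[other]))
            for other in vectors
            if other != doc_id
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        related[doc_id] = scored[:top_k]
    return related


# Hypothetical sample documents, for illustration only.
docs = {
    "d1": "ecg signal processing",
    "d2": "ecg signal analysis",
    "d3": "bone implant materials",
}
rad = related_articles(docs)
```

Precomputing the related list for each document, as `related_articles` does here, moves the pairwise similarity cost to build time so that a lookup at query time is a simple table read.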
Keywords/Search Tags:retrieve, topic-specific search engine, suffix tree, similarity computation