Fast Retrieval Method For Encyclopaedia Knowledge

Posted on: 2021-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 2518306107968769
Subject: Computer technology
Abstract/Summary:
With the development of the Internet, society has entered the information age. The way people search for information has gradually shifted from consulting libraries or databases directly to accessing electronic and online media, and the demands placed on information retrieval technology have become increasingly complex. Improving the speed and accuracy with which results are returned to users is therefore important. This thesis focuses on bionic encyclopedias and explores how to return query results efficiently and accurately. It combines the knowledge features of a bionic encyclopedia to construct a fast indexing method consisting of four main parts. First, intra-word aggregation and inter-word combination are used as the criteria of a vocabulary extraction algorithm that extracts professional (domain-specific) words. Second, TF-IDF and K-means are used to address the problem of short queries that carry little information. Third, TextRank is combined with the Latent Dirichlet Allocation (LDA) topic model to extract tags for each encyclopedia entry. Finally, content popularity is added to the ranking formula so that content users care about is displayed. Animal and plant information extracted from the Chinese Wikipedia forms the bionic encyclopedia entry library used in the experiments. The results show that the model outperforms the benchmark model: professional vocabulary extraction improves the accuracy of word segmentation, query extension improves the accuracy of the results, and tag extraction improves query speed.
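The abstract does not give the ranking formula, so the following is only a minimal sketch of how popularity might be folded into a TF-IDF relevance score. Everything in it is an assumption made for illustration: the function names (tfidf_vectors, cosine, rank), the weight alpha, the use of view counts as the popularity signal, and the linear blend of cosine relevance with log-scaled popularity. It is not the thesis's actual method.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build a TF-IDF vector (dict: term -> weight) for each tokenized entry."""
    n = len(docs)
    # Document frequency: number of entries that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so that every term keeps a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (cnt / len(doc)) * idf[t] for t, cnt in tf.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def rank(query, docs, views, alpha=0.2):
    """Rank entries by a blend of TF-IDF relevance and popularity.

    alpha is a hypothetical weight: 0 ignores popularity, 1 ignores relevance.
    views is a hypothetical popularity signal (one view count per entry).
    Returns a list of (entry index, score) pairs, best first.
    """
    vectors, idf = tfidf_vectors(docs)
    # Query terms that never occur in the entry library are ignored.
    q_tf = Counter(t for t in query if t in idf)
    q_vec = {t: c * idf[t] for t, c in q_tf.items()}
    # Log-scale the view counts and normalize them to [0, 1].
    max_pop = max(math.log1p(v) for v in views) or 1.0
    scores = []
    for i, vec in enumerate(vectors):
        relevance = cosine(q_vec, vec)
        popularity = math.log1p(views[i]) / max_pop
        scores.append((i, (1 - alpha) * relevance + alpha * popularity))
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

Calling rank(query_tokens, entry_tokens, view_counts) returns entry indices with their blended scores. With a small alpha, textual relevance dominates and popularity mainly breaks ties between similarly relevant entries.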
Keywords/Search Tags:information retrieval, query extension, label extraction, professional word extraction