Design And Implementation Of Query And Visualization System For Scientific And Technological Articles

Posted on: 2018-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y Dong
GTID: 2348330518476611
Subject: Computer technology
Abstract/Summary:
With the advent of the age of big data, the Internet holds an ever-growing volume of data, and it is increasingly difficult to find the key information one needs in a large body of scientific papers. As the "Internet +" action plan advances, all walks of life have joined the Internet family. This raises several questions: how to coordinate the members of this family, how to handle the data within it, how to let its members coexist harmoniously, how to make the family's data easy for members to use, and how to present these abstract data more intuitively. Answering them requires a good mechanism for categorizing the data on the Internet, and keyword extraction is a good way to approach this problem. Only once the content of a text has been extracted can mining and the retrieval of vital information proceed. Keyword extraction operates on texts. Whether the source is a web page or a plain text, it appears in the form of an article, so keyword extraction amounts to summarizing the content of articles.

Existing keyword extraction techniques do not consider the relationship between word meaning and word frequency, so they cannot effectively handle word sense disambiguation or the merging of synonyms. Most existing text similarity algorithms use statistical features to calculate the similarity between texts, which wastes both memory and time. This thesis therefore proposes an improved keyword extraction algorithm based on semantics and an improved semantic similarity algorithm based on semantic keywords. Both algorithms take semantics and word frequency into account to achieve better text mining, and the results are then displayed with visualization technology. The main work and results are as follows:

1. Keyword Extraction Algorithm Based on Thesaurus and Connected Graph. The Synonyms Cilin (Extended Version) semantic dictionary has a simple coding scheme, and its codes indicate similarity relationships between words. A connected graph involves weights and routes. Considering both semantics and word features, this thesis proposes an algorithm named KETCG (Keyword Extraction algorithm Based on Thesaurus and Connected Graph).

2. Text Similarity Algorithm Based on Semantic Dictionary and Word Frequency Information. Current text similarity algorithms almost all rely on statistical features to obtain the similarity between texts and do not take semantic relationships between words into account, so the calculated similarity is sometimes too high or too low. To take both the semantic dictionary and word frequency into account, the author proposes an algorithm named TSSDWFI (Text Similarity Based on Semantic Dictionary and Word Frequency Information). Experimental results show that both algorithms perform well: keyword extraction and text similarity computation achieve better performance and yield good text mining results.

3. To display query data more intuitively and with richer visualization, the author builds on the proposed algorithms and uses the ECharts (Enterprise Charts) visualization library to display the data as a word cloud and a force-directed graph.
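
The general idea behind item 2, combining a semantic dictionary with word frequency, can be illustrated with a minimal Python sketch. This is not the TSSDWFI algorithm, whose details the abstract does not give. The thesaurus entries, the codes, the level boundaries, and the function names `word_similarity`, `concept_counts`, and `text_similarity` are all hypothetical and chosen only for illustration. The sketch assumes a hierarchical five-level code per word, merges words that share a code into one concept, and then computes a cosine similarity over the merged frequency vectors.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy thesaurus: word -> hierarchical code.
# The entries and codes are invented for illustration; they are not
# taken from Synonyms Cilin (Extended Version).
TOY_THESAURUS = {
    "car": "Bo21A01",
    "automobile": "Bo21A01",
    "truck": "Bo21A05",
    "paper": "Dk03B02",
    "article": "Dk03B02",
}

# Assumed character offsets at which each of the five code levels ends.
LEVEL_ENDS = (1, 2, 4, 5, 7)


def word_similarity(a, b, thesaurus=TOY_THESAURUS):
    """Fraction of code levels shared from the root (0.0 if a word is unknown)."""
    if a == b:
        return 1.0
    code_a, code_b = thesaurus.get(a), thesaurus.get(b)
    if code_a is None or code_b is None:
        return 0.0
    shared = 0
    for end in LEVEL_ENDS:
        if code_a[:end] != code_b[:end]:
            break
        shared += 1
    return shared / len(LEVEL_ENDS)


def concept_counts(tokens, thesaurus=TOY_THESAURUS):
    """Count tokens after synonym merging: words with the same code share a key."""
    return Counter(thesaurus.get(token, token) for token in tokens)


def text_similarity(tokens_a, tokens_b, thesaurus=TOY_THESAURUS):
    """Cosine similarity over synonym-merged frequency vectors."""
    freq_a = concept_counts(tokens_a, thesaurus)
    freq_b = concept_counts(tokens_b, thesaurus)
    dot = sum(freq_a[key] * freq_b[key] for key in freq_a.keys() & freq_b.keys())
    norm = sqrt(sum(v * v for v in freq_a.values())) * sqrt(
        sum(v * v for v in freq_b.values())
    )
    return dot / norm if norm else 0.0
```

With this toy thesaurus, `text_similarity(["car", "paper"], ["automobile", "article"])` is approximately 1.0 because each word is merged with its synonym, whereas a purely lexical cosine over the raw tokens would be 0 since no token is shared.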
Keywords/Search Tags: Keyword Extraction, Connected Graph, Text Similarity, Data Mining, Visualization