Visual Analysis System Of Chinese Natural Language Processing Model

Posted on: 2021-04-06  Degree: Master  Type: Thesis
Country: China  Candidate: Z L Du  Full Text: PDF
GTID: 2428330605461310  Subject: Computer technology
Abstract/Summary:
With the widespread adoption of the Internet, text data from news sites, blogs, microblogs, forums, and other platforms is growing exponentially. How to make computers better understand natural language is a current research hotspot. Because every natural language is a highly complex symbol system, discovering knowledge from text is a very challenging task. LDA-based topic models and word embedding models are two effective natural language modeling techniques that can, to a certain extent, improve a computer's ability to understand natural language. We survey existing visualization tools for the LDA model and word embedding models and find that most of them are designed for English. In this thesis, we design and implement a Chinese visual analysis system for the LDA model and word embedding models. The functions provided by the system fall into the following two areas.

Visualization of the LDA-based topic model includes three parts: search results visualization, news topic visualization, and topic distribution visualization. Search results visualization extracts the news items selected by the user from the database and displays them on the page in chronological order. News topic visualization uses the LDA model component to train a topic model on the news from the dates selected by the user and displays the results visually on the page, which makes it convenient for users to understand the news topics and trending information. Topic distribution visualization marks the words belonging to different topics in each news text with different colors and displays the marked news on the page, which helps users understand the distribution of topics in each news item.

Visualization of word embedding models includes three parts: similar word visualization, similarity visualization, and word analogy visualization. Similar word visualization displays the words that are closest to the input word in the vector space, which helps users understand the distribution of words that are semantically similar to the input word. Similarity visualization determines the correlation between two words by calculating their cosine similarity (the cosine of the angle between their vectors) and displays the value on the page, which makes it convenient for users to analyze the connection between different words. Word analogy visualization draws an analogy between different words and lists the results on the page in descending order of similarity.
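
The three word embedding operations described in the abstract (nearest words, cosine similarity, and word analogy) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the vocabulary and three-dimensional vectors in `TOY_VECTORS` are invented for illustration, the function names are hypothetical, and the analogy is computed with the common vector-offset rule (b - a + c), which is an assumption because the abstract does not say how analogies are scored.

```python
import math

# Toy 3-dimensional vectors invented for illustration only. A real system
# would load vectors from a trained word embedding model instead.
TOY_VECTORS = {
    "king": [0.8, 0.7, 0.1],
    "queen": [0.8, 0.1, 0.7],
    "man": [0.2, 0.9, 0.1],
    "woman": [0.2, 0.1, 0.9],
    "apple": [0.1, 0.5, 0.5],
}


def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(target, vectors, exclude=(), topn=3):
    """Return up to topn (word, similarity) pairs, highest similarity first."""
    scored = [
        (word, cosine_similarity(target, vec))
        for word, vec in vectors.items()
        if word not in exclude
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


def most_similar(word, vectors=TOY_VECTORS, topn=3):
    """Words closest to the input word in the vector space."""
    return rank_by_similarity(vectors[word], vectors, exclude={word}, topn=topn)


def analogy(a, b, c, vectors=TOY_VECTORS, topn=3):
    """Rank answers to 'a is to b as c is to ?' using the offset b - a + c.

    The offset rule is an assumption; the source does not specify a method.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    return rank_by_similarity(target, vectors, exclude={a, b, c}, topn=topn)


if __name__ == "__main__":
    print(most_similar("king"))
    print(cosine_similarity(TOY_VECTORS["king"], TOY_VECTORS["queen"]))
    print(analogy("man", "king", "woman"))
```

Both `most_similar` and `analogy` return their results sorted from highest to lowest similarity, matching the ordering the abstract describes for the word analogy view. With real embeddings, the same ranking logic would run over a far larger vocabulary.
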
Keywords/Search Tags: LDA, word embedding, visualization