Research On Short Text Aspect Extraction Based On Topic Model And Word Embedding Mechanism

Posted on:2022-04-09

Degree:Master

Type:Thesis

Country:China

Candidate:H X Wu

GTID:2518306548961179

Subject:Engineering (Computer Technology)

Abstract/Summary:

Sentiment analysis is one of the most actively studied directions in text analysis, and its results can be applied quickly in industry. Aspect extraction is a basic and important subtask of sentiment analysis, and its output directly affects the quality of the sentiment analysis that follows. Traditional aspect extraction algorithms were designed for long texts such as newspapers, articles, and blogs; when these models are applied to short texts, they perform poorly. Meanwhile, with the growing popularity of the Internet, the volume of short text data has grown explosively, so algorithms designed specifically for this type of data are urgently needed. This thesis studies aspect extraction algorithms for short text scenarios. The main work and results are as follows:

(1) Short text data is characterized by a small vocabulary, high sparsity, and high ambiguity, and traditional long-text models handle these problems poorly. This thesis proposes an improved aspect extraction algorithm based on the biterm topic model (BTM). The original BTM does not consider how semantic relationships between words affect topic mining, and it also ignores the semantic information of context words. Two improvements are proposed: first, a word vector model is introduced to compute the relevance between words; second, a self-attention mechanism is introduced to strengthen the semantic relevance between words. Experiments on two standard data sets show that the improved model performs significantly better than the previous model.

(2) To address the high sparsity and insufficient contextual semantic information of short text data, this thesis proposes WESM, an aspect extraction algorithm based on a word embedding mechanism and a self-attention mechanism. WESM introduces both mechanisms on top of a word co-occurrence network, adding word-to-word correlation and contextual semantic information, which greatly alleviates the problem of polysemous words in the text. Experiments show that WESM performs well on two standard data sets.
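The ingredients named in the abstract (word pairs as in BTM, word-vector relevance, and self-attention over context words) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the toy two-dimensional embeddings, and the choice to use self-attention with identity projections (queries, keys, and values are all the input vectors) are assumptions made only for illustration.

```python
import math
from itertools import combinations


def extract_biterms(tokens):
    """Return all unordered pairs of distinct words (biterms) in one short text."""
    return [tuple(sorted(p)) for p in combinations(tokens, 2) if p[0] != p[1]]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections (assumption)."""
    d = len(vectors[0])
    outputs = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Each output is an attention-weighted average of all input vectors.
        outputs.append([sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)])
    return outputs


def weighted_biterms(tokens, embeddings):
    """Score each biterm by the cosine similarity of its context-aware vectors."""
    tokens = list(dict.fromkeys(tokens))  # drop repeated words, keep order
    contextual = dict(zip(tokens, self_attention([embeddings[t] for t in tokens])))
    return {b: cosine(contextual[b[0]], contextual[b[1]]) for b in extract_biterms(tokens)}


# Hypothetical toy embeddings; real ones would come from a trained word vector model.
toy_embeddings = {"battery": [1.0, 0.0], "charge": [0.9, 0.1], "screen": [0.0, 1.0]}
scores = weighted_biterms(["battery", "charge", "screen"], toy_embeddings)
for biterm, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(biterm, round(score, 3))
```

In a sketch like this, the relevance scores could serve as weights on biterms so that semantically related word pairs contribute more to a topic than incidental co-occurrences; how the thesis actually integrates these signals into BTM or WESM is not stated in the abstract.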

Keywords/Search Tags:

aspect extraction, topic model, word embedding, attention mechanism

Related items

1	Research On Short-Text Aspect Extraction Based On Topic Model And Attention Mechanism
2	Comparative Analysis Of Patent Literature Based On Deep Topic Model
3	Research On Text Topic Modeling Based On Word Embedding
4	Research On Aspect-Based Sentiment Analysis Based On Attention Networks And Affective Word Embedding
5	Research On Aspect Based Sentiment Analysis In Product Reviews
6	Topic Extraction From Short Texts On Social Media
7	Research Of Aspect-level Sentiment Analysis And Their Application On E-commerce Platforms
8	Research On Aspect-based Sentiment Analysis Based On Attention Mechanism
9	Aspect Based Sentiment Analysis Based On GPT And Attention
10	Topic Modeling Research Based On Word Embedding And Generative Neural Networks