Research On Neural Networks Based Uyghur Word Vectors Representation And Its Application

Posted on:2019-10-11

Degree:Master

Type:Thesis

Country:China

Candidate:L H R L Ai

GTID:2428330566467005

Subject:Computer application technology

Abstract/Summary:

Data representation is a basic task in natural language processing. Traditional data representation refers to the process of manually selecting feature information. In recent years, with the widespread use of deep learning and representation learning, data representation based on neural networks has performed outstandingly in various fields. In most common natural language processing tasks, the bag-of-words model is the main semantic representation method. This method suffers from data sparseness because the available data are incomplete. Early methods are therefore generally suited to only one type of problem, and their applicability is limited. This thesis summarizes and analyzes neural network word representation techniques and applies them to Uyghur morphology induction and to text sentiment classification.

In the study of word vector representation methods, the existing word representation techniques were analyzed both theoretically and experimentally. Theoretically, the foundations of the Skip-gram model and the CBOW model were studied and compared. Experimentally, word representation was analyzed from the perspective of models, corpora, and parameters. After word vectors were generated with the two models, their performance was evaluated on semantic, morphological, and neural network classification tasks. Because the corpus is of limited size, the experimental results of this thesis show that the CBOW model performs better than the Skip-gram model.

The morphology induction method is unsupervised: it requires only training on a corpus throughout the entire process, with no additional morphological or linguistic knowledge. Word vectors are used to evaluate candidate rules according to semantic similarity and morphological difference, semantic association is used to evaluate the morphological rules learned from morphological transformations, and these rules are used to build the morphological analyzer. The morphological analysis rules were evaluated on a hand-built test set of 1,000 morphologically segmented entries, and an accuracy of 81% was obtained.

For neural network sentiment classification, a theoretical analysis and experimental evaluation of the CNN, LSTM, and BiLSTM models were performed. First, in the preprocessing stage, stemming, noise reduction, and dimensionality reduction are applied to the corpus. Second, pre-trained word vectors are introduced so that the model can capture semantic information between words, supplementing and enriching the sentiment feature information contained in the corpus. Experiments show that, on the same sentiment classification corpus, applying morphology induction at the preprocessing stage and initializing the model input with pre-trained word vectors improves the CNN model by 1.8%, the LSTM model by 3.7%, and the BiLSTM model by 3.9%, which reflects the overall effectiveness of the classification method.
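
The idea of using word vectors to evaluate morphological rules by semantic similarity can be illustrated with a minimal Python sketch. It is not the thesis's actual algorithm: the function names `cosine` and `score_suffix_rule`, the toy three-dimensional vectors, the Latin-script tokens, and the acceptance threshold of 0.9 are all assumptions made for illustration. A candidate suffix rule is scored by the mean cosine similarity between each vocabulary word ending in the suffix and the same word with the suffix removed.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def score_suffix_rule(vectors, suffix):
    """Return (mean similarity, pair count) over all (stem, stem + suffix)
    pairs where both forms appear in the vocabulary."""
    sims = []
    for word, vec in vectors.items():
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in vectors:
            sims.append(cosine(vectors[stem], vec))
    if not sims:
        return 0.0, 0
    return sum(sims) / len(sims), len(sims)


# Toy vectors, invented for illustration only (not trained on any corpus).
TOY_VECTORS = {
    "kitab": [0.9, 0.1, 0.0],
    "kitablar": [0.8, 0.2, 0.1],
    "at": [0.1, 0.9, 0.0],
    "atlar": [0.2, 0.8, 0.1],
}

THRESHOLD = 0.9  # hypothetical acceptance threshold

score, pairs = score_suffix_rule(TOY_VECTORS, "lar")
rule_accepted = pairs > 0 and score >= THRESHOLD
print(f"suffix 'lar': {pairs} pairs, accepted={rule_accepted}")
```

In this sketch a rule supported by no word pairs receives a score of 0.0 and is rejected, and a rule whose pairs are only coincidental string matches would tend to score low because the paired vectors are semantically unrelated.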

Keywords/Search Tags:

Word representation, Morphological induction, Neural networks, Sentiment classification

Related items

1	Research On Twitter Sentiment Classification Based On Sentiment Word Embedding And Convolutional Neural Networks
2	The Study Of Sentiment Commonsense Induced Neural Networks For Sentiment Classification
3	Emotion-enhanced Word Representation Model And Its Applications In Sentiment Analysis
4	The Research On Sentiment Classification Based On The Deep Learning Models For Text Data
5	Document-level Sentiment Classification Based On Dynamic Word Embeddings And Hierarchical Neural Networks
6	Research On Feature Representation Based On Sentiment Classification
7	Research On Chinese Word Segmentation And Sentiment Analysis For Micro-blog Text
8	A Study On Hierarchical Text Representation And Sentiment Classification
9	Research On Word Vector-based Sentiment Classification
10	Text Sentiment Classification Based On Attention Mechanism And Fusion Of Neural Networks