Research On Word Sense Disambiguation Based On GCN Model

Posted on: 2024-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: R Liu
Full Text: PDF
GTID: 2558306920454664
Subject: Software engineering
Abstract/Summary:
Word Sense Disambiguation (WSD) is an important research topic in natural language processing, with wide application in text classification, machine translation, and information retrieval. Polysemy, however, reduces the accuracy of these applications, so WSD is needed to let a computer identify the intended meaning of an ambiguous word in its context. To address this problem, this thesis studies disambiguation knowledge and neural networks, and constructs neural-network-based WSD models that improve semantic classification. First, methods for extracting disambiguation features are studied. The sentence containing the ambiguous word and the words, parts of speech, and semantic categories of the adjacent units on its left and right sides are selected as disambiguation features. Second, methods for transforming disambiguation features into feature vectors are studied; the Word2Vec and Doc2Vec tools are adopted for this transformation. Third, a WSD method based on the Graph Convolutional Neural Network (GCN) is proposed. Disambiguation features are used as nodes to construct a WSD graph. Pointwise Mutual Information (PMI) and Term Frequency-Inverse Document Frequency (TF-IDF) are used to compute node and edge weights, which are embedded in the graph, and a softmax classifier performs semantic classification. Finally, the GCN-based WSD model is fused with machine learning classifiers and other neural networks. GCN extracts disambiguation features from the graph, and a Support Vector Machine (SVM) classifier is used to improve disambiguation accuracy. WSD models combining GCN with Long Short-Term Memory (LSTM) and GCN with Bi-directional Long Short-Term Memory (Bi-LSTM) are also proposed, with an SVM classifier applied for WSD. The training corpus of SemEval-2007 Task #5 is used to optimize the disambiguation models, and its test corpus is used to evaluate their performance. Experimental results show that the GCN-based WSD model outperforms WSD models based on other neural networks, that combining GCN with machine learning classifiers and other neural networks further improves disambiguation accuracy, and that the GCN and Bi-LSTM model achieves the best disambiguation performance among the models compared.
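
The abstract does not spell out how the WSD graph is built or how the GCN is configured, so the following is only a minimal sketch of the general idea, under stated assumptions: word nodes are connected by positive PMI computed over sliding windows, and a two-layer GCN with the widely used symmetric normalization and a softmax output is run over the graph. The toy corpus, the window size, the one-hot node features, the hidden size, and the random (untrained) weights are all hypothetical; TF-IDF weighting and the Word2Vec/Doc2Vec features described above are omitted.

```python
import math
from collections import Counter
from itertools import combinations

import numpy as np


def pmi_edges(docs, window=3):
    """Positive PMI for word pairs that co-occur inside a sliding window."""
    word_count, pair_count, n_windows = Counter(), Counter(), 0
    for tokens in docs:
        n_spans = max(1, len(tokens) - window + 1)
        for i in range(n_spans):
            uniq = sorted(set(tokens[i:i + window]))
            n_windows += 1
            word_count.update(uniq)
            pair_count.update(combinations(uniq, 2))
    edges = {}
    for (a, b), n_ab in pair_count.items():
        # PMI = log( p(a, b) / (p(a) * p(b)) ), probabilities counted per window
        pmi = math.log(n_ab * n_windows / (word_count[a] * word_count[b]))
        if pmi > 0:  # keep only positively associated pairs
            edges[(a, b)] = pmi
    return edges


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(a_norm, x, w0, w1):
    """Two-layer GCN: ReLU hidden layer, softmax over classes per node."""
    hidden = np.maximum(a_norm @ x @ w0, 0.0)
    logits = a_norm @ hidden @ w1
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


# Hypothetical toy corpus around the ambiguous word "bank".
docs = [
    ["river", "bank", "water"],
    ["river", "bank", "shore"],
    ["bank", "loan", "money"],
    ["loan", "money", "interest"],
]
edges = pmi_edges(docs, window=3)

vocab = sorted({w for d in docs for w in d})
index = {w: i for i, w in enumerate(vocab)}
adj = np.zeros((len(vocab), len(vocab)))
for (a, b), weight in edges.items():
    adj[index[a], index[b]] = adj[index[b], index[a]] = weight

rng = np.random.default_rng(0)
x = np.eye(len(vocab))                      # one-hot node features (assumption)
w0 = rng.normal(size=(len(vocab), 8))       # untrained weights, illustration only
w1 = rng.normal(size=(8, 2))                # two hypothetical sense classes
probs = gcn_forward(normalize_adjacency(adj), x, w0, w1)
```

Each row of `probs` is a distribution over the two hypothetical sense classes for one node. In a fused setup like the one the abstract describes, the hidden-layer output would presumably be passed to an SVM or a recurrent network instead of the softmax, but that wiring is not specified in the source.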
Keywords/Search Tags:Word Sense Disambiguation, Graph Convolutional Neural Network, disambiguation features, Bi-directional Long Short-Term Memory