
Research On Word Sense Disambiguation Based On Semi-Supervised Ensemble Learning

Posted on: 2022-06-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Z Xiong
Full Text: PDF
GTID: 2518306314468764
Subject: Software engineering
Abstract/Summary:
Word sense disambiguation (WSD) is a fundamental research problem in natural language processing (NLP), and progress in WSD promotes the development of NLP as a whole. WSD is involved in nearly every area of NLP and has been widely used in language recognition, speech recognition, text classification, information retrieval, information extraction, text processing and machine translation. Because annotated corpora are scarce, some scholars have introduced semi-supervised learning into WSD. Thanks to continuous improvements to semi-supervised methods in recent years, WSD can now achieve good disambiguation results under semi-supervised learning. Some scholars have also found that word vectors trained on large-scale corpora contain rich linguistic knowledge, and have added them to WSD models to improve accuracy.

To address the problem of polysemy in natural language, this thesis combines semi-supervised learning with ensemble learning and proposes a semi-supervised ensemble learning WSD algorithm. First, starting from a small labeled corpus and a large unlabeled corpus, a semi-supervised algorithm is used to expand the training corpus. Then, base classification models are combined into an ensemble classifier according to an integration strategy, and the training corpus is used to train and optimize the ensemble classifier. Finally, a test corpus is used to evaluate the performance of the optimized ensemble classifier.

The thesis presents the proposed disambiguation method in three parts. First, it reviews the development of WSD in China and internationally, describes several classical WSD methods, and clarifies the importance of WSD for the development of NLP. Second, it introduces the process of selecting and processing features in the experiments, and describes in detail the three base classifiers used in ensemble learning. Third, it combines semi-supervised learning and ensemble learning for WSD: an improved label propagation algorithm expands the training corpus, the expanded corpus is used to train the ensemble classifier, and a test corpus is used to evaluate the optimized ensemble classifier. Experimental results show that the proposed method outperforms a single classifier.
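The pipeline described above (expand the training corpus by label propagation, then classify with an ensemble) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the improved label propagation algorithm, the three base classifiers, or the integration strategy, so the sketch uses the basic clamped label propagation scheme with a Gaussian kernel, three toy classifiers (1-NN, 3-NN, nearest centroid), and simple majority voting. The 2-D points stand in for context feature vectors of an ambiguous word, and all function names are hypothetical.

```python
import math
from collections import Counter

def propagate_labels(X, y, n_classes, sigma=1.0, iters=50):
    """Basic label propagation; y[i] is a class index, or None if unlabeled."""
    n = len(X)
    # Gaussian-kernel affinities, row-normalised into transition probabilities.
    W = [[math.exp(-math.dist(X[i], X[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    T = [[w / sum(row) for w in row] for row in W]

    def one_hot(label):
        return [1.0 if label == c else 0.0 for c in range(n_classes)]

    # Labeled rows start one-hot; unlabeled rows start uniform.
    F = [one_hot(y[i]) if y[i] is not None else [1.0 / n_classes] * n_classes
         for i in range(n)]
    for _ in range(iters):
        F = [[sum(T[i][k] * F[k][c] for k in range(n)) for c in range(n_classes)]
             for i in range(n)]
        # Clamp the gold labels so they are never overwritten.
        for i in range(n):
            if y[i] is not None:
                F[i] = one_hot(y[i])
    return F

def expand_training_set(X, y, F, threshold=0.9):
    """Keep gold labels; add pseudo-labels whose propagated score is high."""
    new_X, new_y = [], []
    for x, label, scores in zip(X, y, F):
        if label is None:
            if max(scores) < threshold:
                continue  # too uncertain to use as training data
            label = scores.index(max(scores))
        new_X.append(x)
        new_y.append(label)
    return new_X, new_y

def make_knn(train_X, train_y, k):
    def predict(x):
        order = sorted(range(len(train_X)), key=lambda i: math.dist(x, train_X[i]))
        return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]
    return predict

def make_nearest_centroid(train_X, train_y):
    centroids = {}
    for c in set(train_y):
        members = [x for x, lab in zip(train_X, train_y) if lab == c]
        centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    def predict(x):
        return min(centroids, key=lambda c: math.dist(x, centroids[c]))
    return predict

def majority_vote(classifiers, x):
    """Integration strategy assumed here: unweighted majority voting."""
    return Counter(clf(x) for clf in classifiers).most_common(1)[0][0]

# Toy data: two senses, one labeled example each, four unlabeled examples.
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
y = [0, None, None, 1, None, None]

F = propagate_labels(X, y, n_classes=2)
train_X, train_y = expand_training_set(X, y, F)
ensemble = [make_knn(train_X, train_y, 1),
            make_knn(train_X, train_y, 3),
            make_nearest_centroid(train_X, train_y)]
print(train_y)
print(majority_vote(ensemble, (0.3, 0.2)), majority_vote(ensemble, (4.9, 5.0)))
```

In this toy setting the two labeled examples are enough to pseudo-label all four unlabeled points, so each base classifier is trained on six examples rather than two. The confidence threshold in `expand_training_set` is one simple way to keep noisy pseudo-labels out of the training corpus; the thesis may use a different criterion.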
Keywords/Search Tags: word sense disambiguation, semi-supervised learning, ensemble learning, label propagation algorithm, ensemble classifier