Research On Biomedical Word Sense Disambiguation Based On Attention Neural Network Model

Posted on: 2022-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Pang
Full Text: PDF
GTID: 2480306614460024
Subject: Automation Technology
Abstract/Summary:
Biomedical word sense disambiguation (WSD) is widely used in the biomedical field. Machine translation, text mining, and gene name normalization are all valuable research topics in this field. However, biomedical texts are complex and diverse, and the correct meanings of specialized terms cannot be reliably obtained when the texts are processed automatically by machine. Biomedical WSD is therefore the basis of the tasks above. Building on prior work on biomedical WSD and on a variety of neural network models, this thesis combines an attention mechanism, a multi-scale asymmetric convolutional neural network (MACNN), and bidirectional long short-term memory (BiLSTM) to propose a biomedical WSD method based on an attention neural network model. The method achieves high disambiguation accuracy on the MSH WSD dataset and offers a useful reference for future research on biomedical WSD. The specific work is as follows:
(1) A MACNN model is proposed that processes the data with multiple convolution kernels of different sizes. The model extracts more features and introduces the idea of asymmetric convolution. Experiments compare the disambiguation performance of the model under different kernel sizes and numbers of kernels. The results show that the MACNN model performs best with kernel sizes of 2, 3, and 4, where its average disambiguation accuracy reaches 84.54%.
(2) A MACNN-BiLSTM model is proposed that combines the ability of MACNN to extract features with the ability of BiLSTM to capture contextual information. Experiments compare the effects of the number of hidden-layer nodes and the number of layers on the disambiguation performance of the LSTM model. The results show that the LSTM model performs best with 100 hidden-layer nodes. The average disambiguation accuracy of the MACNN-BiLSTM model reaches 85.78%.
(3) An attention neural network model is proposed. The XGBoost and LightGBM algorithms are used in a semi-supervised manner to expand the training corpus. The expanded training corpus is used to optimize the attention neural network model, and a test corpus is used to evaluate its performance. Experimental results show that the average disambiguation accuracy of the attention neural network model trained on the expanded corpus reaches 87.59%.
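The multi-scale idea in item (1) can be illustrated with a minimal pure-Python sketch. It assumes a sentence is given as a list of token embedding vectors, and it slides kernels spanning 2, 3, and 4 tokens over the sentence, applies ReLU, and keeps the maximum response of each kernel. The function names, the random kernel initialisation, and the max-over-time pooling are illustrative assumptions, not details taken from the thesis. The asymmetric factorisation of the kernels is not modelled here, because the abstract does not specify it.

```python
import random


def conv1d_max(embeddings, kernel, bias=0.0):
    """Slide one kernel over the token axis, apply ReLU, then max-pool over time.

    embeddings: list of token vectors (each a list of floats).
    kernel: list of weight vectors, one per token position the kernel spans.
    Returns 0.0 if the sentence is shorter than the kernel.
    """
    width = len(kernel)
    best = 0.0  # ReLU outputs are never negative, so 0.0 is a safe floor
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        activation = bias + sum(
            w * x
            for weights, token in zip(kernel, window)
            for w, x in zip(weights, token)
        )
        best = max(best, max(0.0, activation))
    return best


def multi_scale_features(embeddings, kernels_by_width):
    """Concatenate the pooled response of every kernel of every width."""
    features = []
    for width in sorted(kernels_by_width):
        for kernel in kernels_by_width[width]:
            features.append(conv1d_max(embeddings, kernel))
    return features


def random_kernels(widths, n_filters, dim, seed=0):
    """Hypothetical helper: untrained random kernels, for illustration only."""
    rng = random.Random(seed)
    return {
        w: [
            [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(w)]
            for _ in range(n_filters)
        ]
        for w in widths
    }


# Toy input: 6 tokens with 4-dimensional embeddings (made-up values).
DIM = 4
sentence = [[0.1 * (i + j) for j in range(DIM)] for i in range(6)]
kernels = random_kernels(widths=(2, 3, 4), n_filters=3, dim=DIM)
features = multi_scale_features(sentence, kernels)
```

With three kernels per width, `features` holds one pooled value per kernel across the three widths. In a trained model the kernel weights would be learned, and a feature vector of this kind would feed the later layers such as the BiLSTM described in item (2).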
Keywords/Search Tags: biomedicine, word sense disambiguation, attention neural network, semi-supervised method, disambiguation feature