| The rapid development of biomedicine has produced a vast body of literature. Even within a single, highly specialized research field, most biologists find it difficult to keep up with research progress. Efficient and accurate automatic information extraction methods are therefore urgently needed to support research and development in biomedicine and related fields. As a basic and key subtask of biomedical information extraction, biomedical named entity recognition plays an important role in many downstream extraction tasks. Compared with the general domain, the biomedical domain is characterized by small available datasets, irregular naming conventions, and a large number of out-of-vocabulary words. Most existing biomedical named entity recognition methods focus on acquiring sentence-level semantic information and ignore word-sense information. Against this background, the work of this dissertation is divided into two parts: (1) To address the loss of local word-sense information that occurs when a pre-trained language model splits out-of-vocabulary words into morphemes, a word-sense enhancement method based on local morpheme information is proposed. Existing models usually obtain sentence-level semantic information directly from the morpheme embeddings produced by the pre-trained language model and ignore the local correlations between morphemes within a word. This causes intra-word morpheme labeling and cross-word labeling problems during entity recognition, and it also exacerbates the vanishing gradient problem because the sequence grows longer. The proposed method therefore strengthens the acquisition of morpheme sequence information within each word. First, morpheme embeddings are obtained from the BioBERT pre-trained 
language model. Then the sequence information between the morphemes of each word is obtained and concatenated, after which sentence-level semantic information is acquired. Finally, entities are identified and classified with the word as the smallest unit. Experimental results show that the word-sense enhancement method based on local morpheme information effectively improves the performance of biomedical named entity recognition. (2) To further enhance the word-sense representation, a word-sense enhancement method based on global morpheme information is proposed, which captures the association information between morphemes more effectively. First, morpheme embeddings are obtained from the BioBERT pre-trained language model. Then the sequence information between morphemes is obtained, the dependency information between morphemes is captured with a biaffine attention mechanism, and the two are fused into a word representation before sentence-level semantic information is acquired. Finally, entities are recognized and classified. Experimental results show that, compared with current mainstream models, the word-sense enhancement model based on global morpheme information effectively improves the performance of biomedical named entity recognition. The reported results are 84.94%, 89.09%, 92.14%, and 76.20%. |
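
The idea of treating the word, not the morpheme, as the smallest labeling unit can be illustrated with a minimal sketch. It assumes WordPiece-style tokens in which a `##` prefix marks a continuation piece, and it uses plain averaging as a stand-in for the learned intra-word sequence encoder described in the abstract. The function names, the example tokenization, and the toy vectors are hypothetical and are not taken from the dissertation.

```python
from typing import List


def group_wordpieces(tokens: List[str]) -> List[List[int]]:
    """Group token indices by word; a '##' prefix marks a continuation piece."""
    groups: List[List[int]] = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and groups:
            groups[-1].append(i)   # continue the current word
        else:
            groups.append([i])     # start a new word
    return groups


def pool_words(vectors: List[List[float]],
               groups: List[List[int]]) -> List[List[float]]:
    """Average the morpheme vectors of each word.

    Averaging is an assumption made for illustration; the abstract describes
    a learned sequence model over the morphemes of each word instead.
    """
    pooled: List[List[float]] = []
    for idxs in groups:
        dim = len(vectors[idxs[0]])
        pooled.append(
            [sum(vectors[i][d] for i in idxs) / len(idxs) for d in range(dim)]
        )
    return pooled


# Hypothetical tokenization of "the immunoglobulin gene".
tokens = ["the", "im", "##mun", "##og", "##lobulin", "gene"]
# Toy 2-dimensional embeddings standing in for BioBERT outputs.
vectors = [[float(i), float(2 * i)] for i in range(len(tokens))]

groups = group_wordpieces(tokens)
word_vectors = pool_words(vectors, groups)
```

With one vector per word, a downstream tagger assigns exactly one label to each word, so labels cannot disagree within a word or straddle a word boundary.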