Biomedical Named Entity Recognition (BioNER) aims to automatically extract entity mentions such as Disease, Gene, and Chemical from large volumes of unstructured text, and serves as the basis for downstream Natural Language Processing (NLP) tasks. Current deep-learning-based BioNER methods typically require large amounts of training data, yet annotated BioNER datasets are often difficult to obtain and small in scale, owing to privacy constraints, ethical restrictions, and the high degree of domain expertise required for annotation. To alleviate this problem, and unlike conventional methods that use only token-level information, we propose a method that simultaneously exploits the latent multi-source information in the dataset. Concretely, we design multiple auxiliary tasks to make full use of the coarse-grained information implicit in the dataset itself, thereby improving BioNER performance.

In addition, most BioNER methods do not consider domain knowledge. This thesis therefore presents a preliminary exploration of recasting BioNER as a machine reading comprehension (MRC) problem, introducing prior knowledge through carefully designed question-answer pairs.

Furthermore, current neural BioNER systems treat individual sentences as independent training units and ignore document-level context. Methods that discard document-level contextual information often suffer from the tagging inconsistency problem, i.e., the same entity mention appearing in different sentences is incorrectly assigned different labels. To tackle this problem, we propose a document-level BioNER model with an additional cache module that helps capture inter-sentence information. To update the cache dynamically, we design an auxiliary task that measures the importance of historical encoder states and train it jointly with document-level BioNER.

We use the state-of-the-art pre-trained model BioBERT as the baseline system and conduct experiments on three publicly available BioNER datasets. The results show that the model introducing intra-sentence coarse-grained information achieves absolute F1 improvements of 0.40, 0.37, and 0.91 points on the three datasets, respectively; the model introducing prior knowledge achieves improvements of 0.46, 0.30, and 0.43 points; and the model introducing inter-sentence information achieves improvements of 0.30, 0.53, and 1.08 points.
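
The abstract describes the auxiliary tasks only at a high level. Below is a minimal PyTorch sketch of one plausible multi-task setup, assuming a BioBERT-style encoder, a token-level tagging head, and a hypothetical sentence-level auxiliary head that predicts whether a sentence contains any entity; the head design, the choice of auxiliary task, and the loss weight `aux_weight` are illustrative assumptions rather than the thesis's actual configuration.

```python
import torch.nn as nn

class MultiTaskBioNER(nn.Module):
    def __init__(self, encoder, hidden_size, num_labels, aux_weight=0.1):
        super().__init__()
        self.encoder = encoder                                # e.g., a BioBERT encoder
        self.token_head = nn.Linear(hidden_size, num_labels)  # fine-grained BIO tagging
        self.aux_head = nn.Linear(hidden_size, 2)             # coarse: any entity in sentence?
        self.aux_weight = aux_weight                          # hypothetical loss weight

    def forward(self, input_ids, attention_mask, token_labels, aux_labels):
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        token_logits = self.token_head(hidden)                # (batch, seq_len, num_labels)
        aux_logits = self.aux_head(hidden[:, 0])              # [CLS] state for the sentence task
        ce = nn.CrossEntropyLoss(ignore_index=-100)           # -100 marks padded label positions
        loss = ce(token_logits.view(-1, token_logits.size(-1)), token_labels.view(-1))
        loss = loss + self.aux_weight * ce(aux_logits, aux_labels)
        return loss, token_logits
```

Sharing the encoder across both heads lets the coarse-grained signal regularize the token-level tagger without extra annotation, since sentence-level labels of this kind can be derived from the existing BIO tags.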
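For the MRC reformulation, each entity type is paired with a natural-language question that injects prior knowledge, and the model extracts answer spans from the passage. The sketch below shows only the input construction with the Hugging Face tokenizer; the question templates are hypothetical, and the thesis's actual templates and span-prediction heads are not shown.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-base-cased-v1.1")

# Hypothetical question templates, one per entity type; the actual wording
# (which carries the prior knowledge) may differ in the thesis.
QUESTIONS = {
    "Disease":  "Which disease mentions, such as cancers or syndromes, appear in the text?",
    "Gene":     "Which gene or protein mentions appear in the text?",
    "Chemical": "Which chemical or drug mentions appear in the text?",
}

def build_mrc_input(entity_type, passage, max_length=256):
    """Encode a (question, passage) pair; answer spans are predicted over the passage."""
    return tokenizer(
        QUESTIONS[entity_type],
        passage,
        truncation="only_second",   # truncate only the passage, never the question
        max_length=max_length,
        return_tensors="pt",
    )

enc = build_mrc_input("Disease", "BRCA1 mutations increase the risk of breast cancer.")
```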
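The cache module is described only functionally in the abstract, so the sketch below is an assumption-laden reading: the cache holds a bounded set of historical encoder states per document, an auxiliary importance scorer (trained jointly with the tagging objective, as the abstract indicates) decides which states to keep, and the tagger reads the cache through attention. The capacity, eviction rule, and read mechanism are all illustrative.

```python
import torch
import torch.nn as nn

class HistoryCache(nn.Module):
    """Bounded cache of historical encoder states for one document (illustrative)."""

    def __init__(self, hidden_size, capacity=128):
        super().__init__()
        self.scorer = nn.Linear(hidden_size, 1)   # auxiliary importance head
        self.capacity = capacity
        self.register_buffer("memory", torch.empty(0, hidden_size))

    def update(self, states):
        """Add new sentence states, then evict all but the highest-scoring ones."""
        memory = torch.cat([self.memory, states.detach()], dim=0)
        if memory.size(0) > self.capacity:
            scores = self.scorer(memory).squeeze(-1)
            memory = memory[scores.topk(self.capacity).indices]
        self.memory = memory

    def read(self, query):
        """Attend over cached states to inject inter-sentence context into `query`."""
        if self.memory.size(0) == 0:
            return torch.zeros_like(query)
        attn = torch.softmax(query @ self.memory.t() / query.size(-1) ** 0.5, dim=-1)
        return attn @ self.memory
```

Detaching the stored states keeps the cache from extending the backward graph across sentences; in this reading, the scorer would receive its gradients through the auxiliary importance task rather than through the cached values themselves.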