| Entity recognition and relation extraction in biomedicine aim to identify specific entities in text and determine the relationships among them. Named Entity Recognition (NER) can extract important information, such as affected body parts, symptoms, and therapeutic drugs, from Electronic Medical Records (EMRs), which record the diagnosis and treatment of patients in detail. Relation Extraction (RE) then determines the relationships among these entities, which supports further medical applications; for example, drug-drug interaction extraction helps prevent adverse reactions induced jointly by several drugs. Research on NER and RE in biomedicine is therefore important for constructing biomedical knowledge graphs, supporting physicians in studying and analyzing patients' conditions, and advancing intelligent healthcare. NER can be divided into category recognition and boundary recognition of entities. NER is more difficult in Chinese EMRs than in English ones, because the absence of spaces between Chinese words readily leads to errors in entity boundary recognition. To address this issue, this paper proposes an entity recognition method that combines prior knowledge of entities with a self-attention mechanism. The model is based on BiLSTM-CRF and uses part-of-speech tags, which distinguish entities from non-entities, as prior knowledge to make a preliminary separation of entity boundaries. The self-attention mechanism then increases the weight of associations between characters within the same entity, further improving entity boundary recognition. In experiments on a Chinese EMR NER task, the method increases the F1 value by 12.75% over the baseline model and significantly improves the recognition of entity boundaries. Drug-drug relation extraction determines the relationship between drugs. In this task, negative samples outnumber positive samples, which makes category features hard to extract. To address this issue, the paper uses a pre-trained BERT-based model combined with prior knowledge and a self-attention mechanism to improve the extraction of category features. The main innovations of the proposed method are: (1) To handle the excess of negative samples, rules and templates are used to filter negative samples, which changes the ratio of positive to negative samples from the original 1:5.92 to 1:2.68. (2) To increase the discrimination between samples of different categories, keywords for each class, obtained through the chi-square test and document frequency, serve as prior knowledge for the model, and position encodings of the keywords and the drug pair are applied to further differentiate the samples. (3) An attention mechanism is used to learn the distribution of keywords and other words in a sentence, and the co-occurrence information of keywords and other words is used to improve classification performance. Experimental results on the public Drug-Drug Interaction (DDI) dataset suggest that the proposed method effectively improves relation extraction performance and achieves state-of-the-art (SOTA) results on the dataset. |
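
The class keyword selection described in innovation (2) can be illustrated with a minimal sketch. The abstract does not say how the chi-square test and document frequency are combined, so the sketch assumes one plausible reading: document frequency acts as a minimum threshold, and the chi-square statistic ranks the remaining terms. The function names, the `min_df` and `top_k` parameters, the positive-association filter, and the toy sentences and labels are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square statistic of a 2x2 term/class contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class without the term
    d: documents outside the class without the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_keywords(docs, labels, target, top_k=3, min_df=2):
    """Rank terms for one class (assumed scheme, not the paper's exact one)."""
    in_class = [set(doc) for doc, y in zip(docs, labels) if y == target]
    out_class = [set(doc) for doc, y in zip(docs, labels) if y != target]
    df_in = Counter(term for doc in in_class for term in doc)
    df_out = Counter(term for doc in out_class for term in doc)

    scores = {}
    for term, a in df_in.items():
        b = df_out[term]
        # Document-frequency filter: drop terms seen in too few documents.
        if a + b < min_df:
            continue
        # Assumption: keep only terms relatively more frequent inside the class.
        if a * len(out_class) <= b * len(in_class):
            continue
        scores[term] = chi_square(a, b, len(in_class) - a, len(out_class) - b)

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Toy, hand-written sentences with drug mentions already masked (hypothetical).
TOY_DOCS = [
    "druga should not be used with drugb".split(),
    "druga should be avoided with drugb".split(),
    "druga may increase the effect of drugb".split(),
    "the toxicity of druga may be increased with drugb".split(),
]
TOY_LABELS = ["advise", "advise", "effect", "effect"]

if __name__ == "__main__":
    for label in ("advise", "effect"):
        print(label, select_keywords(TOY_DOCS, TOY_LABELS, label, top_k=1))
```

On real data the selected keywords would then be marked in each sentence, for instance through position encodings relative to the keywords and the drug pair, as the abstract describes; that step is not shown here.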