| Medical records are the basic material of clinical diagnosis and treatment and are used throughout a patient's clinical care. They usually contain a large amount of medical entity data. Identifying valuable medical entities through named entity recognition (NER) is of great significance to medical data mining and provides data support for building medical knowledge graphs. In medical record NER, traditional methods suffer from insufficient feature extraction, long model training time, and reliance on a single neural network model, which lead to weak robustness and low recognition accuracy. To address insufficient feature extraction, this thesis proposes a multi-feature fusion extraction method that combines semantic features, word-order features, and the BERT pre-trained model. Word2vec is introduced to extract the semantic features of the text, and fastText is used to extract its word-order features. Word vectors are obtained from the BERT pre-trained model to handle polysemy. The resulting feature vectors are fused, and the fused features are then re-extracted by a convolutional neural network to obtain more discriminative features. Finally, named entity recognition is carried out by a bidirectional long short-term memory network combined with a conditional random field (BiLSTM-CRF). To address long training time, this thesis proposes a named entity recognition method based on the simple recurrent unit (SRU) neural network. The SRU allows most of the computation to run in parallel on the GPU, which reduces the training time of the NER model. To address the reliance on a single neural network model, this thesis proposes a joint multi-network model (TextCNN-BiSRU-Self-Attention) based on a text convolutional neural network, a bidirectional simple recurrent unit network, and a self-attention mechanism. The SRU network is used to shorten model training time, and the text convolutional neural network is introduced because the traditional BiLSTM model cannot extract local semantic features. The self-attention mechanism focuses model training on relevant data and suppresses the influence of irrelevant data as far as possible, which addresses the inability of the traditional model to attend to relevant data. Finally, the multiple feature vectors are fused to fully extract the local and global features of the relevant data and thereby improve recognition accuracy. The experimental results show that on the Chinese BLUE (cMedQANER) data set, the proposed model achieves significantly higher accuracy, recall, and F1 score, and that its training time is significantly shortened. |
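
The abstract attributes the shorter training time to the parallelism of the simple recurrent unit. The sketch below illustrates that idea in plain Python under stated assumptions: it uses a commonly cited simplified SRU formulation (forget gate, reset gate, and a highway connection, with input and hidden sizes equal), not necessarily the exact variant used in the thesis. The function names `sru_forward` and `bisru` and all parameter names are hypothetical and chosen for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, x):
    # Plain matrix-vector product; W is a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sru_forward(xs, W, Wf, bf, Wr, br):
    """Run a simplified SRU layer over a sequence of equal-length vectors.

    Assumption: input size equals hidden size, so the highway term
    (1 - r) * x_t is well defined.
    """
    d = len(xs[0])
    # Step 1: every projection depends only on x_t, never on the previous
    # state, so all time steps could be computed in one batched matrix
    # multiplication on a GPU. Here they are simply computed up front.
    proj = [(matvec(W, x), matvec(Wf, x), matvec(Wr, x)) for x in xs]

    # Step 2: the only sequential part is a cheap elementwise recurrence.
    c = [0.0] * d
    hs = []
    for x, (xt, fz, rz) in zip(xs, proj):
        f = [sigmoid(a + b) for a, b in zip(fz, bf)]   # forget gate
        r = [sigmoid(a + b) for a, b in zip(rz, br)]   # reset gate
        c = [fi * ci + (1 - fi) * xi for fi, ci, xi in zip(f, c, xt)]
        h = [ri * math.tanh(ci) + (1 - ri) * xi
             for ri, ci, xi in zip(r, c, x)]
        hs.append(h)
    return hs


def bisru(xs, fwd_params, bwd_params):
    """Bidirectional variant: concatenate forward and backward states."""
    fwd = sru_forward(xs, *fwd_params)
    bwd = sru_forward(xs[::-1], *bwd_params)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


# Toy usage with hand-picked parameters (not trained values).
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros_m = [[0.0, 0.0], [0.0, 0.0]]
zeros_v = [0.0, 0.0]
params = (identity, zeros_m, zeros_v, zeros_m, zeros_v)
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states = bisru(tokens, params, params)
```

In contrast, an LSTM multiplies the previous hidden state by a weight matrix at every step, so its matrix products cannot be batched across time in this way. In a pipeline like the one described, each vector in `states` would be passed on to the later layers.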