| With the rapid development of the Internet, the volume of text information on the network keeps growing. As the most common form of expression, text has naturally become a research hotspot. Text classification is an important task in natural language processing: it can effectively identify and integrate information and assign categories automatically. As a basic data mining technology, text classification has been widely used in information retrieval, information management, semantic understanding and other fields, with good results. The medical field is a specialized domain for text classification: medical text may contain more complex and specialized medical vocabulary, which makes the data sparse and high-dimensional. This leads to greater difficulties and challenges for text classification in the medical field. Doctors’ clinical diagnoses of patients, online diagnoses, and the activity records in patients’ medical records are all important medical text resources. With the development of Internet technology and the popularization of electronic medical records, a large number of electronic medical records and related records have accumulated, providing valuable data resources for information mining and classification in the medical field. In recent years, deep learning has been widely used in natural language processing and has achieved breakthrough results. The Recurrent Neural Network (RNN) and the Convolutional Neural Network (CNN) have become two mainstream models in natural language processing. At present, medical text classification tasks fall into different categories, such as the classification of electronic medical record text and the classification of medical documents. Because RNNs perform well in natural language processing, including reading comprehension and relational reasoning, this article mainly studies and compares RNN-related network models. In the text 
classification model, the quality of the context-sensitive representation directly affects subsequent natural language processing tasks, and a rich context-sensitive representation has a correspondingly pronounced effect on text classification. To address the shortcoming that LSTM (Long Short-Term Memory network) can capture information in only one direction, this paper uses BI-LSTM (Bidirectional Long Short-Term Memory network) to obtain the context-sensitive representation of the text, and improves the traditional BI-LSTM. This paper proposes a new neural network model for medical text classification. The model divides medical text into sentences and constructs context-sensitive sentence representations. For the contextual representation of a sentence, the improved BI-LSTM extracts the contextual features of the sentence, and the attention mechanism produces a contextual representation weighted by word importance. That is, the improved BI-LSTM encodes the sentences, extracting features that carry sentence information, and the attention mechanism then decodes the sentences, assigning them different weights. Finally, the medical text category is obtained and output by the Softmax classifier. To verify the effectiveness of the proposed model, this article selects five public data sets (THUCNews, online_shopping_10_cats, Sogou CA, waimai_10k, and simplifyweibo_4_moods) and a patient-input data set crawled by the authors from the "Good Doctor" website for comparative experiments. The experimental results show that the improved LSTM+Attention achieves higher text classification accuracy on the THUCNews data set than three text classification models, HAN, Text-CNN, and Text-RNN; compared with the best baseline, Text-CNN, the classification accuracy rate increased from 92% to 93.13%, the classification accuracy rate on the 
online_shopping_10_cats data set increased from 87.42% to 90.99%, and the classification accuracy rate on the medical data set constructed in this article increased from 84.37% to 91.35%. To check whether the classification model generalizes well, this paper also runs test experiments on a mixed data set. Compared with the best baseline on the mixed data set, HAN, the improved LSTM+Attention raises the classification accuracy rate from 85.41% to 92.54%. These results indicate that, among the text classification models compared, the model used in this article performs best. The main contributions of this paper are as follows: 1. Because medical texts contain many complex words and distinctive sentence structures, this paper proposes a new medical text classification model to address these problems. 2. The traditional BI-LSTM model is improved by adding interactive transmission between the two LSTM logic lines, which enhances the interactivity of the text. 3. The classification model introduces an attention mechanism, which acquires and integrates important information from different parts of the original text through a multi-head mechanism and enhances the interpretability of the original text. 4. A medical triage system based on deep learning was designed and developed... |
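
The attention and Softmax steps described above can be illustrated with a minimal sketch. It is not the model proposed in this paper: it assumes single-head dot-product attention with a hypothetical learned query vector `context`, and it takes the BI-LSTM outputs as given toy numbers instead of computing them. The improved BI-LSTM with interaction between the two LSTM lines and the multi-head mechanism are not reproduced here. All names and values (`attention_pool`, `classify`, `class_weights`, `class_bias`) are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_pool(hidden_states, context):
    """Weight each time step by softmax(h_t . context), then sum.

    hidden_states: T vectors, each standing in for the concatenated
        forward/backward BI-LSTM output at one word (assumed given).
    context: hypothetical learned query vector of the same dimension.
    """
    weights = softmax([dot(h, context) for h in hidden_states])
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return pooled, weights


def classify(pooled, class_weights, class_bias):
    """Linear layer followed by softmax, giving class probabilities."""
    logits = [dot(row, pooled) + b for row, b in zip(class_weights, class_bias)]
    return softmax(logits)


# Toy inputs: three words, four-dimensional hidden states, two classes.
hidden_states = [
    [0.1, 0.0, 0.2, 0.1],
    [0.9, 0.8, 0.7, 0.9],
    [0.0, 0.1, 0.1, 0.0],
]
context = [1.0, 1.0, 1.0, 1.0]
class_weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
class_bias = [0.0, 0.0]

pooled, weights = attention_pool(hidden_states, context)
probs = classify(pooled, class_weights, class_bias)
predicted_class = max(range(len(probs)), key=lambda k: probs[k])
```

In this toy input the second word has the largest dot product with `context`, so it receives the largest attention weight and dominates the pooled sentence representation, which is the sense in which attention yields a representation "weighted by word importance".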