A medical intelligent question answering system is an information retrieval tool for the medical field. Compared with search engines, it is more efficient and accurate; it also helps meet users' needs and reduces the burden on medical workers. A high-quality medical question answering database is the underlying support of such a system, and user question identification and question retrieval are its key links: to answer questions efficiently, the system must represent question texts well and cluster them. Because medical question texts are short and their features are sparse and high-dimensional, existing techniques do not represent them adequately. In addition, traditional text clustering research usually treats text representation and the clustering algorithm as separate steps, which harms clustering accuracy and the generalization ability of the network. Therefore, focusing on question clustering for medical question answering systems, this study uses an online medical question answering community as the data source to represent and cluster medical question texts. The main work of this study is as follows. First, a medical question answering data set is constructed. Taking an online medical question answering community as the data source, crawl rules and the sections to be crawled are defined, and real-world question and answer data are collected. To make the categories of medical questions universal rather than limited to a particular disease, the question texts were divided, through literature research and expert consultation, into 12 categories, including disease awareness, etiology, prevention, examination, diagnosis, treatment, daily care, hospital and medical insurance, and prognosis and survival, and the contents of
the related categories were defined. Second, a medical question text clustering model based on multi-feature fusion is proposed. The word frequency features, lexical semantic features, and cross-text topic features of medical question texts are fused to build a text representation suited to clustering such texts, and the framework and calculation formula for multi-feature fusion are designed. This reduces the influence of sparse, high-dimensional feature data and improves clustering accuracy. Third, a dual-objective self-supervised question text clustering model based on multi-feature fusion is proposed. To further optimize the question text representation and achieve end-to-end text clustering, word frequency and lexical semantics are combined into weighted word vectors that serve as input to the deep learning model, and a self-attention mechanism is introduced to learn semantic information between words. Meanwhile, the document-topic distribution is used to adjust the representation with cross-document information. The clustering objective function and the cross-document topic objective function are combined into a joint loss function, achieving end-to-end dual-objective self-supervision and improving clustering accuracy and generalization ability. Fourth, an intelligent question answering system for the medical field is designed based on the clustering results. Building on the two question text clustering methods above, the system includes data processing, data recommendation, intelligent question answering, and data management functions. In summary, the medical question answering data set constructed in this study is of high quality, and its category classification criteria have a degree of universality. The two proposed clustering methods for medical question texts improve clustering accuracy and other indicators and can serve as a reference for text clustering research. The medical
intelligent question answering system based on the clustering results provides data processing, data recommendation, intelligent question answering, and data management functions.
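
As an illustration of the multi-feature fusion idea summarized above, the following is a minimal sketch, not the fusion formula designed in this study (which the abstract does not give). It assumes that the word frequency feature is a smoothed TF-IDF weight, that lexical semantics come from pre-trained word embeddings, and that the document-topic distribution is supplied by a separate topic model such as LDA. The function names, the toy embeddings, the topic distribution, and the mixing weight `alpha` are all hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return weights


def weighted_doc_vector(doc_weights, embeddings, dim):
    """Average word embeddings, weighting each word by its tf-idf score."""
    vec = [0.0] * dim
    total = 0.0
    for term, weight in doc_weights.items():
        if term not in embeddings:
            continue  # skip out-of-vocabulary words
        for i, value in enumerate(embeddings[term]):
            vec[i] += weight * value
        total += weight
    return [v / total for v in vec] if total > 0 else vec


def fuse(doc_vec, topic_dist, alpha=0.5):
    """Concatenate the semantic vector and the topic distribution,
    scaled by alpha and (1 - alpha) respectively (hypothetical scheme)."""
    return [alpha * v for v in doc_vec] + [(1 - alpha) * t for t in topic_dist]


# Toy data: two tokenized questions, 2-dimensional embeddings, 2 topics.
docs = [["fever", "cough", "treatment"], ["fever", "diagnosis"]]
embeddings = {
    "fever": [1.0, 0.0],
    "cough": [0.0, 1.0],
    "treatment": [1.0, 1.0],
    "diagnosis": [0.5, 0.5],
}
topic_dists = [[0.7, 0.3], [0.2, 0.8]]  # assumed output of a topic model

weights = tfidf_weights(docs)
fused = [
    fuse(weighted_doc_vector(w, embeddings, dim=2), topics)
    for w, topics in zip(weights, topic_dists)
]
```

The fused vectors could then be passed to any standard clustering algorithm, such as k-means. Concatenation with a single mixing weight is only one possible design; it keeps the semantic and topic features separable so that their relative influence can be tuned.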