| In the medical and health field, professional pathological knowledge and patients' treatment records are mostly kept as unstructured text. The data formats differ, the structure varies, and the forms of expression are diverse, which poses great challenges for mining medical text data. A knowledge graph is a kind of directed attributed graph that stores the relationship between two entities as a triple. It represents knowledge in a form that computers can recognize and process more easily, and it is increasingly applied in recommender systems, question answering systems, e-commerce, and other major applications such as auxiliary medical services. Combining knowledge graph technology with health care is therefore of great practical value. At present, most open knowledge graphs are large-scale encyclopedic knowledge bases such as Baidu Baike and Wikipedia, and high-quality Chinese-language knowledge graphs for the health and medical field are lacking. To address these problems, this paper first uses crawler technology to collect a massive corpus of health and medical knowledge from online diagnosis and treatment websites, and then uses deep learning to train models and extract medical knowledge. The core tasks are named entity recognition, relationship extraction, and knowledge fusion. The acquired medical knowledge is saved as triples in the graph database Neo4j to build a complete medical knowledge graph from scratch. Finally, based on this work, the Spark big data processing framework is used to design and develop a real-time medical question answering system based on the knowledge graph. The system can answer basic medical questions, and the experimental results show high accuracy. The work accomplished in this paper is as follows: (1) A web crawler system is designed and developed to capture data on common diseases and their symptoms, clinical drug data, and doctor-patient dialogue data from patient consultations on online medical platforms such as micro-doctor, good doctor online, and doctor-patient website. The raw data is then cleaned by deleting blank and redundant records. Next, the HanLP word segmentation tool is used to segment the text into Chinese words and remove stop words, and finally Word2Vec is used to construct word vectors. (2) A BiLSTM-CRF model is used for named entity recognition, covering five types of medical entity: treatment method, body part, disease symptom, medical examination, and disease name. Experimental results show that the algorithm achieves good accuracy. (3) A BiLSTM-ATT model is used to extract the relationships among the five identified entity types, including the symptoms of a disease, the location of a disease, the medical examinations and treatment methods for a disease, and other relationships, in ten categories in total. (4) The acquired knowledge is integrated, with redundant or conflicting entries eliminated, and reprocessed, and the merged medical knowledge is saved in the Neo4j database to complete the construction of the knowledge graph for the medical and health field. The graph contains more than 44,000 entities of various types and nearly 300,000 relationships. Finally, on the basis of this graph, a simple real-time online medical question answering system is built. |
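
The triple-based storage and lookup described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the example triples, the relation names (`has_symptom`, `treated_by`, `affects_part`), the `Entity` node label, and all function names are assumptions made for illustration. The sketch merges duplicate triples, renders each triple as a Cypher `MERGE` statement of the kind that could be sent to Neo4j, and answers a simple "entity + relation" question from an in-memory index.

```python
from collections import defaultdict

# Hypothetical example triples (head entity, relation, tail entity).
# These are illustrative only and are not data from the paper.
TRIPLES = [
    ("influenza", "has_symptom", "fever"),
    ("influenza", "has_symptom", "cough"),
    ("influenza", "has_symptom", "fever"),  # exact duplicate, dropped when merging
    ("influenza", "treated_by", "rest"),
    ("gastritis", "affects_part", "stomach"),
]


def merge_triples(triples):
    """Drop exact duplicate triples while preserving first-seen order."""
    seen = set()
    merged = []
    for triple in triples:
        if triple not in seen:
            seen.add(triple)
            merged.append(triple)
    return merged


def to_cypher(head, relation, tail):
    """Render one triple as a Cypher MERGE statement.

    The `Entity` label and `name` property are assumptions. Real code
    should pass values as query parameters instead of interpolating them.
    """
    return (
        f"MERGE (h:Entity {{name: '{head}'}}) "
        f"MERGE (t:Entity {{name: '{tail}'}}) "
        f"MERGE (h)-[:{relation.upper()}]->(t)"
    )


def build_index(triples):
    """Index tail entities by (head, relation) for direct lookup."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[(head, relation)].append(tail)
    return index


def answer(index, entity, relation):
    """Return all tail entities linked to `entity` by `relation`."""
    return index.get((entity, relation), [])


merged = merge_triples(TRIPLES)
index = build_index(merged)
for triple in merged:
    print(to_cypher(*triple))
print(answer(index, "influenza", "has_symptom"))
```

In a full system the question would first be parsed to identify the entity and the intended relation. Here both are passed in directly to keep the sketch short.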