| With the continued development of medical informatization, the volume of medical data has exploded, and the transmission, storage, processing, and visualization of medical big data all face considerable challenges. The information systems of medical institutions are independent of one another. Under the guidance of various policies, electronic medical record systems, regional (or cross-domain) electronic health record platforms, and regional data centers have been established. As the amount of patient data grows, doctors who query patient history records encounter problems such as cumbersome operation, delayed data presentation, and low efficiency. The transmission and storage of medical data within and between medical institutions has become a weak point of current medical information systems. Since the concept of smart healthcare was put forward, the restriction that medical data be transmitted, stored, and used only locally within a medical institution has been lifted, enabling regional sharing of medical data and placing higher technical demands on the transmission, storage, and use of that data. Against the background of the rapid development of smart healthcare, and in order to address the heterogeneous collection of medical data, the low transmission efficiency of mixed medical data, bottlenecks in the storage performance of massive medical data, and the difficulty of extracting and visualizing key information, this paper proposes the “intelligent 3D visualized digital patient platform”. By collecting and processing the historical and real-time data of a single medical institution (or of a region), it establishes a visualization platform aimed at improving diagnostic efficiency. The main contents of this paper are as follows. 1. The design of the multi-source parallel acquisition subsystem. The platform collects all kinds of medical data from multiple hospitals or medical institutions into the intelligent 3D visualized digital patient platform to realize 
medical data cleaning and the unification of differing data formats, as well as the subsequent transmission and storage of medical data. 2. The design of the container management platform. A Kubernetes-based container management platform is used to deploy the Docker container engine and to run the Apache Flink distributed big data processing cluster and the PostgreSQL database cluster in containers. The adoption of container technology makes applications lightweight and portable, which facilitates the later expansion and deployment of the data access and processing clusters. 3. The design of the big data storage framework and big data processing. The platform establishes a massive medical image storage module based on the HDFS distributed architecture and a storage subsystem module that stores massive medical text data in PostgreSQL databases. The storage subsystem and the acquisition subsystem are connected by the Apache Flink distributed big data processing cluster. Apache Flink supports both batch processing and stream processing, so that the hospital’s historical data (processed in batches) and real-time data (processed as streams) can both be transmitted to the intelligent 3D visualized digital patient platform. The storage subsystem integrates the patient’s text and image medical information, eliminating the need to access multiple medical information systems within the hospital and the regional medical system. 4. The design of the key information extraction algorithm. The platform uses natural language processing technology and statistical methods to implement key information extraction algorithms for two kinds of text reports, those of the radiology information system and the pathology information system. The algorithms extract the key medical information from these two kinds of unstructured text reports and generate key-value pairs in JSON format for transmission and storage. In addition, extraction adapters are designed for different types of structured 
electronic medical records to extract the key information in their nodes and update a JSON-format file of key-value pairs with it. This technique converts the original unstructured text reports into structured report data, so that the report data can be presented more clearly to patients and doctors. It also helps the intelligent medical platform learn models of patient data. Building on the laboratory's existing visualized digital patient and intelligent 3D visualized digital patient display system, this paper studies the architecture of a platform for multi-source data collection, storage, and information processing. Text data is unified during the multi-source parallel collection of medical data. The platform implements a big data processing framework that transmits batch historical data, streaming real-time data from medical institutions, and the extracted key medical information to the platform's storage subsystem with low latency. The platform uses a hybrid storage subsystem to improve the storage efficiency and capacity for massive medical data. The platform adopts the Docker container engine and the Kubernetes container management platform to improve the operating efficiency of the entire platform. The platform combines natural language processing technology and statistical methods so that the key information in the two kinds of text reports, from the radiology information system and the pathology information system, is extracted, transmitted, stored, and integrated with the intelligent 3D visualized digital patient display system. |
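
To make the key-value output described in item 4 concrete, the following is a minimal Python sketch of turning an unstructured text report into JSON key-value pairs. It is not the extraction algorithm of this paper, which combines natural language processing and statistical methods; a simple rule-based split on section headings is substituted here for illustration. The section labels (`Examination`, `Findings`, `Impression`) and the function names `extract_key_values` and `to_json` are hypothetical and assume reports whose sections start with a labelled line.

```python
import json
import re

# Hypothetical section headings; real report layouts vary by institution.
SECTION_LABELS = ("Examination", "Findings", "Impression")

# Matches a known label at the start of a line, followed by a colon.
_LABEL_RE = re.compile(
    r"^(%s)\s*:\s*" % "|".join(SECTION_LABELS), re.MULTILINE
)


def extract_key_values(report: str) -> dict:
    """Split a free-text report into {section label: section text} pairs."""
    matches = list(_LABEL_RE.finditer(report))
    result = {}
    for i, match in enumerate(matches):
        # A section runs until the next label, or to the end of the report.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(report)
        # Collapse whitespace so a multi-line section becomes a single value.
        result[match.group(1)] = " ".join(report[match.end():end].split())
    return result


def to_json(report: str) -> str:
    """Serialize the extracted pairs as a JSON object for transmission or storage."""
    return json.dumps(extract_key_values(report), ensure_ascii=False)
```

Calling `to_json` on a report with the three labelled sections yields one JSON object with one key per section, which could then be transmitted or stored like any other key-value record. Text that appears before the first recognized label is ignored in this sketch.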