Breast ultrasound examination has become an important means of diagnosing breast diseases and determining the nature of breast tumors, owing to its high accuracy, low cost, and multi-faceted observation. Breast ultrasound reports record the patient’s imaging findings and the doctor’s clinical diagnosis; they contain a wealth of breast-related medical knowledge and have high research value. To describe a patient’s condition accurately, doctors usually write the imaging findings and clinical diagnosis in natural language. However, this unstructured text is narrative, lacks structure, and describes data inconsistently, which makes it ill-suited to automatic analysis and data mining by computer and has hindered the development of medical big data. It is therefore necessary to structure breast ultrasound reports so that they can serve larger-scale medical data research.

To solve these problems, this paper proposes a new domain-ontology-driven structuring method for breast ultrasound examination reports. The method builds on traditional structuring techniques and takes into account the textual characteristics of breast ultrasound reports: it first constructs a breast ultrasound domain ontology and then structures the reports on the basis of that ontology. The work done in this paper is as follows:

1) The breast ultrasound domain ontology is constructed. This paper adopts the idea of automatic ontology construction, combined with prior knowledge of pathology and anatomy, to obtain the basic framework of the breast ultrasound domain ontology. Data preprocessing is then performed on the breast ultrasound text, including Chinese word segmentation, synonym replacement, and text segmentation; each segmented description block serves as a processing unit. Named entity recognition and entity relation extraction algorithms are used to obtain entity-relation triples. Finally, these contents are added to the basic
framework to form the breast ultrasound domain ontology.

2) A domain-ontology-driven structuring method for breast ultrasound examination reports is proposed. First, the method preprocesses the breast ultrasound text to obtain multiple description blocks and their corresponding branch subtrees of the domain ontology. Then, with the leaf nodes and paths of the domain ontology as the starting point, a branch-subtree path acquisition algorithm and a semantic-subtree generation algorithm are used to match node information in the domain ontology, yielding the breast ultrasound semantic subtree that corresponds to the report. Finally, the semantic subtree, represented in XML, is transformed into structured breast ultrasound data stored in relational tables.

3) A mobile system for structuring ultrasound reports is implemented. Users can upload a breast ultrasound report by taking a picture or entering text and obtain the structured result, which helps them understand their health condition; the system also provides patients with more efficient and convenient medical services.

The breast ultrasound reports used in this paper come from one of the top three hospitals and serve as the experimental data set for verifying the effectiveness and usability of the proposed method. Experiments show that the method achieves the expected goals and lays a foundation for subsequent research.
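
As an illustration of the final step of the structuring method, the conversion of an XML semantic subtree into relational rows might look roughly like the sketch below. It is a minimal example under stated assumptions, not the implementation used in this paper: the XML element names, the `finding` table, and the helper functions `flatten_leaves` and `store_report` are all hypothetical, and Python's standard `xml.etree.ElementTree` and `sqlite3` modules stand in for whatever parser and database the actual system uses.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical semantic subtree for one report. The element names are
# illustrative assumptions, not the ontology schema from the paper.
SAMPLE_XML = """
<report id="r001">
  <left_breast>
    <mass>
      <shape>oval</shape>
      <margin>circumscribed</margin>
    </mass>
  </left_breast>
  <diagnosis>
    <birads>3</birads>
  </diagnosis>
</report>
"""


def flatten_leaves(node, prefix=""):
    """Yield (path, value) for every leaf element at or below node."""
    path = f"{prefix}/{node.tag}" if prefix else node.tag
    children = list(node)
    if not children:
        # A leaf carries the extracted value; its path records the ancestors.
        yield path, (node.text or "").strip()
        return
    for child in children:
        yield from flatten_leaves(child, path)


def store_report(conn, xml_text):
    """Insert one row per leaf of the XML subtree and return the row count."""
    root = ET.fromstring(xml_text)
    report_id = root.get("id")
    rows = [
        (report_id, path, value)
        for child in root
        for path, value in flatten_leaves(child)
    ]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS finding "
        "(report_id TEXT, path TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO finding VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
store_report(conn, SAMPLE_XML)
for row in conn.execute(
    "SELECT report_id, path, value FROM finding ORDER BY path"
):
    print(row)
```

Storing each leaf together with its full path keeps the hierarchy of the semantic subtree recoverable from a flat table; a real schema would more likely use one column per ontology attribute.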