
Research On Medical Data Processing Technology Based On Hadoop

Posted on: 2018-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: S K Liang
Full Text: PDF
GTID: 2428330542972042
Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of computer technology and its wide use in the medical field, the digitization of medical information continues to accelerate and medical data is growing rapidly. Massive volumes of medical data and complex data types have put tremendous pressure on the medical industry to store and process them. Cloud computing offers a new approach to handling massive medical data. The open-source framework Hadoop is an important part of cloud computing technology and provides a platform for distributed storage and computation over massive data. This thesis studies the problems that arise in processing and analyzing massive medical data. It first studies HDFS and MapReduce, the core components of the Hadoop platform. Because Hadoop runs into memory bottlenecks and low file retrieval efficiency when storing large numbers of small files, the thesis proposes a storage method suited to large numbers of small medical files. A file preprocessing module merges many small files into a sequence file and writes the corresponding information into a composite index, which reduces the number of files in the cluster and improves the cluster's memory efficiency. The composite index also improves file retrieval speed while protecting user information and accurately locating the requested file. Experiments show that this method effectively solves the problems Hadoop has when storing small files. Second, the thesis studies the Apriori association rule algorithm and analyzes the relationships among medical data. To address the problems of large scale and long scanning time, the algorithm is improved by introducing an interest measure. The algorithm is then ported to the Hadoop platform, yielding an improved Apriori medical data mining algorithm based on MapReduce and interest that adapts well to highly concurrent environments. Experiments show that the ported Apriori algorithm scales well in parallel. Finally, building on the small medical file storage and medical data analysis techniques, the thesis builds a Hadoop-based medical data storage and analysis system and describes the functions of its main modules. The construction of the system environment is described in detail, which underpins the implementation of the system's functions. The final system provides user interfaces for file upload, file search, and diseases and their complications. These interfaces are used to verify the system's functions, and the results demonstrate the reliability of the related functions.
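The small-file idea in the abstract (merge many small files into one container and record where each one lives in an index) can be illustrated with a minimal in-memory sketch. This is not the thesis's implementation: a plain byte string stands in for the Hadoop sequence file, a dictionary of (offset, length) pairs stands in for the composite index, whose actual structure is not described here, and the names `pack_small_files` and `read_file` are hypothetical.

```python
import io


def pack_small_files(files):
    """Merge small files into one blob and build a lookup index.

    files: dict mapping file name -> bytes content (insertion order is kept).
    Returns (blob, index), where index maps name -> (offset, length).
    The blob is an illustrative stand-in for a sequence file.
    """
    blob = io.BytesIO()
    index = {}
    for name, data in files.items():
        offset = blob.tell()          # where this file starts in the blob
        blob.write(data)
        index[name] = (offset, len(data))
    return blob.getvalue(), index


def read_file(blob, index, name):
    """Retrieve one file's bytes using the index, without scanning the blob."""
    offset, length = index[name]
    return blob[offset:offset + length]


if __name__ == "__main__":
    # Hypothetical sample records; the contents are made up for illustration.
    files = {"report_001.txt": b"blood test", "scan_002.dat": b"\x00\x01\x02"}
    blob, index = pack_small_files(files)
    print(index)
    print(read_file(blob, index, "scan_002.dat"))
```

The point of the design is that the storage layer tracks one large object instead of many small ones, while a lookup by name still costs a single index access plus one slice.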
Keywords/Search Tags: medical big data, cloud computing, Hadoop, small file storage, Apriori algorithm