| In today's society, health care receives increasing attention, especially the health of mothers and infants, which is of great concern to a family's happiness. Intrahepatic cholestasis of pregnancy (ICP) is a common disease in the middle and late stages of pregnancy. Its incidence can reach 12%, and it may lead to a range of adverse pregnancy outcomes and even threaten the life of the fetus. Since the pathogenic mechanism of ICP is still unclear, early detection, early prevention and scientific treatment are currently the ideal diagnosis and treatment strategy. On the basis of clinical medical research, this paper uses data mining technology to study an early prediction model for three key biochemical indicators of ICP, so as to assist doctors in predicting the disease and achieving early diagnosis of ICP. The early prediction research on ICP is divided into two stages: clustering and prediction. First, in order to make the clustering analysis more accurate, an improved distance measurement algorithm is proposed to measure the similarity of multiple time series, so that ICP data with similar changes can be clustered, which helps improve the accuracy of the prediction model. Then, an ICP early prediction model is constructed based on an improved fusion algorithm, which predicts mid-stage and late-stage indicators from early-pregnancy biochemical indicators. Finally, an early prediction system for ICP is designed and implemented. The specific research is as follows: (1) Time series similarity measurement based on the dynamic time warping (DTW) distance has high computational complexity, and, because DTW pursues the minimum distance, excessive one-to-many mapping may occur during matching, which seriously distorts the sequence and reduces the measurement accuracy. To address these problems, a dynamic time warping distance, SF-AWDTW, based on segmented features
and adaptive weighting is proposed. First, the time series is processed by multidimensional segmentation, and each segment is then represented by the slope, maximum value and time span of its fitted line segment. This representation not only retains the correlation between variables but also expresses the shape and range characteristics of the sequence, thereby achieving dimensionality reduction. To address the over-matching phenomenon of DTW, each data point is given a cost weight, and during the search for the minimum-cost path the weight is adaptively adjusted according to how often the point has been used, so as to limit the number of sequence points matched along the warping path. Finally, comparison and analysis of extensive experimental results verify that the SF-AWDTW distance measure achieves high measurement accuracy on a variety of data sets, which shows that the algorithm effectively mitigates the loss of measurement accuracy that DTW suffers from over-matching. (2) A single long short-term memory (LSTM) prediction model built directly on the time series has limited prediction accuracy, and the cumulative error in the time series prediction process affects the prediction results. To address these problems, a dual-path LSTM time series prediction model with ARIMA correction is proposed. First, in order to reflect the trend characteristics of the ICP time series in the prediction model, the trend sequence of each series is extracted; LSTM models are built on both the sequence of trend change features and the sequence of indicator values, and the two predicted values are combined to obtain a preliminary prediction result. To further improve the prediction accuracy, an ARIMA prediction model is constructed on the residual sequence of the preliminary prediction results, in order to correct the preliminary predictions of the
dual-path LSTM model. Finally, experiments confirm that incorporating the trend change feature sequence improves the prediction accuracy of the dual-path LSTM, and that correcting its output with the residuals predicted by the ARIMA model improves the prediction results further. In addition, a comparison of the fitted curves of several fusion models on ICP prediction shows that the predictions of the proposed model are closer to the true values. (3) An ICP early prediction system is designed and implemented. Through detailed requirements analysis and system design, the main functional modules, such as medical record management and disease prediction, have been developed, and the dual-path LSTM-ARIMA prediction model has been applied in the prediction system for intrahepatic cholestasis of pregnancy, providing a better medical experience for patients. |
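
The over-matching problem described in (1) can be illustrated with a minimal sketch. It is not the SF-AWDTW algorithm itself, whose weighting formula is not given here. It assumes a hypothetical rule in which a step that reuses a point has its local cost scaled by `1 + penalty * reuse_count`; the function name `dtw` and the parameter `penalty` are illustrative choices. The reuse count follows only the best path found so far, so the penalized value is a heuristic rather than a guaranteed optimum. With `penalty=0` the function reduces to classic DTW.

```python
def dtw(x, y, penalty=0.0):
    """DTW distance between two 1-D numeric sequences.

    penalty == 0 gives classic DTW. penalty > 0 scales the cost of every
    step that reuses a point by (1 + penalty * reuse_count), which is an
    assumed weighting chosen for illustration only.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    # reuse_x[i][j]: consecutive extra matches of x[i-1] on the best path
    # ending at cell (i, j); reuse_y is the same for y[j-1].
    reuse_x = [[0] * (m + 1) for _ in range(n + 1)]
    reuse_y = [[0] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Diagonal step: both points are new, so no penalty applies.
            diag = (cost[i - 1][j - 1] + d, 0, 0)
            # Horizontal step: x[i-1] is matched to one more point of y.
            rx = reuse_x[i][j - 1] + 1
            horiz = (cost[i][j - 1] + d * (1 + penalty * rx), rx, 0)
            # Vertical step: y[j-1] is matched to one more point of x.
            ry = reuse_y[i - 1][j] + 1
            vert = (cost[i - 1][j] + d * (1 + penalty * ry), 0, ry)
            # Keep the cheapest candidate together with its reuse counts.
            cost[i][j], reuse_x[i][j], reuse_y[i][j] = min(diag, horiz, vert)
    return cost[n][m]


a, b = [0, 5], [0, 1, 5]
print(dtw(a, b))               # classic DTW
print(dtw(a, b, penalty=1.0))  # reuse of a[0] now costs more
```

In this small example the only cheap alignment matches `a[0]` to both `b[0]` and `b[1]`, so raising `penalty` raises the distance, which is the behaviour an adaptive weight is meant to produce when one point absorbs too many matches.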