
Research On Classification Method Of Time Series With Massive Missing Data

Posted on: 2021-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: Q T Li
GTID: 2370330611466953
Subject: Computer Science and Technology
Abstract/Summary:
After admission to hospital, patients are monitored for various physiological indicators under the guidance of doctors so that their physical condition is better understood. These indicators form a large volume of clinical time series, which can be used to predict in-hospital mortality, illness, and length of hospital stay. The missing rate of clinical time series is much higher than that of ordinary time series, because doctors choose to monitor only the variables related to a patient's condition, and different variables are monitored at different frequencies.

Few models address the clinical time series classification problem, but two novel models have appeared in recent years: GRU-D and channel-wise LSTM. Their strong performance makes them applicable to many problems, yet their shortcomings cannot be ignored. GRU-D does not consider the missing rate of the variables, and it adds the real value and the padding value together during training, so the model cannot directly sense changes in the real value and cannot receive each variable and its missing marker at the same time. Channel-wise LSTM uses an independent LSTM for each variable, which makes the model computationally expensive.

Inspired by GRU-D and channel-wise LSTM, this paper proposes a model named variable-sensitive GRU (VS-GRU) that can sense variables independently without incurring more computation. It has three innovations. First, because the missing rate of a variable is closely related to its importance to the evaluation of the patient, VS-GRU considers not only the missing marker of each variable but also the missing rate of each variable separately. Second, VS-GRU uses a simple architecture that allows the GRU to mine different variables independently and at the same time, so variables with a high missing rate do not interfere with information extraction from variables with a low missing rate. Moreover, the model can be more sensitive to changes in real observations and can sense both a variable and its missing marker at the same time. To handle the more complicated multi-label learning problems, an information-integration-enhanced version of VS-GRU, named VS-GRU-i, is also proposed. It consists of two GRU layers: the first is VS-GRU, which extracts information from each variable separately, and the second is another GRU that integrates the information extracted by the first. Third, both VS-GRU and VS-GRU-i adopt a deep supervision framework, supervising the output at each time step to reduce the probability of errors during training.

Experiments are conducted on two real clinical datasets, MIMIC-III and PhysioNet. Across the four classification tasks, VS-GRU achieves the best performance on the single-label tasks, while VS-GRU-i performs best on the multi-label tasks.
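
The three per-variable signals the abstract refers to (the padded value, the missing marker, and the missing rate) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names are hypothetical, forward-filling with a default of 0.0 is an assumed padding strategy, and how VS-GRU actually feeds these signals into the GRU is not stated in the abstract.

```python
import math

NAN = float("nan")  # marks "variable not observed at this time step"


def missing_mask(series):
    """Return 1 where a value was observed and 0 where it is missing.

    `series` is a list of time steps; each step is a list with one
    value per variable.
    """
    return [[0 if math.isnan(v) else 1 for v in step] for step in series]


def missing_rate(mask):
    """Fraction of time steps at which each variable is missing."""
    n_steps = len(mask)
    n_vars = len(mask[0])
    return [1 - sum(step[j] for step in mask) / n_steps for j in range(n_vars)]


def forward_fill(series, default=0.0):
    """Pad each missing value with the last observation of that variable.

    Assumption: `default` is used until the variable is first observed.
    """
    last = [default] * len(series[0])
    filled = []
    for step in series:
        last = [prev if math.isnan(v) else v for v, prev in zip(step, last)]
        filled.append(list(last))
    return filled


def per_variable_inputs(series):
    """Build one (value, marker, rate) triple per variable per time step.

    Keeping the triple separate per variable means the padded value is
    never summed with the real value, and the marker travels with it.
    """
    mask = missing_mask(series)
    rate = missing_rate(mask)
    filled = forward_fill(series)
    return [
        [(filled[t][j], mask[t][j], rate[j]) for j in range(len(rate))]
        for t in range(len(series))
    ]


# Toy example: two variables over four time steps, with made-up values.
toy = [[80.0, NAN], [NAN, NAN], [82.0, 36.6], [NAN, NAN]]
for step in per_variable_inputs(toy):
    print(step)
```

In this toy series the first variable is missing at two of four steps and the second at three of four, so their missing rates differ even though both are padded the same way; that difference is the extra signal the abstract says GRU-D ignores.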
Keywords/Search Tags: multivariate time series classification, missing values, electronic health records, deep learning