
Research On Nosocomial Infection Prediction And Unbalanced Classification Based On Active Learning And Generative Adversarial Networks

Posted on: 2020-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
GTID: 2404330599454650
Subject: Computer Science and Technology
Abstract/Summary:
Nosocomial infections are a widespread concern in medicine because they affect both the health of inpatients and the quality of hospital care. In recent years, the application of artificial intelligence and machine learning to hospital infection has attracted attention from both academia and industry. Traditional methods for predicting and classifying nosocomial infections assume that the training data set is roughly balanced, which does not match reality. Like medical diagnosis, credit card fraud detection, and text classification, the prediction of nosocomial infection, the topic of this paper, suffers from class imbalance. Class imbalance means that one class in a data set has far more samples than the others. In our setting, inpatients without infection greatly outnumber inpatients with infection, and this imbalance degrades the classification performance of the model, in severe cases sharply.

This paper reduces the effect of class imbalance on the nosocomial infection prediction model in two ways. First, it uses the generator of a generative adversarial network to produce a large number of minority-class samples, which compensate for the imbalance in the training data. Because the amount of information carried by the generated data also affects the model, an active learning mechanism is used as well. Combining the two, this paper proposes an active learning framework, named activeG, for the data imbalance problem in the nosocomial infection setting. Within the active learning selection step, activeG introduces a batch query sampling strategy over the generated data. The sampling criterion considers not only entropy from information theory but also the similarity between the generated data and the real minority samples. Together, the two criteria select suitable generated data for the training set and exclude noisy data points. Unlike traditional approaches to data imbalance, this method adds diverse, informative samples to the minority class in the training set, and it can also substitute generated data for sensitive real minority data at the data level, thereby protecting data privacy. The generated samples are structurally very similar to the real minority-class samples and help adjust the decision boundary of the classification model.

Second, to improve the quality of the generated data, this paper uses active learning to select better training data for the generator, which in turn improves the quality of the generated samples. Building on the activeG framework, a dual active learning scheme, ALGAAL, is proposed. It improves the quality of the generator's input training data through uncertainty sampling based on information entropy, which improves the generator and hence the generated data.

The research on the imbalance problem in infection prediction therefore centers on the information content and the quality of the generated samples. Using a generative model and active learning, it proposes two frameworks for handling imbalanced nosocomial infection data. The algorithms involved are mainly those of the base classifiers, the generative adversarial network, and the active learning strategy. This paper not only proposes solutions to the data imbalance problem but also uses active learning to reduce the iteration cost of model training, so as to maximize the classification accuracy and efficiency of the model.
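
The abstract describes a batch query criterion that weighs prediction entropy against similarity to real minority samples, but it does not give the formula. The following is only a minimal sketch of one way such a criterion could look, not the thesis's actual method. The function names, the weighting parameter `alpha`, and the distance-based similarity `1 / (1 + d)` are all assumptions made for illustration.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a binary prediction with P(minority) = p."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def similarity_to_minority(x, real_minority):
    """Assumed similarity: 1 / (1 + distance to the nearest real minority sample)."""
    nearest = min(math.dist(x, q) for q in real_minority)
    return 1.0 / (1.0 + nearest)


def select_generated_batch(generated, minority_proba, real_minority,
                           batch_size, alpha=0.5):
    """Rank generated samples by a weighted sum of entropy and similarity.

    generated      -- list of generated feature vectors
    minority_proba -- classifier's P(minority) for each generated sample
    real_minority  -- list of real minority-class feature vectors
    alpha          -- hypothetical weight between the two criteria
    Returns the indices of the top `batch_size` samples and all scores.
    """
    scores = [
        alpha * binary_entropy(p)
        + (1.0 - alpha) * similarity_to_minority(x, real_minority)
        for x, p in zip(generated, minority_proba)
    ]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size], scores
```

In this sketch, a generated sample scores highly when the current classifier is uncertain about it and it lies close to real minority data. Samples that are far from every real minority point receive a low similarity term, which is one simple way to filter likely noise.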
Keywords/Search Tags: Hospital Acquired Infection Prediction, Imbalanced Classification, Oversampling, Active Learning, Generative Adversarial Networks