Research On Forecasting The Risk Classification Of Prisoners

Posted on: 2022-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2506306602990499
Subject: Master of Applied Statistics
Abstract/Summary:
Prisons are places for educating and reforming prisoners. In recent years, however, dangerous incidents such as prison breaks and suicides have occurred repeatedly at home and abroad. These incidents cause heavy losses to prisons and society, and they may also threaten people's lives and national security. To improve the standardization and safety of prisons, the risk management of prisoners has become a crucial problem in prison construction, and making good use of massive data to mine effective information has become the core problem. At present, scholars studying prison-related problems have largely adopted machine learning or deep learning techniques to uncover latent patterns in large volumes of information.

Given the research goal of identifying prison risk levels and the key difficulty that prison data are imbalanced, this thesis proposes two methods. At the data-sampling level, it proposes a hybrid sampling risk identification algorithm (MWMOTE-ENN) that combines oversampling and under-sampling. At the algorithm level, it proposes an improved risk identification algorithm based on ensemble learning.

For imbalanced multi-class data, the hybrid sampling method MWMOTE-ENN combines a clustering-based oversampling method with an under-sampling method. MWMOTE-ENN is used to resample the prison data so that the sample sizes of the categories reach a relatively balanced state, which addresses the uneven distribution of the minority classes and the fuzzy class boundaries. The AdaBoost classification algorithm is then trained on the resampled data to predict the risk level. To verify the effectiveness of the MWMOTE-ENN-based risk prediction algorithm, the thesis compares it with under-sampling, oversampling, and mixed sampling methods, using recall, precision, F-value, and G-mean as the evaluation criteria. The experimental results show that the risk classification algorithm based on MWMOTE-ENN outperforms the other sampling algorithms in overall classification performance and in the recall of each category. The recognition rate of the three risky levels increases by about 20 percentage points.

To address the differences in class loss caused by imbalanced data, a revised AdaBoost algorithm (R-AdaBoost) is proposed. By adjusting the initial weight and error function of the minority class, and by introducing an imbalance factor and a misjudgment factor into the classifier weight, the algorithm can select classifiers that are friendly to the minority class, which helps reduce the error rate on risky samples. To evaluate R-AdaBoost, the thesis uses 5-fold cross-validation to compare the classification performance of R-AdaBoost and AdaBoost in both balanced and imbalanced settings. The experimental results show that: (1) both algorithms classify better under balanced sampling than in imbalanced scenarios; (2) in the balanced scenario, the overall performance of the two algorithms is equal, but R-AdaBoost is slightly better in the recall and misjudgment rate of the minority categories; (3) in the imbalanced scenario, as the degree of imbalance increases, the recognition ability of R-AdaBoost is better than that of the original algorithm. This shows that R-AdaBoost effectively improves the ability to predict risk category levels and reduces the missed-judgment rate of the three risky categories. In the imbalanced setting, the classification performance of R-AdaBoost is significantly better than that of the original algorithm.
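The abstract names recall and G-mean among its evaluation criteria but does not define them. As a minimal sketch, assuming G-mean is taken as the geometric mean of the per-class recalls (a common multi-class convention, not confirmed by the thesis), the two quantities could be computed as follows. The function names, class labels, and toy predictions are hypothetical and are not data from the thesis.

```python
from collections import Counter
from math import prod


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted samples / true samples of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition of G-mean)."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1.0 / len(recalls))


# Hypothetical imbalanced toy labels: one majority class, two minority classes.
y_true = ["low"] * 8 + ["medium"] * 2 + ["high"] * 2
# A classifier that is biased toward the majority class.
y_pred = ["low"] * 8 + ["medium", "low"] + ["high", "low"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", round(accuracy, 3))
print("recall per class:", per_class_recall(y_true, y_pred))
print("G-mean:", round(g_mean(y_true, y_pred), 3))
```

In this toy case, overall accuracy stays high because the majority class dominates, while the G-mean drops because half of each minority class is missed. That sensitivity to minority-class recall is the usual reason for reporting G-mean on imbalanced data.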
Keywords/Search Tags: Risk Level, Prison, Unbalanced Multi-classification, Revised AdaBoost, Mixed-sampling