
Statistical and machine learning techniques for dealing with missing data in criminal justice: A simulation and comparison of missing data methods

Posted on: 2013-04-28
Degree: Ph.D.
Type: Dissertation
University: Sam Houston State University
Candidate: Hill, Joshua
Full Text: PDF
GTID: 1450390008466901
Subject: Sociology
Abstract/Summary:
Missing data is a persistent problem in the social sciences and, more specifically, in criminal justice. Although rarely discussed, missing data can bias results and affect model efficiency. Currently, there is only a very small body of criminal justice-specific research on missing data. The goal of this dissertation is to remedy, in part, this lack of attention to an important topic. The analysis examines eleven frequently used imputation techniques, including both classical statistical techniques and newer, algorithmic techniques. Using an advanced simulation methodology, the dissertation examines both the imputation of missing values and the impact of those imputed data on substantive analysis. Additionally, it seeks to develop a user-friendly package for the program R to assist researchers with the imputation of missing data.

KEY WORDS: Machine learning, Missing data, Listwise deletion, Random Forests, Hot deck imputation, Multiple imputation.
Keywords/Search Tags: Missing data, Machine learning, Criminal justice, Techniques, Imputation