
Design And Implementation Of Crime Analysis And Prediction System Based On Machine Learning

Posted on: 2022-06-08
Degree: Master
Type: Thesis
Country: China
Candidate: Z Shi
Full Text: PDF
GTID: 2506306557468044
Subject: Software engineering
Abstract/Summary:
With rapid economic development and urbanization, public security agencies face explosive growth in data. The maturing of the Internet of Things in recent years has made the collection, analysis, and processing of public security data increasingly difficult, and public security organs are often left in a passive position when analyzing public safety data and making predictions from it. Extracting valuable information from this data and effectively predicting criminal behaviour is both the difficulty and the key to improving the efficiency of the public security system and reducing public safety risks. To address this problem, this thesis targets crime prediction and uses the Hive data warehouse, the Spark computing engine, and machine learning methods to design and develop a Spark-based crime analysis and prediction system. The main work of the thesis is as follows:

(1) For the data analysis and prediction problem, the decision tree, Naive Bayes, and Kmeans algorithms in machine learning are studied and their core ideas analyzed, together with related optimized variants: the ID3 and C4.5 algorithms, Laplace smoothing, and the Kmeans++ algorithm. The advantages and disadvantages of the algorithms are analyzed, and their accuracy is compared through simulation experiments. On this basis, the C4.5 algorithm, Laplace smoothing, and the Kmeans++ algorithm are selected as the crime prediction algorithms.

(2) Building on the theoretical research, a criminal behaviour analysis and prediction system is designed and implemented. Its main functions are historical and real-time data collection, data storage, data query, and crime prediction. Historical data is collected by periodically extracting it with Datax. Real-time data is collected and processed in parallel using the Flume+Kafka+Spark Streaming framework. For storage, the Hadoop+Hive framework is used, with the collected historical data stored in Hive. Data query adopts the Hive+Presto framework, with Presto as the query engine, which greatly shortens query response time. In addition, using the decision tree, Naive Bayes, and Kmeans algorithms in Spark MLlib, a data model is designed and built on the Hadoop+Spark framework to classify and predict over the collected historical and real-time data, realizing personnel identity prediction and crime type prediction.

Theoretical analysis and test results show that the system designed and developed in this thesis can effectively analyze and predict crimes on the basis of historical and real-time data collection, storage, query, and analysis, and that its analysis and prediction results have relatively high reliability and accuracy.
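
The three refinements named in point (1) can be illustrated with a minimal sketch in plain Python. This is not the thesis implementation, which according to the abstract is built on Spark MLlib. The function names, the toy records, and the smoothing parameter `alpha` are assumptions introduced only for illustration. The sketch shows the gain-ratio split criterion used by C4.5, a Laplace-smoothed conditional probability as used in Naive Bayes, and Kmeans++ seeding, which assumes the data contains at least `k` distinct points.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """C4.5 split criterion: information gain divided by split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the feature's own distribution.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


def laplace_likelihood(value, feature_values, labels, target, n_distinct,
                       alpha=1.0):
    """P(feature = value | class = target) with additive (Laplace) smoothing."""
    in_class = [v for v, y in zip(feature_values, labels) if y == target]
    return (in_class.count(value) + alpha) / (len(in_class) + alpha * n_distinct)


def kmeans_pp_init(points, k, rng):
    """Kmeans++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest chosen centre.
    Assumes `points` holds at least k distinct points."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        d2 = [min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres)
              for p in points]
        centres.append(rng.choices(points, weights=d2, k=1)[0])
    return centres


# Hypothetical toy records, not data from the thesis.
time_of_day = ["night", "night", "day", "day", "night", "day"]
crime_type = ["theft", "theft", "fraud", "fraud", "theft", "theft"]
locations = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]

ratio = gain_ratio(time_of_day, crime_type)
# "night" never co-occurs with "fraud" above; smoothing keeps it non-zero.
p_night_given_fraud = laplace_likelihood(
    "night", time_of_day, crime_type, "fraud", n_distinct=2)
seeds = kmeans_pp_init(locations, k=2, rng=random.Random(0))
```

Without smoothing, the unseen combination of "night" and "fraud" would get probability zero and wipe out the whole Naive Bayes product. Dividing information gain by split information counteracts the bias of plain information gain (as in ID3) toward features with many distinct values.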
Keywords/Search Tags: Crime prediction, machine learning, decision tree, Naive Bayes, Kmeans, Spark, Hive data warehouse