A Study Of Automatically Classifying Text-Described Criminal Behaviours Based On A Hybrid Learning Model

Posted on:2021-06-27

Degree:Master

Type:Thesis

Country:China

Candidate:X Yang

Full Text:PDF

GTID:2506306452999389

Subject:Computer Science and Technology

Abstract/Summary:

In recent years, the country has carried out many law popularization campaigns to help people know and understand the law and thereby reduce the crime rate. Relying on human resources alone to popularize the law will be a long and difficult process. At present, many people involved in a case come to understand it by seeking help from legal professionals. For those professionals, this kind of legal education is simple, repetitive work, so using artificial intelligence to assist with it has become a general trend. Due to the limitations of technology and equipment, artificial intelligence cannot completely replace lawyers and judges, so most artificial intelligence systems in the legal field play a supplementary role.

To popularize the law, reduce the workload of legal professionals, and assist in handling cases, this thesis develops a criminal behaviour classification system. It first pre-processes Chinese texts describing criminal behaviour and uses multiple single learning models to predict the crime category of a given behaviour. It then incorporates keywords to distinguish crimes that are easily confused. Finally, it integrates the learning models, adjusting their weights to obtain the final prediction.

The main contributions of this thesis are as follows:

1. In the traditional process of text feature selection, a feature filtering method based on word embedding is used to solve the problems of high word vector dimensionality and a sparse vector matrix. The traditional approach uses all words in the training set as features to construct a word vector space, and uses that space to convert text into numeric vectors during testing. In this thesis, before constructing the word vector space from the training set, we use a word embedding method to obtain a filtered vocabulary and build the word vector space from this new vocabulary. During training and testing, we use the word vector space to convert the original data into numeric vectors, and use TF-IDF to obtain the weight matrix used by the classification model. After this processing, the dimension of the weight matrix is reduced by one third, and the vector sparseness problem is eased.

2. We integrate multiple classification models to improve classification accuracy. We assign a weight to each model and fuse the models accordingly to obtain the final result. The weight of a single model reflects the proportion that the model's results contribute to the overall result. We experiment to find the optimal weight distribution. The fused model is better than any single model in terms of classification accuracy.

3. We use TextRank to obtain keywords for each charge in order to resolve confusion in crime classification. Some easily confused crimes are hard to distinguish with the classification model alone, so we add crime keywords to distinguish and verify them. Specifically, we use TextRank to obtain a keyword list for each crime, compare the keyword lists of the crimes that are confused with each other, and use the words that the two lists do not share as the keyword list for telling them apart. Moreover, we study the qualitative terms in criminal law that tend to cause confusion when classifying crimes, and use them to verify and revise the keyword lists obtained. We find that incorporating keywords and rules can effectively resolve confusion in criminal behaviour classification.

In summary, pre-processing and post-processing the data can improve the accuracy of criminal behaviour classification. The ideas behind these processes may also be useful for improving classification accuracy in Chinese text classification in other fields.
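The weighted fusion in contribution 2 and the keyword comparison in contribution 3 can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the example probabilities, the weights, and the English keyword lists are all hypothetical, and the fusion is assumed here to be a normalized weighted average of per-class probabilities. TextRank itself is not implemented; the keyword lists are assumed to come from an earlier keyword-extraction step.

```python
from typing import List, Sequence, Tuple


def fuse_predictions(model_probs: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Weighted average of per-class probabilities from several models.

    model_probs[m][c] is model m's probability for class c (hypothetical
    layout). Weights are normalized so that they sum to 1.
    """
    if not model_probs or len(model_probs) != len(weights):
        raise ValueError("need one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(model_probs[0])
    fused = [0.0] * n_classes
    for probs, w in zip(model_probs, weights):
        if len(probs) != n_classes:
            raise ValueError("all models must score the same classes")
        for c, p in enumerate(probs):
            fused[c] += (w / total) * p
    return fused


def distinguishing_keywords(keywords_a: Sequence[str],
                            keywords_b: Sequence[str]
                            ) -> Tuple[List[str], List[str]]:
    """Drop keywords shared by two easily confused crimes.

    What remains for each crime is the set of words the other crime's
    keyword list does not contain, in the original order.
    """
    shared = set(keywords_a) & set(keywords_b)
    only_a = [k for k in keywords_a if k not in shared]
    only_b = [k for k in keywords_b if k not in shared]
    return only_a, only_b


# Hypothetical example: two models scoring two crime categories,
# with the second model trusted three times as much as the first.
fused = fuse_predictions([[0.6, 0.4], [0.2, 0.8]], weights=[1.0, 3.0])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted_class)

# Hypothetical keyword lists for two easily confused charges.
theft_only, robbery_only = distinguishing_keywords(
    ["property", "secretly", "take"],
    ["property", "violence", "take"],
)
print(theft_only, robbery_only)
```

In a setup like this, the weights could be chosen by evaluating a small grid of candidate values on a validation set, which is one plausible reading of the weight search the abstract describes.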

Keywords/Search Tags:

Artificial Intelligence, Natural Language Processing, Machine Learning, Convolutional Neural Network

Related items

1	Computer Assisted Sentencing Based On Convolutional Neural Network
2	Research On Artificial Intelligence’s Fair Use In The Process Of Machine Learning
3	Research On The Image Intelligence Target Detection And Scene Recognition Technology Based On Convolutional Neural Network
4	Research And Application Of Legal Intelligence Based On Deep Learning
5	Local Credit Evaluation System Based On Internet Information
6	Age Estimation Based On Convolutional Neural Network
7	Research On The Application Of Fair Use System In Machine Learning
8	Design And Implementation Of A Matching System For Similar Legal Cases
9	The Copyright Risk Of Machine Learning In The Era Of Artificial Intelligence And Its Solution
10	Detection And Analysis Of Public Security Events Based On Convolutional Neural Network