
Key Techniques Of Text Mining On Criminal Cases

Posted on: 2011-05-19
Degree: Master
Type: Thesis
Country: China
Candidate: C H Cheng
Full Text: PDF
GTID: 2178360302474660
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology, public security information systems have accumulated vast amounts of operational data. Faced with this growing volume of police information, there is an urgent need for AI-related technologies that analyze the data in depth and uncover the patterns and relationships among the various kinds of information, in order to better combat, prevent, and control crime. Applying data mining effectively to crime analysis is therefore an urgent need in public security work.

Text mining is a branch of data mining that has emerged in recent years. Besides highly standardized database records, the massive body of case information contains a large number of narrative text descriptions of cases, so research on and application of text mining to this case text is meaningful. This paper studies and applies text mining technology to massive case texts. Its work includes the following:

(1) Text preprocessing. Based on practical application needs, this paper builds a thesaurus of professional police terminology and explores a specialized preprocessing method suited to the characteristics of case text.

(2) Case feature selection. According to the needs of practical applications, this paper studies six feature selection algorithms and, by comparing them, determines the one most useful for criminal text mining.

(3) Criminal case text mining. This paper proposes an improved case-text similarity calculation method based on extracted case attribute information, combined with synonym-based semantic analysis. It also proposes an improved Naive Bayes method for classifying criminal texts with unbalanced classes: because criminal case categories are unevenly distributed, an improved model based on the multivariate Bernoulli model of Naive Bayes is proposed.

(4) Design and implementation of the criminal case text mining system. This paper builds the criminal case text mining system on a typical C/S (client/server) structure. The system implements the similar criminal case-text retrieval module and the text classification model.
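For orientation, the standard multivariate Bernoulli Naive Bayes model that item (3) takes as its starting point can be sketched as follows. This is a minimal illustration only. The abstract does not describe the improved model, so the sketch shows the textbook baseline with Laplace smoothing. The `uniform_prior` switch is an assumption added here to show one common way of reducing bias toward large classes, not the method proposed in the thesis. The function names and toy documents are hypothetical.

```python
import math
from collections import defaultdict


def train_bernoulli_nb(docs, labels, uniform_prior=False):
    """Train a multivariate Bernoulli Naive Bayes model.

    docs   -- list of sets of terms (each document is the set of terms it contains)
    labels -- list of class names, one per document
    uniform_prior -- illustrative assumption: give every class the same prior
                     instead of the empirical class frequency
    """
    vocab = set().union(*docs)
    classes = sorted(set(labels))
    n_docs = {c: 0 for c in classes}
    doc_freq = {c: defaultdict(int) for c in classes}

    # Count, per class, how many documents contain each term.
    for terms, c in zip(docs, labels):
        n_docs[c] += 1
        for t in terms:
            doc_freq[c][t] += 1

    model = {"vocab": vocab, "classes": classes, "log_prior": {}, "p": {}}
    for c in classes:
        prior = 1.0 / len(classes) if uniform_prior else n_docs[c] / len(docs)
        model["log_prior"][c] = math.log(prior)
        # Laplace smoothing: P(term present | class) = (df + 1) / (N_c + 2)
        model["p"][c] = {t: (doc_freq[c][t] + 1) / (n_docs[c] + 2) for t in vocab}
    return model


def classify(model, terms):
    """Return the class with the highest log-posterior for a set of terms."""
    terms = set(terms)
    best, best_score = None, -math.inf
    for c in model["classes"]:
        score = model["log_prior"][c]
        # The Bernoulli model scores every vocabulary term, present or absent.
        for t in model["vocab"]:
            p = model["p"][c][t]
            score += math.log(p) if t in terms else math.log(1.0 - p)
        if score > best_score:
            best, best_score = c, score
    return best


# Toy, made-up data with an unbalanced class distribution (3 vs. 1).
toy_docs = [
    {"knife", "robbery", "night"},
    {"robbery", "street", "bag"},
    {"robbery", "knife", "street"},
    {"fraud", "phone", "bank"},
]
toy_labels = ["robbery", "robbery", "robbery", "fraud"]
toy_model = train_bernoulli_nb(toy_docs, toy_labels)
predicted = classify(toy_model, {"phone", "bank"})
```

Unlike the multinomial model, the Bernoulli model also scores the absence of each vocabulary term, which tends to suit short narrative texts where whether a term occurs matters more than how often.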
Keywords/Search Tags: Text Mining, Text Categorization, Text Similarity Computing, Data Mining, Crime Data Mining, Chinese Word Segmentation, Feature Selection