Research and Implementation of Text Classification Technology Based on Wikipedia in Mobile Phone Forensics

Posted on: 2019-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: S R Jiang
GTID: 2348330545955594
Subject: Information security
Abstract/Summary:
In recent years, with the rapid development of smartphones and the mobile Internet, the mobile phone has changed from a communication tool into essential equipment in people's lives. Beyond basic telephone calls, people use their phones for online shopping, travel booking, mobile payment, and more, which greatly increases the convenience of daily life. However, as mobile phones become more important, they are also increasingly used in illegal and criminal activities. Given the huge volume of data held on a phone, obtaining evidence from it quickly has become a pressing problem. At the same time, most of the information stored on a phone is short text, on which traditional text classification methods perform poorly. In particular, as instant messaging software has improved, many criminals use it to communicate and coordinate. To evade supervision and tracing by public security organs, they often avoid keywords, for example by substituting synonyms, inserting spaces, or writing in Pinyin. Detecting this kind of keyword avoidance has therefore also become a research goal of mobile phone forensics.

To address these problems, the main work of this thesis is as follows. First, it surveys related research on mobile phone forensics, introduces the concept, analyzes the sources of data, and, based on current practice, summarizes the principles and standardized process of mobile phone forensics. It then analyzes the storage mechanism of the Android system and the SQLite databases used by the contact list, call records, SMS, and instant messaging applications. Analyzing the structure of these database files provides the basis for data extraction. The relevant database files are then read and parsed, and the data are output in a standardized format to facilitate the subsequent text classification step.

Next, the thesis studies text classification technology, analyzes the text classification process, and analyzes the characteristics of evidence in mobile phone forensics. Because the evidence is short text, its features are sparse. Feature expansion is therefore introduced to solve the short text classification problem in mobile phone forensics. Examining existing Wikipedia-based feature expansion, we find that it suffers from an ambiguity problem that introduces noise into text classification. We propose the TF-ITF algorithm to eliminate this noise and improve classification accuracy. On this basis, to handle words that do not appear in Wikipedia, we propose a feature expansion algorithm based on Baidu search to improve recall. Finally, we propose WBFE, a feature expansion algorithm based on both Wikipedia and Baidu search, and classify the texts using WBFE.
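
The feature expansion idea described above can be illustrated with a small sketch. This is not the thesis's WBFE implementation: the abstract gives neither the TF-ITF formula nor the lookup procedure, so everything here is an assumption made for illustration. The dictionaries `PRIMARY` and `FALLBACK` are hypothetical toy tables standing in for Wikipedia and Baidu search results, and a simple support count (keep a candidate term only if at least two words of the message suggest it) stands in for the TF-ITF noise filter.

```python
from collections import Counter

# Hypothetical toy lookup tables. In the thesis these roles are played by
# Wikipedia (primary) and Baidu search (fallback); the entries are invented.
PRIMARY = {
    "bank": ["finance", "river", "money"],
    "transfer": ["money", "finance"],
}
FALLBACK = {
    "alipay": ["payment", "money"],
}


def related_terms(word):
    """Return candidate expansion terms, trying the primary source first."""
    if word in PRIMARY:
        return PRIMARY[word]
    # Words missing from the primary source fall back to the secondary one.
    return FALLBACK.get(word, [])


def expand(tokens, min_support=2):
    """Append expansion terms suggested by at least `min_support` tokens.

    The support count is an assumed stand-in for the TF-ITF filter: a term
    suggested by only one word (e.g. "river" for "bank") is treated as
    ambiguity noise and dropped.
    """
    support = Counter()
    for word in set(tokens):
        for term in set(related_terms(word)):
            support[term] += 1
    kept = sorted(
        term for term, count in support.items()
        if count >= min_support and term not in tokens
    )
    return list(tokens) + kept


print(expand(["bank", "transfer"]))
print(expand(["alipay", "transfer"]))
```

In this toy setup, the short message "bank transfer" gains the terms both words agree on, while the ambiguous sense "river" is filtered out, and "alipay" is expanded only through the fallback table. A filter this crude cannot expand one-word messages; it only shows where disambiguation and the fallback source fit into the pipeline.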
Keywords/Search Tags: mobile forensics, text classification, feature expansion