Chat Mining For Authorship Verification

Posted on: 2011-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: B Xu
Full Text: PDF
GTID: 2178330338489573
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, it has become an indispensable part of our lives. While it brings great convenience, network security is receiving more and more attention. Instant messaging, as an important Internet application, has naturally become a target that hackers and criminals exploit and attack. Because a user's identity is hard to confirm, people with ulterior motives can steal or fraudulently use another person's account and, while we are chatting with friends through an instant messenger, post malicious links or fraudulent information for unlawful gain.

Traditional authorship authentication is suited to problems in which long texts are available as training data, such as copyright disputes over a work or the development of fraud management systems. However, the shortness of chat messages and the differences between Chinese and English corpora make it difficult to apply existing methods to this problem.

This paper draws on existing methods for authorship identification and makes the following improvements:

Feature selection. We select features through comparison so that the selection is targeted, which differs from other approaches. We also select features specific to instant messaging, such as emoticons and Internet catchwords.

Chinese corpus. N-grams are frequently used for English corpora. However, if this method is applied directly to a Chinese corpus, it does not improve classification, because the resulting features are very sparse. In this paper, we convert Chinese characters into another form to solve this problem.

Classification method. SVM has been reported to be the best classification method for authorship identification, and some researchers have continued to use it for authorship verification. Here, we improve the backpropagation algorithm so that it can handle the one-class classification problem.
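
The feature ideas mentioned in the abstract (character n-grams plus chat-specific markers such as emoticons) can be illustrated with a rough sketch. This is not the thesis's method: the emoticon list, the function names, the feature-name prefixes, and the choice of bigrams are all assumptions made for illustration. The abstract does not say which form the Chinese characters are converted into, so that step is not reproduced here.

```python
from collections import Counter

# Hypothetical emoticon list; the abstract does not enumerate the emoticons used.
EMOTICONS = (":-)", ":)", ":(", ":D", "^_^")


def char_ngrams(text, n=2):
    """Return a Counter of overlapping character n-grams in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def emoticon_counts(text):
    """Count non-overlapping occurrences of each listed emoticon found in text."""
    return {e: text.count(e) for e in EMOTICONS if e in text}


def extract_features(message, n=2):
    """Merge n-gram and emoticon counts into one feature Counter.

    The "ng:" and "emo:" prefixes are an assumed naming scheme that keeps
    the two feature families from colliding.
    """
    features = Counter()
    for gram, count in char_ngrams(message, n).items():
        features["ng:" + gram] = count
    for emo, count in emoticon_counts(message).items():
        features["emo:" + emo] = count
    return features


if __name__ == "__main__":
    for name, value in sorted(extract_features("你好你好 :)").items()):
        print(name, value)
```

Because written Chinese uses several thousand distinct characters, the number of possible character bigrams is very large, and a short chat message touches only a handful of them. That is one plausible reading of the sparsity problem the abstract describes.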
Keywords/Search Tags: instant messaging, authorship verification, one-class classification, improved backpropagation algorithm