Prohibited Information Monitoring Of E-commerce Based On Classification

Posted on: 2013-02-16 | Degree: Master | Type: Thesis
Country: China | Candidate: X F Chen | Full Text: PDF
GTID: 2218330371458955 | Subject: Computer application technology
Abstract/Summary:
With the rapid development and popularization of the Internet, e-commerce has become an important part of production and daily life. Platforms such as Taobao and Alibaba carry out tens of thousands of transactions every day, which inevitably brings with it a large amount of prohibited information. If this information is not dealt with effectively, it will damage the development of e-commerce. Information must therefore be strictly reviewed and filtered before it is published to the web. At the same time, with the explosion of information, manual review has become unrealistic, and there is an urgent need for efficient, computer-based methods of monitoring prohibited information.

This paper analyses the information monitoring mechanism and then uses text classification and information retrieval technology to identify prohibited information automatically, thereby reducing the burden of manual review and improving efficiency. It argues that information monitoring consists of two tasks, prohibited keyword extraction and prohibited content judgment, and that how to extract prohibited words and determine prohibited content is the key to success. The paper introduces feature selection and text classification. For prohibited keywords, it uses feature selection together with a polarity criterion. For prohibited content, it proposes a vector space model based on the structural features of text in order to achieve more satisfactory classification results. Monitoring of prohibited e-commerce content is implemented with an SVM model, with a maximum entropy model used as a comparison.

Experiments are carried out on a large amount of real data from an e-commerce website. The results show that the monitoring algorithms based on prohibited keywords and prohibited content are useful for information monitoring, demonstrating that it is feasible to identify prohibited information automatically by machine learning.
Keywords/Search Tags: E-Commerce, Prohibited Keywords, Prohibited Content Monitoring, Information Retrieval, Improved Vector Space Model, Text Classification
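
The abstract says prohibited keywords are extracted with feature selection and a polarity criterion, but it does not give the formulas. The following is only a minimal sketch of how such a step might look. It assumes chi-square scoring as the feature selection measure, and it assumes "polarity" means keeping only terms that appear proportionally more often in prohibited listings than in normal ones. The function name `chi_square_keywords` and the toy listings are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def chi_square_keywords(docs, labels, top_k=5):
    """Rank terms by chi-square association with the prohibited class (label 1).

    Assumption: the polarity filter below (keep a term only if it is
    proportionally more frequent in prohibited documents) stands in for the
    abstract's "polarity criterion", whose exact definition is not given.
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos

    # Document frequency of each term within each class.
    pos_df, neg_df = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            (pos_df if label == 1 else neg_df)[term] += 1

    scores = {}
    for term in set(pos_df) | set(neg_df):
        a = pos_df[term]   # prohibited docs containing the term
        b = neg_df[term]   # normal docs containing the term
        c = n_pos - a      # prohibited docs without the term
        d = n_neg - b      # normal docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        if denom == 0:
            continue       # term occurs in every document, or a class is empty
        if a * d <= b * c:
            continue       # polarity filter: not skewed toward the prohibited class
        scores[term] = n * (a * d - b * c) ** 2 / denom

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Hypothetical, already tokenized listings: 1 = prohibited, 0 = normal.
docs = [
    ["cheap", "replica", "watch"],
    ["replica", "handbag", "sale"],
    ["cotton", "shirt", "sale"],
    ["leather", "watch", "strap"],
]
labels = [1, 1, 0, 0]
keywords = chi_square_keywords(docs, labels, top_k=3)
```

In this toy data, terms such as "watch" and "sale" occur equally often in both classes, so the polarity filter drops them regardless of their chi-square score. A real system would need word segmentation and far more data than this sketch uses.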