Design And Implementation Of The Regulatory System Of The Enterprise Mail

Posted on: 2013-02-22  Degree: Master  Type: Thesis
Country: China  Candidate: L Y Qian
GTID: 2218330371459956  Subject: Computer technology
Abstract/Summary:
In recent years, as enterprises have become increasingly computerized, email has come to play an ever more important role in them, making daily communication and information sharing among staff far more convenient. As a result, the security of important corporate information has become a prominent concern: enterprises urgently need a way to protect their information without hindering the staff's daily communication.

In this thesis, the author studies current approaches to image character recognition and Chinese word segmentation, combines them with a Bayesian filtering algorithm, and implements an email content supervision system suited to enterprise use. The system is deployed between the email client and the SMTP server, where it forwards and filters email content. That content includes the text body of a message and any picture attachments. For a picture attachment, the system recognizes the text in the image and converts it to plain text. The system segments the text with ICTCLAS, the Chinese word segmentation system of the Chinese Academy of Sciences, sorts the segmentation results by part of speech and word frequency, and finally selects a subset of them as the tokens for the Bayesian algorithm. The Bayesian algorithm is implemented with both a Bernoulli model and a multinomial model, and the user can choose which model the filter uses. To remove stop words, the system combines a common stop-word table with a special stop-word table that the user maintains.

The system consists mainly of an email forwarding agent module, a content extraction module, a content preprocessing module, a content filtering module, a parameter configuration module, and a log module. Tests in a simulated environment produced good results, and the system is applicable to a wide range of enterprise settings.
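The difference between the two Bayesian models mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the class name `NaiveBayesFilter`, the function `remove_stop_words`, the labels `sensitive` and `normal`, the stop-word sets, and the English tokens are all hypothetical, and the sketch assumes that word segmentation has already produced token lists and that Laplace (add-one) smoothing is acceptable.

```python
import math
from collections import Counter

# Hypothetical common stop-word table; a real one would be much larger.
COMMON_STOP_WORDS = {"the", "a", "of"}


def remove_stop_words(tokens, special_stop_words=frozenset()):
    """Drop tokens found in the common table or the user-maintained special table."""
    stop = COMMON_STOP_WORDS | set(special_stop_words)
    return [t for t in tokens if t not in stop]


class NaiveBayesFilter:
    """Naive Bayes text classifier with a selectable event model (illustrative only)."""

    def __init__(self, model="bernoulli"):
        if model not in ("bernoulli", "multinomial"):
            raise ValueError("model must be 'bernoulli' or 'multinomial'")
        self.model = model

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = sorted({t for d in docs for t in d})
        self.doc_count = Counter(labels)
        self.total_docs = len(docs)
        # Bernoulli: number of documents containing each token.
        # Multinomial: number of occurrences of each token.
        self.token_stat = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            counted = set(doc) if self.model == "bernoulli" else doc
            self.token_stat[label].update(counted)
        return self

    def log_score(self, doc, label):
        """Return log P(label) + log P(doc | label) under the chosen model."""
        score = math.log(self.doc_count[label] / self.total_docs)
        stat = self.token_stat[label]
        if self.model == "bernoulli":
            # Every vocabulary token contributes, whether present or absent.
            present = set(doc)
            for t in self.vocab:
                p = (stat[t] + 1) / (self.doc_count[label] + 2)
                score += math.log(p if t in present else 1 - p)
        else:
            # Only tokens that occur contribute, once per occurrence.
            total = sum(stat.values())
            known = set(self.vocab)
            for t in doc:
                if t in known:
                    score += math.log((stat[t] + 1) / (total + len(self.vocab)))
        return score

    def predict(self, doc):
        return max(self.classes, key=lambda c: self.log_score(doc, c))


# Toy training data standing in for segmented email text.
train_docs = [
    ["contract", "price", "confidential"],
    ["confidential", "design", "price"],
    ["lunch", "meeting", "today"],
    ["meeting", "schedule", "today"],
]
train_labels = ["sensitive", "sensitive", "normal", "normal"]

message = remove_stop_words(["the", "confidential", "price"])
for model_name in ("bernoulli", "multinomial"):
    flt = NaiveBayesFilter(model_name).fit(train_docs, train_labels)
    print(model_name, flt.predict(message))
```

In the Bernoulli model the absence of a token counts as evidence, while the multinomial model uses only the tokens that occur and weights them by frequency, which is why letting the user select the model can matter for short versus long messages.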
Keywords/Search Tags: Bayesian, Chinese word segmentation, image character recognition, Email agent