
Research And Implementation On Rapid Similarity Detection Technology Of Enterprise Electronic Documents

Posted on: 2016-07-09
Degree: Master
Type: Thesis
Country: China
Candidate: J W Ma
Full Text: PDF
GTID: 2308330461476426
Subject: Computer application technology
Abstract/Summary:
Enterprise electronic documents, including contracts, project documents, and issued patents, are important resources for enterprises. In today's fierce knowledge-based competition, the leakage of core electronic documents may cause economic loss and damage to reputation, and can even endanger the enterprise itself. Enterprises therefore pay increasing attention to the security protection and management of electronic documents.

In this paper, we discuss methods for preventing the leakage of core electronic documents and point out a part that is usually neglected: the process documents produced while enterprise electronic documents are being formed. Different versions of a document are produced on many employees' computers during preparation, review, revision, and submission. This can easily lead to leakage, a problem that current enterprise document management systems often ignore.

To address this detection problem, this paper implements a fast similarity detection system for enterprise electronic documents. The system is based on the COPS model and uses the idea of level-based text matching. Concretely, it selects digital fingerprints from text blocks of different sizes for document similarity calculation. First, we use the provided core documents to build a sample database of encrypted documents. Second, we use the similarity detection system to detect encrypted process documents on employees' computers. Finally, we send the detection reports to the enterprise document management system for later processing. This procedure greatly narrows the distribution range of encrypted electronic documents and thus improves the security level of enterprise core documents. Experimental results and enterprise feedback show that the system achieves high detection speed and accuracy.
Keywords/Search Tags: electronic documents, similarity detection, text blocks, level-based text matching
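
The abstract describes selecting digital fingerprints from text blocks of different sizes and comparing them level by level. The thesis's actual implementation is not shown here, so the following is only a minimal sketch of that general idea. The two levels (paragraph and sentence), the SHA-1 hash, the containment score, and all function names are assumptions chosen for illustration, not details taken from the thesis or from the COPS model.

```python
import hashlib
import re


def normalize(text):
    """Lowercase and collapse whitespace so trivial edits do not change hashes."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def split_blocks(text, level):
    """Split text into blocks at an assumed granularity level."""
    if level == "paragraph":
        parts = re.split(r"\n\s*\n", text)       # blank line separates paragraphs
    elif level == "sentence":
        parts = re.split(r"(?<=[.!?])\s+", text)  # naive sentence boundary
    else:
        raise ValueError(f"unknown level: {level}")
    blocks = (normalize(p) for p in parts)
    return [b for b in blocks if b]


def fingerprints(text, level):
    """Return the set of block hashes (fingerprints) for one level."""
    return {
        hashlib.sha1(block.encode("utf-8")).hexdigest()
        for block in split_blocks(text, level)
    }


def containment(sample_fp, candidate_fp):
    """Fraction of the sample's fingerprints that also occur in the candidate."""
    if not sample_fp:
        return 0.0
    return len(sample_fp & candidate_fp) / len(sample_fp)


def level_scores(sample, candidate, levels=("paragraph", "sentence")):
    """Score a candidate against a core sample at each level, coarse to fine."""
    return {
        level: containment(fingerprints(sample, level),
                           fingerprints(candidate, level))
        for level in levels
    }
```

In this sketch a process document that changes one sentence of a core document loses the match for the whole paragraph at the coarse level but still matches the untouched sentences at the fine level, which is why comparing at more than one block size can be useful for detecting revised versions.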