Research And Application Of Text Zero Watermark Algorithm Based On Topic And Information Entropy

Posted on: 2020-11-26
Degree: Master
Type: Thesis
Country: China
Candidate: N Zhang
Full Text: PDF
GTID: 2518306512487884
Subject: Software engineering
Abstract/Summary:
With the rapid development of network technology, electronic documents have gradually replaced traditional paper documents as the main carrier of text information. At the same time, electronic documents are easy to copy and tamper with, which poses great challenges to their copyright protection. Copyright authentication is the act of determining the ownership of a carrier, and digital watermarking is one of the most effective copyright authentication methods. Existing text digital watermarking schemes share common problems, such as weak text representation and low resistance to attacks. To address these problems, this thesis builds on existing text watermarking research and natural language processing technology to propose a text zero-watermarking algorithm based on topic extraction and entropy coding, and applies the algorithm to a practical copyright authentication system. The main research work includes the following parts:

(1) A text zero-watermarking scheme based on topic and information entropy is designed. The scheme jointly considers the semantic and statistical characteristics of the text content and consists of three parts: text preprocessing, zero-watermark construction, and zero-watermark detection.
(2) For text preprocessing, natural language processing technology is used to process the text content and generate the word set and sentence set of the full text. The basic features of every word in the word set, such as part of speech and word frequency, are then obtained.
(3) For zero-watermark construction, the topic words of the text are first extracted with an improved keyword extraction algorithm and encoded; the information entropy of every sentence is then calculated and encoded according to the extracted words; finally, the zero-watermark is generated through encoding fusion and encryption and then archived.
(4) For zero-watermark detection, when a copyright dispute arises, the zero-watermark of the disputed text is first constructed and the archived zero-watermark of the protected text is retrieved; a watermark similarity algorithm, designed and implemented in this work, then calculates the similarity between the two watermarks; finally, copyright ownership is determined according to the similarity result.
(5) Experiments covering deletion attacks, synonym substitution attacks, and sentence pattern conversion attacks are designed to analyze the attack resistance and performance of the algorithm. The results show that the algorithm is robust and strongly resistant to these attacks. The research results are then put into practice in a text zero-watermarking system that provides watermark generation, detection, and historical data display.
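
As a rough illustration of parts (3) and (4), the following Python sketch builds a toy zero-watermark from per-sentence entropy and compares two watermarks. It is not the thesis's algorithm: the abstract does not specify the improved keyword extraction, the entropy encoding, the fusion and encryption step, or the similarity measure. Everything here is an assumption made for illustration, including the function names (`topic_words`, `sentence_entropy`, `zero_watermark`, `similarity`), the use of plain word frequency in place of keyword extraction, the one-bit-per-sentence encoding, and the use of `difflib.SequenceMatcher` for similarity. Encryption is omitted.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def topic_words(sentences, k=3):
    """Stand-in for keyword extraction (assumption): the k most frequent words."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return {word for word, _ in counts.most_common(k)}


def sentence_entropy(words, topics):
    """Shannon entropy (in bits) of the topic-word distribution in one sentence."""
    counts = Counter(word for word in words if word in topics)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no topic words in this sentence
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def zero_watermark(sentences, k=3):
    """Toy encoding (assumption): one bit per sentence, 1 if entropy exceeds the mean."""
    if not sentences:
        return ""
    topics = topic_words(sentences, k)
    entropies = [sentence_entropy(s, topics) for s in sentences]
    mean = sum(entropies) / len(entropies)
    return "".join("1" if h > mean else "0" for h in entropies)


def similarity(wm_a, wm_b):
    """Similarity in [0, 1]; tolerates watermarks of different lengths."""
    return SequenceMatcher(None, wm_a, wm_b).ratio()


# Sentences are pre-tokenized word lists, as produced by a preprocessing step.
protected = [
    ["text", "watermark", "text", "entropy"],
    ["watermark", "entropy"],
    ["copyright", "law"],
]
archived = zero_watermark(protected)      # stored when the text is registered
disputed = zero_watermark(protected[:2])  # same text after a deletion attack
score = similarity(archived, disputed)    # compared against a chosen threshold
```

Nothing is embedded in the text itself: the watermark is derived from the content and archived separately, which is what makes the scheme a zero-watermark. A sequence-matching similarity is used in this sketch because a deletion attack changes the number of sentences, so the two watermarks may differ in length.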
Keywords/Search Tags: copyright protection, text zero-watermarking, keyword extraction, information entropy, word similarity