
Digital Watermark Removal Technology for Document Images and Similarity Testing

Posted on: 2022-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: L L Hu
Full Text: PDF
GTID: 2518306572497514
Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet technology, more and more documents are spread widely online, which raises copyright security problems. Visible watermarks are used in multimedia content to provide effective copyright protection. Meanwhile, watermark attack algorithms have emerged as a way to evaluate the attack resistance and robustness of watermarks. However, most research on visible watermarks focuses on images, and research on document watermarks has been at a standstill since the end of the 20th century. This thesis therefore studies document watermark removal and near-duplicate document image matching algorithms to support copyright protection. The main work is as follows:

(1) To remove document image watermarks effectively without affecting the text, an improved U-Net framework is proposed. The framework adds a spatial attention mechanism and a channel attention mechanism to the U-Net encoding stage, and the output of the encoding stage is fed into an improved atrous spatial pyramid pooling (ASPP) module. These changes let the model strengthen document watermark features and focus on the watermark region, which further improves the accuracy of watermark recognition while enlarging the model's receptive field. The decoding stage adds an improved spatial attention mechanism built on the channel attention mechanism, which makes better use of the spatial relationships between different feature maps and focuses on the feature-space regions related to the watermark.

(2) A similarity detection algorithm for near-duplicate documents, based on perceptual hashing and a Siamese network, is implemented for similarity testing. The perceptual-hashing-based algorithm uses the watermark-removed document image to query an image database and finds the image with the highest similarity, providing an auxiliary evaluation of document image similarity. The Siamese-network-based algorithm exploits the Siamese network's ability to evaluate the similarity of two inputs. A conventional convolutional structure is added to the original Siamese network architecture to extend it and improve detection accuracy.

(3) A document image watermark dataset is collected for testing document image watermark removal and similarity detection. Experiments show that the proposed improved U-Net framework can effectively remove document image watermarks while retaining the document's text information with high precision. Compared with other models, it achieves better visual results and evaluation metrics. The similarity results computed by the Siamese network and the perceptual hash algorithm provide effective support for copyright protection.
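
The perceptual-hashing retrieval step in (2) can be illustrated with a minimal sketch. The abstract does not say which perceptual hash the thesis uses, so this example assumes the simple average-hash variant, with images represented as plain lists of grayscale values. The names `shrink`, `average_hash`, `hamming`, and `most_similar` are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def shrink(image: Image, size: int = 8) -> Image:
    """Downsample to size x size by averaging the pixels in each block."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image must be at least size x size pixels")
    small = []
    for i in range(size):
        r0, r1 = i * height // size, (i + 1) * height // size
        row = []
        for j in range(size):
            c0, c1 = j * width // size, (j + 1) * width // size
            block = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(image: Image, size: int = 8) -> int:
    """Average hash (assumed variant): one bit per cell, set if above the mean."""
    pixels = [p for row in shrink(image, size) for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def most_similar(query: Image, database: Dict[str, Image],
                 size: int = 8) -> Tuple[str, float]:
    """Return the database entry closest to the query and a similarity in [0, 1]."""
    if not database:
        raise ValueError("database is empty")
    n_bits = size * size
    query_hash = average_hash(query, size)
    best_name, best_dist = "", n_bits + 1
    for name, image in database.items():
        dist = hamming(query_hash, average_hash(image, size))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, 1.0 - best_dist / n_bits
```

In this sketch, a watermark-removed page would be passed as `query` and the stored document images as `database`; the returned similarity is one minus the normalized Hamming distance between the two hashes. A real system would more likely decode images with an imaging library and might use a DCT-based hash instead.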
Keywords/Search Tags:Document watermark, U-Net, Attention mechanism, Siamese twin network