Image Sensitive Text Detection System Based On Heterogeneous Computing

Posted on: 2019-01-30
Degree: Master
Type: Thesis
Country: China
Candidate: H Peng
GTID: 2348330569987674
Subject: Communication and Information System
Abstract/Summary:
In the current network environment, a large amount of sensitive information circulates in the form of text, pictures and videos. With the implementation of the government's "Net Action", the dissemination of sensitive information in plain text has been effectively curbed. Many criminals now embed sensitive text in pictures to spread it instead. Detecting sensitive text in images is currently both difficult and inefficient. To address this, this thesis designs an image sensitive text detection system based on heterogeneous computing. The system acquires images from its data sources in both offline and online modes, and then performs text localization, text recognition and sensitive-semantics detection on them.

To localize text in complex scenes, the system uses an end-to-end deep network architecture based on a region proposal network (RPN) and a bidirectional recurrent neural network built from gated recurrent units (GRU). To make recognition more robust, a two-layer text recognition module was designed. The first layer recognizes most of the text using a CNN, a deep bidirectional GRU network and connectionist temporal classification (CTC). Pictures that receive a low score from the first layer are passed to a second layer, which uses the open source engine Tesseract. For sensitive-semantics detection, this thesis designs two layers of filters. The first layer performs coarse filtering of sensitive words using a prefix tree. The second layer performs deeper semantic filtering using Chinese word segmentation, a bag-of-words model and an SVM classifier.

Because the number of network images is huge and a pure software implementation takes a long time to process them, this thesis implements the system on an FPGA-based heterogeneous computing platform and accelerates the key algorithms, a choice made by weighing the types of algorithms in the system, their parallelism and power consumption. Task allocation and scheduling across the heterogeneous system are handled through the OpenCL framework, and acceleration kernels on the FPGA take over the time-consuming, parallelizable parts of the computation. Test results show that the detection accuracy of the system reaches up to 95% when it is tested on web pages, and the processing speed is about 1.4 s per picture. Compared with the CPU solution, the FPGA solution in this thesis is nearly 6 times faster and nearly 37 times more energy efficient. The system therefore meets the accuracy and timeliness requirements of image sensitive text detection.
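The abstract's first-layer sensitive word filter is described only as a prefix tree. A minimal sketch of that idea in Python is given here for illustration. The class and method names (`TrieNode`, `SensitiveWordTrie`, `find_all`) and the sample word list are assumptions, not taken from the thesis, and the thesis implementation may differ.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    # Maps one character to the child node reached by that character.
    children: dict = field(default_factory=dict)
    # True when the path from the root to this node spells a listed word.
    is_word: bool = False


class SensitiveWordTrie:
    """Hypothetical coarse filter: a prefix tree over a sensitive word list."""

    def __init__(self, words):
        self.root = TrieNode()
        for word in words:
            self.add(word)

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def find_all(self, text):
        """Return (start_index, word) for every listed word found in text."""
        hits = []
        for start in range(len(text)):
            node = self.root
            for end in range(start, len(text)):
                node = node.children.get(text[end])
                if node is None:
                    break  # no listed word continues with this character
                if node.is_word:
                    hits.append((start, text[start:end + 1]))
        return hits


# Illustrative word list only; a real deployment would load its own list.
trie = SensitiveWordTrie(["bad", "badge", "ad"])
print(trie.find_all("a badge ad"))
```

Because the tree works on characters, the same sketch applies unchanged to Chinese text. Text that produces no hits at this stage, or only ambiguous ones, would then go to the second-layer classifier described in the abstract.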
Keywords/Search Tags: picture-sensitive text detection, machine learning, heterogeneous computing system, OpenCL