Deep Learning Based Scenic Image Sensitive Text Information Detection

Posted on: 2022-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Hou
GTID: 2518306326483414
Subject: Master of Engineering
Abstract/Summary:
With the rapid development of the Internet, the number of social network users and the amount of data they generate are growing exponentially. Social networks have broadened the channels through which netizens obtain information, but they have also provided opportunities for criminals. Because the Internet is so convenient, the number of criminals carrying out illegal activities online is increasing day by day, and those who wantonly spread pornographic, violent, reactionary and other harmful information through the Internet seriously damage the harmony and stability of society. Under the relevant policies, sensitive written information is monitored and suppressed by the government in real time. To escape this supervision, criminals now usually disseminate sensitive information by embedding sensitive words in scene pictures. Because sensitive characters embedded in scene pictures are difficult to recognize, this thesis proposes a set of methods for detecting sensitive information in scene pictures.

To address the difficulty of text localization in scene images, an improved SSD text localization algorithm is developed. An RFB network is added to enlarge the receptive field, which improves the accuracy with which the network locates text in the image. In addition, the VGG backbone network is replaced with the lightweight MobileNet network structure. The resulting algorithm achieves an F-measure of 73.1% on the ICDAR 2017 dataset, which indicates good text localization performance.

A CRNN-based text recognition algorithm is applied to the text regions in scene images. First, feature vectors are extracted from the input image by multiple convolutional layers and max-pooling layers. Then, a recurrent layer module predicts the label probability distribution for each vector in the feature sequence, where each vector corresponds to a receptive field in the image. Finally, the transcription layer classifies the resulting sequence data. The recognition accuracy of the algorithm reaches 85.1% on the Chinese Street View dataset.

A two-level filter is designed to detect sensitive semantics. The first-level filter builds a dictionary tree (trie) to match sensitive words. If keyword matching alone is used to decide whether a sentence contains sensitive information, omissions and false detections often occur. The second level therefore constructs a vector space model in which the weights of feature keywords are assigned by the TF-IDF algorithm. Finally, an SVM classifier is trained as the sensitive semantic judgment model and used to judge the sensitive semantics of target sentences.
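The first-level filter described above can be illustrated with a minimal dictionary tree (trie) sketch. This is not the thesis's implementation: the class name `SensitiveWordTrie`, the method `find_all`, and the sample word list are hypothetical, and the thesis does not state how overlapping matches are handled (here every match is reported).

```python
class SensitiveWordTrie:
    """Hypothetical dictionary tree for keyword matching (illustrative only)."""

    _END = None  # sentinel key; cannot collide with a one-character string key

    def __init__(self, words):
        self.root = {}
        for word in words:
            node = self.root
            for ch in word:
                # Walk down the tree, creating child nodes as needed.
                node = node.setdefault(ch, {})
            node[self._END] = True  # mark the end of a complete word

    def find_all(self, text):
        """Return (start_index, matched_word) for every dictionary word in text."""
        hits = []
        for start in range(len(text)):
            node = self.root
            for end in range(start, len(text)):
                node = node.get(text[end])
                if node is None:
                    break  # no dictionary word continues with this character
                if self._END in node:
                    hits.append((start, text[start:end + 1]))
        return hits


# Assumed sample vocabulary, chosen only to show prefix sharing.
trie = SensitiveWordTrie(["bad", "badge", "gun"])
hits = trie.find_all("a badge and gun")
```

Because words with a common prefix share nodes, one pass from each start position checks all dictionary words at once, which is why a trie is a common choice for this kind of matching.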
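The TF-IDF weighting used in the second-level filter can be sketched as follows. The abstract does not give the exact formula, so this assumes one common variant: term frequency is the count divided by the sentence length, and inverse document frequency is log(N / df). The function name `tfidf_vectors` and the toy tokenized sentences are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized_docs):
    """Build TF-IDF vectors over a shared, sorted vocabulary.

    Assumed variant: tf = count / doc_length, idf = log(N / df).
    """
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    vocab = sorted(doc_freq)

    vectors = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            if total else 0.0
            for term in vocab
        ])
    return vocab, vectors


# Toy input: two already-tokenized sentences (assumed data).
vocab, vectors = tfidf_vectors([["a", "b"], ["a", "c"]])
```

A term that appears in every sentence receives weight zero, so only distinguishing keywords contribute. Vectors of this form could then serve as input features for an SVM classifier, as the abstract describes.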
Keywords/Search Tags: Deep learning, Text recognition and location, Sensitive semantic detection, Social network images