| Scene text plays an important role in daily life: the text on a trademark provides product information, and the text on a road sign indicates direction. Automatic detection and recognition of scene text can help people better understand their surroundings and make travel easier. Scene text detection, as the first step of a text digitization system, plays a key role in subsequent recognition and has long been an important research problem in computer vision. In recent years, with the rapid development of deep learning, deep convolutional neural networks have been applied successfully to text detection in natural scenes, greatly improving detection performance and broadening the range of applications. However, deep neural networks usually consume large amounts of storage and computing resources, which limits their use on mobile terminals, embedded devices, and other resource-limited devices. Accelerating and compressing scene text detection models, which reduces computational complexity and storage requirements, therefore has broad application prospects. To address these problems, this paper accelerates and compresses a scene text detection model at the level of structure, layers, and parameters, and implements scene text detection on a mobile terminal. The work consists of three parts: 1) A detector capable of detecting scene text of arbitrary shape is built, and its computation and storage costs are analyzed. Based on depthwise separable convolutions, we propose a lightweight design for each module of the detector, obtaining a lightweight scene text detection model suitable for mobile terminals. 2) Channel pruning and low-bit integer quantization are then applied to the lightweight model for further acceleration and compression. For channel pruning, we adopt two different pruning methods and compare their applicability. For quantization, we adopt linear symmetric quantization and propose a truncation method for the floating-point value range together with a quantization-layer fallback strategy, to balance the detection performance of the quantized model against the acceleration and compression requirements. 3) By combining a series of engineering optimizations with the acceleration and compression algorithms, we implement and test forward inference of the model on a mobile terminal. On the ICDAR 2015 dataset, the F-score reaches 78.71%, the detection time per image on the mobile terminal is 749.15 ms, and the model size is 517 KB, which meets the needs of mobile deployment. When the input image size is reduced, the detection time drops further to 426.41 ms. By packaging the model and code, we developed a mobile application for the Android platform and tested it in real-life scenarios. |
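
The linear symmetric quantization with range truncation described in point 2 can be illustrated with a minimal sketch. It assumes 8-bit integers, a single per-tensor scale, and a clipping threshold taken as a quantile of the absolute values. The abstract does not specify the actual truncation rule, so that choice and the function names `choose_threshold`, `quantize_symmetric`, and `dequantize` are hypothetical and used only for illustration.

```python
def choose_threshold(values, keep_ratio=0.999):
    """Pick a clipping threshold that covers roughly `keep_ratio` of |values|.

    Assumption: a quantile rule stands in for the paper's truncation method,
    which the abstract does not describe.
    """
    mags = sorted(abs(v) for v in values)
    if not mags:
        raise ValueError("values must not be empty")
    idx = min(len(mags) - 1, int(keep_ratio * len(mags)))
    return mags[idx]


def quantize_symmetric(values, threshold, num_bits=8):
    """Map floats in [-threshold, threshold] to signed integers symmetrically."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    qmax = 2 ** (num_bits - 1) - 1      # 127 for 8 bits
    scale = threshold / qmax            # one float step per integer level
    quantized = []
    for v in values:
        clipped = max(-threshold, min(threshold, v))   # truncate outliers
        level = int(round(clipped / scale))
        quantized.append(max(-qmax, min(qmax, level)))  # guard rounding drift
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from integer levels."""
    return [q * scale for q in quantized]
```

Truncating the range before computing the scale spends the integer levels on the bulk of the distribution instead of on a few outliers, at the cost of saturating those outliers. A layer whose accuracy degrades too much under this scheme could be kept in floating point, which is one plausible reading of the fallback strategy mentioned above.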