| In recent years, with the continued development of deep learning, a growing number of techniques have been applied to intelligent transportation. As an essential mark for identifying abnormal changes to the frame or engine, the VIN (Vehicle Identification Number) plays a vital role in vehicle inspection. Single-character detection and recognition of the VIN can effectively assist the vehicle management office in inspecting vehicles. Meanwhile, deep learning has achieved significant results in scene text detection and recognition and in weakly supervised learning. Using existing technology to assist VIN inspection, improve inspection efficiency, reduce the inspection cost of the vehicle management office, and build a complete vehicle inspection system is therefore of great practical significance. Because most existing text detection and recognition methods treat the text line as the primary processing unit, there is little research on end-to-end detection and recognition of single characters. To this end, this paper studies single-character detection and recognition for the VIN. Its main contributions are as follows: 1. A feature expression method for the VIN in real scenes is designed, together with a cascaded multi-task feature extraction structure for these representations. The feature expression method encodes the text position, reading order, single-character positions, and semantic information of the VIN. The cascaded feature extraction structure extracts the different features, reducing the task difficulty of each branch. Experimental results show that the proposed feature expression method and cascaded multi-task structure can effectively encode and extract VIN information. 2. A weakly supervised learning framework that does not need character-level annotation is proposed to achieve low-cost training for the
single-character detection branch and the semantic recognition branch. Single-character detection and recognition require character-level annotation, which is very costly for neural networks that rely on large amounts of data. Based on the characteristics of the VIN, the paper proposes a weakly supervised learning method that requires only text-line annotation boxes and string annotations. By comparing the sequence information of the predicted string and the label string, the method iteratively estimates character-level pseudo-labels. The loss function introduces a confidence map so that the character position representation and the semantic recognition representation can be trained with incomplete pseudo-labels. Experiments show that, compared with complete character-level annotation, the proposed weakly supervised learning method reduces the labeling cost by a factor of 17 to 19. |
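
The pseudo-label idea in the second contribution can be illustrated with a small sketch. It is not the paper's implementation: it shows a single pass of one plausible estimation step, assuming that the predicted string is aligned to the label string with Python's `difflib.SequenceMatcher`, that matched characters receive confidence 1.0, that same-length substitutions take their class from the label with a reduced confidence of 0.5, and that unmatched characters receive confidence 0.0 so they are ignored by the loss. The function names `align_pseudo_labels` and `masked_loss` and the specific confidence values are hypothetical.

```python
from difflib import SequenceMatcher


def align_pseudo_labels(predicted, label):
    """Estimate per-character pseudo-labels and confidences for one prediction.

    Returns (pseudo, conf): one class and one confidence per predicted character.
    The confidence values 1.0 / 0.5 / 0.0 are illustrative assumptions.
    """
    pseudo = list(predicted)
    conf = [0.0] * len(predicted)
    matcher = SequenceMatcher(None, predicted, label, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Prediction agrees with the label string: trust position and class.
            for i in range(i1, i2):
                conf[i] = 1.0
        elif tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same-length substitution: keep the position, take the class
            # from the label string, and trust it less (assumption).
            for i, j in zip(range(i1, i2), range(j1, j2)):
                pseudo[i] = label[j]
                conf[i] = 0.5
        # Extra or missing characters keep confidence 0.0 and are masked out.
    return pseudo, conf


def masked_loss(per_char_losses, confidence):
    """Confidence-weighted mean of per-character losses (0.0 if nothing is trusted)."""
    weight = sum(confidence)
    if weight == 0:
        return 0.0
    return sum(c * l for c, l in zip(confidence, per_char_losses)) / weight


if __name__ == "__main__":
    pseudo, conf = align_pseudo_labels("LSV8A2", "LSVBA2")
    print("".join(pseudo), conf)
```

In an iterative scheme of the kind the abstract describes, such a step would be repeated as the network's predictions improve, so that more characters fall into the trusted set over time. The confidence-weighted loss plays the role of the confidence map: characters without a reliable pseudo-label contribute nothing to training.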