Domain Adaptive Algorithm For Document Object Detection

Posted on: 2024-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: J L Xiang
GTID: 2568307079459754
Subject: Computer Science and Technology
Abstract/Summary:
Documents are a primary carrier of information and the main form in which key information is presented across industries and fields. Document object detection is therefore an important step in digitizing document images. Because document images differ substantially from natural images, advances in object detection for natural scenes cannot be applied to them directly. Conventional document object detectors suffer severe performance degradation on document images from different domains, and they face further limitations, including small appearance differences between categories such as title and text, and an imbalanced number of objects across categories. To address these problems, this thesis proposes the "CDDOD-DA" framework, which introduces domain adaptation into the document object detection task. The framework alleviates domain differences through multi-level mixed strong and weak alignment, and introduces a texture difference regularization module and a categorical consistency regularization module to improve the discrimination of hard samples and strengthen the matching of key regions and important instances. A series of ablation experiments demonstrates the effectiveness of the proposed domain adaptation algorithm for document object detection. The main research contents of this thesis are as follows:

1. Building on the classic Faster R-CNN object detection framework, and targeting scenarios where domain differences exist between documents of different languages and layout styles, this thesis trains domain classifiers for adversarial learning and reduces the differences between the source domain and the target domain through multi-level mixed strong and weak alignment. This scheme effectively learns knowledge of document objects in the source domain and transfers it to document data in the target domain, significantly improving the generalization performance and robustness of the original document object detection model.

2. Because different categories in document images share similar texture, color and other features, and the differences between classes are small, this thesis constructs a bag generator based on Multiple Instance Learning. A "max-max" ranking loss is designed to obtain a reliable relative comparison between the most likely positive instance and the hardest negative instance, which helps separate difficult categories and improves the model's ability to discriminate positive from negative samples. At the same time, the model learns the differences between classes within a bag that carry different semantic information but have similar texture characteristics, reducing the likelihood of misclassification and further improving the accuracy of the original document object detection model.

3. Because the number of objects is imbalanced across document object categories and the learning information available for some important instances is limited, this thesis further improves the algorithm through cross-domain matching of key regions and important instances in document images, and proposes a categorical consistency regularization module that automatically finds instances in the target domain that are hard to align. Adding a detection head at the image level yields sparse but critical regions. These are combined with the instance-level predictions of the classifier to apply categorical consistency regularization, which further improves the cross-domain matching ability of the document object detection model.
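The "max-max" loss in the second contribution compares the top-scoring instance of a positive bag (the usual Multiple Instance Learning term for a group of instances) with the top-scoring, and therefore hardest, instance of a negative bag. The abstract does not give the exact formula, so the following is only a minimal sketch under assumptions: a hinge form with a margin, plain Python floats as instance scores, and a hypothetical function name `max_max_ranking_loss` that does not come from the thesis.

```python
from typing import Sequence


def max_max_ranking_loss(
    pos_bag: Sequence[float],
    neg_bag: Sequence[float],
    margin: float = 1.0,  # assumed hyperparameter, not stated in the abstract
) -> float:
    """Hinge-style ranking loss between the top-scoring instance of each bag."""
    if not pos_bag or not neg_bag:
        raise ValueError("both bags must contain at least one instance score")
    top_pos = max(pos_bag)  # most likely positive instance
    top_neg = max(neg_bag)  # hardest negative: the negative instance scored highest
    # The loss is zero once the best positive beats the hardest negative by `margin`.
    return max(0.0, margin - top_pos + top_neg)


# Bags whose top instances are already separated by more than the margin.
well_separated = max_max_ranking_loss([0.2, 0.9, 0.4], [0.1, 0.3], margin=0.5)

# Bags whose top instances are close, e.g. title-like versus text-like regions.
confusable = max_max_ranking_loss([0.2, 0.6, 0.4], [0.1, 0.5], margin=0.5)
```

In this sketch only the two extreme instances contribute to the loss, so the training signal concentrates on the pair that is most likely to be confused; how the thesis builds its bags and weights this term is not specified in the abstract.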
Keywords/Search Tags: Document Object Detection, Domain Adaptation, Multiple Instance Learning, Categorical Consistency Regularization