Analysis And Researches On Structured Relations For Image Recognition And Graphical Layout

Posted on: 2022-07-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y T Qiang
Full Text: PDF
GTID: 1488306725971089
Subject: Computer Science and Technology
Abstract/Summary:
Thanks to advances in methodologies and models, growth in computing power, and the continuous improvement of standard open data sets, machine learning, and computer vision in particular, has become one of the most important and most closely watched research fields in computer science. How to make computers match or even surpass human intelligence on related problems has aroused widespread research interest. Human beings are good at finding connections between different subjects in order to understand, analyze and transform the real world. This thesis focuses on representing and exploiting structured relationships between research objects to improve machine learning models for image recognition and academic poster layout.

We first studied how to use the comparison relationship between images for few-shot image classification. Comparing two images to determine whether they fall into the same category can help improve the accuracy of an image classification model, especially when there are too few training samples to train a reliable model. Metric learning is based on this idea and is one of the most important methods for few-shot image classification. This thesis studies a metric learning strategy based on hierarchical comparison. Its main purpose is to compare image feature pairs at different levels of a convolutional neural network to make the deep learning model more reliable. To further improve the effectiveness of our model, we also proposed adding random Gaussian noise to the feature layers to alleviate overfitting and enhance the robustness of the model. Extensive experimental results verify the effectiveness of the proposed model. The analysis and experiments also verify that the results of metric learning at different levels of the hierarchical comparison are heterogeneous and can complement one another to improve classification accuracy.

Our research shows that comparison between different images is useful for few-shot image classification. Based on this, we further studied the recognition of visual relationships in an image. We use a triplet <object1-relation-object2> to represent a ternary visual relationship, which describes the objects in the image and the relationship between the two objects. Considering the triplet form of a visual relationship and the correlation between different visual relationships, we proposed using a third-order tensor to represent the label space of visual relationships. In addition, unlike many other image classification networks, which use a fully connected layer to classify the extracted image features, we proposed a tensor composition layer for visual relationship classification. The tensor representation strategy and the tensor composition layer not only greatly reduce the number of model parameters but also capture the relationship between different visual relationships, thereby improving visual relationship prediction in general. To verify the effectiveness of the proposed model, we conducted experiments testing our Tensor Composition Net (TCN) on both visual relationship prediction and relation-based image retrieval. Our model is compared with existing multi-label image classification methods and visual relationship detection methods. Experimental results show that our method is more accurate in identifying visual relationships.

This thesis further studies how to use structured relationships to assist graphical design. More specifically, we use a tree structure to model the relationship between different graphical elements and use it to address scientific poster layout generation. Unlike much existing research, which aims to generate images at the pixel level, this thesis studies poster layout generation, which is concerned more with the structured arrangement of layout elements. For this task, we first manually collected 85 pairs of scientific papers and posters and labeled their layout information, and then trained a Bayesian model on the labeled scientific poster layout data. Scientific posters are usually composed of several panels. In this thesis, we proposed using a binary tree to model and represent the structural relationship between panels, and designed a recursive scientific poster partitioning algorithm based on this tree representation. We compare our method with other classical machine learning methods and with human-designed posters, and both quantitative and qualitative results demonstrate the effectiveness of our method.

In summary, this thesis studied structural relationships between different research factors in practical machine learning problems. Specifically, we explored binary, ternary and tree-structured relationship representations, together with machine learning algorithms, for image recognition and scientific poster generation.
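
The binary-tree panel representation and recursive partitioning described in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: the `Node` class, the `partition` function, the `"h"`/`"v"` split codes and the per-panel `weight` values are all hypothetical, and the weights stand in for whatever panel attributes the trained Bayesian model would infer. The sketch only shows how a binary tree of panels can be turned recursively into rectangles on a poster.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Node:
    """Hypothetical panel-tree node: a leaf is a panel, an inner node is a split."""
    name: str = ""               # panel name (leaves only)
    weight: float = 0.0          # assumed relative area of a leaf panel
    split: str = ""              # "h" = side by side, "v" = stacked (inner nodes)
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None or self.right is None

    def total(self) -> float:
        """Sum of leaf weights in this subtree."""
        if self.is_leaf():
            return self.weight
        return self.left.total() + self.right.total()


def partition(node: Node, rect: Rect,
              out: Optional[Dict[str, Rect]] = None) -> Dict[str, Rect]:
    """Recursively divide rect among the panels of the tree rooted at node."""
    if out is None:
        out = {}
    x, y, w, h = rect
    if node.is_leaf():
        out[node.name] = rect
        return out
    # Share of the rectangle given to the left subtree, by total weight.
    ratio = node.left.total() / node.total()
    if node.split == "h":
        partition(node.left, (x, y, w * ratio, h), out)
        partition(node.right, (x + w * ratio, y, w * (1 - ratio), h), out)
    else:
        partition(node.left, (x, y, w, h * ratio), out)
        partition(node.right, (x, y + h * ratio, w, h * (1 - ratio)), out)
    return out


# Illustrative three-panel poster on a 100 x 60 canvas (made-up weights).
tree = Node(
    split="h",
    left=Node(name="Introduction", weight=2.0),
    right=Node(
        split="v",
        left=Node(name="Method", weight=1.0),
        right=Node(name="Results", weight=1.0),
    ),
)
layout = partition(tree, (0.0, 0.0, 100.0, 60.0))
for panel, box in layout.items():
    print(panel, box)
```

Each inner node only decides how to divide its own rectangle between two subtrees, so the panels tile the poster without gaps or overlaps regardless of tree depth.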
Keywords/Search Tags:Image Classification, Layout Generation, Few-shot Learning, Visual Relationship, Tensor Decomposition