Analysis And Researches On Structured Relations For Image Recognition And Graphical Layout

Posted on:2022-07-17

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y T Qiang

GTID:1488306725971089

Subject:Computer Science and Technology

Abstract/Summary:

Thanks to advances in methodologies and models, growth in computing power, and the continuous improvement of standard open datasets, machine learning, and computer vision in particular, has become one of the most important and most closely watched research fields in computer science. How to make computers match or even surpass human intelligence on related problems has attracted widespread research interest. Human beings are good at finding connections between different subjects in order to understand, analyze, and transform the real world. This thesis focuses on representing and exploiting structured relationships between research objects to improve machine learning models for image recognition and academic poster layout.

We first studied how to use the comparison relationship between images for few-shot image classification. Comparing two images to determine whether they fall into the same category can help improve the accuracy of an image classification model, especially when there are too few training samples to train a reliable model. Metric learning is based on this idea and is one of the most important methods for few-shot image classification. This thesis studies a metric learning strategy based on hierarchical comparison, whose main purpose is to compare image feature pairs at different levels of a convolutional neural network so that the deep learning model becomes more reliable. To further improve the effectiveness of our model, we also proposed adding random Gaussian noise to the feature layers to alleviate overfitting and enhance the robustness of the model. Extensive experimental results verify the effectiveness of the proposed model, and our analysis and experiments also verify that the results of metric learning based on hierarchical comparison are heterogeneous and can complement one another to improve classification accuracy.

Our research shows that comparison between different images is useful for few-shot image classification. Building on this, we further studied the recognition of visual relationships in an image. We use a triplet <object1-relation-object2> to represent a ternary visual relationship, which describes the objects in the image and the relationship between the two objects. Considering the triplet form of a visual relationship and the correlation between different visual relationships, we proposed using a third-order tensor to represent the label space of visual relationships. In addition, unlike many other image classification networks, which use a fully connected layer to classify the extracted image features, we proposed a tensor composition layer for visual relationship classification. The tensor representation strategy and the tensor composition layer not only greatly reduce the number of model parameters but also capture the correlation between different visual relationships, thereby improving visual relationship prediction in general. To verify the effectiveness of the proposed model, we conducted experiments that test our Tensor Composition Net (TCN) on both visual relationship prediction and relation-based image retrieval. Our model is compared with existing multi-label image classification methods and visual relationship detection methods. Experimental results show that our method is more accurate in identifying visual relationships.

This thesis further studies how to use structured relationships to assist graphical design. More specifically, we use a tree structure to model the relationships between different graphical elements and to solve the problem of scientific poster layout generation. Unlike much existing work, which aims to generate images at the pixel level, this thesis studies poster layout generation, which is concerned more with the structured arrangement of layout elements. For this task, we first manually collected 85 pairs of scientific papers and posters and labeled their layout information, and then trained a Bayesian model on the labeled scientific poster layout data. Scientific posters are usually composed of several panels. In this thesis, we proposed using a binary tree to model and represent the structural relationships among panels, and designed a recursive scientific poster partitioning algorithm based on this tree representation. We compare our method with other classical machine learning methods and with human-designed posters, and both quantitative and qualitative results demonstrate the effectiveness of our method.

In summary, this thesis studied the structural relationships among different research factors in practical machine learning problems. Specifically, we explored binary, ternary, and tree-structured relationship representations, together with machine learning algorithms, for image recognition and scientific poster generation.
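As a rough illustration of the third-order label tensor mentioned in the abstract, the sketch below encodes <object1-relation-object2> triplets as a binary tensor and compares parameter counts for a dense classifier and a factored one. It is not the thesis's Tensor Composition Net: the rank-1 composition, the function names, and the sizes (512 features, 100 object classes, 70 relations) are assumptions chosen only to show why a factored output can need far fewer parameters than one fully connected output per triplet.

```python
def encode_relationships(triplets, objects, relations):
    """Build a binary label tensor indexed as [subject][relation][object]."""
    obj_index = {name: i for i, name in enumerate(objects)}
    rel_index = {name: k for k, name in enumerate(relations)}
    n, m = len(objects), len(relations)
    tensor = [[[0] * n for _ in range(m)] for _ in range(n)]
    for subj, rel, obj in triplets:
        tensor[obj_index[subj]][rel_index[rel]][obj_index[obj]] = 1
    return tensor


def compose_scores(subject_scores, relation_scores, object_scores):
    """Assumed rank-1 composition: score[i][k][j] = s[i] * r[k] * o[j]."""
    return [
        [[s * r * o for o in object_scores] for r in relation_scores]
        for s in subject_scores
    ]


def parameter_counts(feature_dim, num_objects, num_relations):
    """Weights (biases ignored) for a dense head vs. three factor heads."""
    dense = feature_dim * num_objects * num_relations * num_objects
    factored = feature_dim * (2 * num_objects + num_relations)
    return dense, factored


# Hypothetical vocabulary and annotations for one image.
objects = ["person", "horse", "hat"]
relations = ["ride", "wear"]
labels = encode_relationships(
    [("person", "ride", "horse"), ("person", "wear", "hat")],
    objects,
    relations,
)
print(labels[0][0][1], labels[0][1][2])
print(parameter_counts(512, 100, 70))
```

Under these assumed sizes, the dense head needs one weight vector per possible triplet, while the factored heads need one per object class (twice) and one per relation, which is the kind of reduction the abstract attributes to the tensor representation.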
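The binary-tree panel representation and recursive partitioning described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis's algorithm: the `Panel` and `Split` classes, the `weight` attribute, and the rule of dividing a rectangle in proportion to subtree weight are all hypothetical, and the split directions and weights are given by hand here rather than inferred by the Bayesian model the abstract mentions.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

Rect = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Panel:
    """Leaf node: one poster panel with a hypothetical relative area."""
    name: str
    weight: float


@dataclass
class Split:
    """Internal node: stacked=True puts `first` above `second`,
    stacked=False puts `first` to the left of `second`."""
    stacked: bool
    first: "Node"
    second: "Node"


Node = Union[Panel, Split]


def total_weight(node: Node) -> float:
    """Sum of the panel weights in a subtree."""
    if isinstance(node, Panel):
        return node.weight
    return total_weight(node.first) + total_weight(node.second)


def partition(node: Node, rect: Rect) -> Dict[str, Rect]:
    """Recursively divide `rect` among the panels of the tree."""
    x, y, w, h = rect
    if isinstance(node, Panel):
        return {node.name: rect}
    # Share of the rectangle given to the first child (assumed rule).
    share = total_weight(node.first) / total_weight(node)
    if node.stacked:
        first_rect = (x, y, w, h * share)
        second_rect = (x, y + h * share, w, h * (1 - share))
    else:
        first_rect = (x, y, w * share, h)
        second_rect = (x + w * share, y, w * (1 - share), h)
    layout = partition(node.first, first_rect)
    layout.update(partition(node.second, second_rect))
    return layout


# Hypothetical three-panel poster on a 120 x 90 canvas.
tree = Split(
    stacked=False,
    first=Panel("Introduction", 4),
    second=Split(stacked=True, first=Panel("Method", 3), second=Panel("Results", 1)),
)
for name, panel_rect in partition(tree, (0.0, 0.0, 120.0, 90.0)).items():
    print(name, panel_rect)
```

Each internal node makes exactly one cut, so a tree with n leaves always yields n non-overlapping rectangles that tile the canvas, which is one reason a binary tree is a convenient representation for panel layouts.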

Keywords/Search Tags:

Image Classification, Layout Generation, Few-shot Learning, Visual Relationship, Tensor Decomposition

Related items

1	Visual Saliency Detection Via Tensor Decomposition
2	Low-rank Tensor Representation Learning Methods And Applications
3	Research On Methods And Theory Of Tensor Learning For Complex High-dimensional Data
4	Fine-grained Image Classification In Zero-shot Learning
5	Research On Few-shot Image Classification Algorithm Based On Deep Discriminative Feature Learning
6	Tensor Representation And Semantic Modeling For Image Annotation
7	Study On Few-shot Learning Based On Deep Learning For Image Classification
8	Few-shot Image Classification Algorithms Research And Applications
9	Research On Brain Image Recognition Algorithm Based On Tensor Decomposition
10	Research On Several Issues Of Image Generation And Recognition Based On Deep Learning