Graph matching aims to establish the node mapping relationship between two graphs and is a classic NP-hard combinatorial problem.In recent years,many researchers have begun to use deep learning techniques to build end-to-end trainable architectures to solve graph matching problems.This end-to-end pipeline architecture is called a deep graph matching model,which consists of two important modules,a graph representation learning module and a graph matching problem solver.However,in the current research on deep graph matching models,there are two important issues to be solved.First,mainstream deep graph matching models usually perform continuous relaxation on the graph matching problem,so that the model can optimize trainable parameters based on gradient descent.However,there is currently no research on whether the learning-based solution technology should continuously relax the graph matching problem,and what is the relationship between the solution quality of the graph matching problem and the matching accuracy of the model;The limited learning ability of the nodes leads to the poor distinguishability of the embedding representation of the nodes,which makes the node affinity measurement inaccurate,and ultimately leads to the loss of the matching accuracy of the model.The research content of this paper is carried out around the two important modules of the deep graph matching model,and aiming at the above two key issues,combined with the integer linear programming and graph representation learning technology,the deep graph matching model based on differentiable integer programming is studied.Meanwhile,this paper quantitatively evaluates the effectiveness of the research work in this paper with the key point matching task as the application background.The specific research contents and main contributions are as follows:(1)To analyze and study the relaxed graph matching method in the deep graph matching model,this paper proposes two important conjectures.First,the output of the deep graph matching model is equivalent to the optimal solution of the graph matching problem.Second,the continuous relaxation of graph matching will lose the solution accuracy of the graph matching problem,and this loss is difficult to compensate by the graph embedding technique.The above two conjectures are verified by three experiments,and the results show that the solution quality of the graph matching problem is positively correlated with the matching accuracy of the model.Moreover,since the graph embedding technique is applied to learn the higher-order embedding representation of nodes,which reduces the difficulty of node similarity measurement,it is feasible to appropriately relax the constraints of graph matching.(2)On the basis of research(1),this paper presents the MILP formulation for the exact solution of graph matching problem,and designs a differentiable integer linear programming solver(DIP)to solve it.The solver computes the MILP formulation through a quality-aware solving algorithm,and then constructs meaningful gradient information with the help of implicit interpolation techniques,enabling it to be embedded as a black-box into the trainable network architecture.In order to test the effectiveness of DIP and verify the two assumptions proposed in(1),we equip the existing model with the solver,and call this modified model DIP-GM.The experimental results show that DIP is scalable,can be used as a black-box in architecture,and improves the matching accuracy of the deep graph matching model.(3)After the research of(1)and(2),it is known that the matching accuracy of the deep graph matching model depends greatly on the feature learning ability of the graph representation learning module.Therefore,starting from improving the learning ability of the graph representation learning module,this paper designs a graph embedding network GSAN based on the self-attention mechanism.The network consists of a spatial encoder and a multi-head attention mechanism.First,the network uses a spatial encoder to embed structural information into the representation vectors of nodes,and then uses a multi-head attention mechanism to learn the associations between all nodes to refine the node features.At the same time,in order to add more global information,we also introduce node betweenness as a supplementary feature.(4)Integrating the above three research works,this paper finally designs an end-to-end trainable architecture based on a differentiable integer linear programming solver,called GSAN-GM.GSAN-GM first uses GSAN for graph embedding to obtain higher-order embedding representations of nodes.Then,the edge representation vector is constructed according to the obtained node representation vector,and node affinity measurement and edge affinity measurement are performed.Finally,solve the graph matching problem with the solver and return the obtained solution as the result of the model.This paper trains and validates the effectiveness of GSAN-GM on two public image datasets,and the results show that it not only generalizes across classes,but also handles input graphs of different sizes.Meanwhile,the average matching accuracy of GSAN-GM exceeds most deep graph matching models and outperforms the state-of-the-art matching accuracy on multi-class tasks,such as bicycle,bus,table,etc. |