Research On Geometric Solution Method Based On Cross-modal Learning

Posted on: 2024-09-13
Degree: Master
Type: Thesis
Country: China
Candidate: F C Guo
GTID: 2568307127968429
Subject: Computer Science and Technology
Abstract/Summary:
Machine solving of geometric problems (geometric solving) is of great significance for building intelligent education systems and assessing students' learning levels. Geometric problems are commonly presented in two forms. The first contains only a single-modal text description, abbreviated as a geometric text problem; the second contains a multimodal description combining text and an image, abbreviated as a geometric graphic problem. Previous research mainly used text feature encoding to understand single-modal problems, while multimodal problem understanding mainly relied on a dual-tower model, which uses two encoders to encode the image and the text separately and then fuses the encoder outputs. In the solution stage, both single-modal and multimodal approaches mainly use sequence modeling to generate solution steps. However, using different models for single-modal and multimodal problems limits the generality of the solver, even though the model frameworks for understanding the two kinds of problems are similar in design.

To address this, this thesis proposes a cross-modal feature learning framework that encodes the features of single-modal and multimodal problems in a single model, further improving the model's feature representation ability. Because single-modal and multimodal problems differ significantly in how they express information, the thesis also proposes a cross-modal contrastive learning model to strengthen the semantic associations between cross-modal geometric data. In addition, interpretable solutions to geometric problems can guide students' thinking, yet this important issue in intelligent education has not received sufficient attention in current research. Therefore, this thesis proposes a geometric solution method based on cross-modal learning, which achieves cross-modal geometric problem solving through the cross-modal feature learning framework and the cross-modal contrastive learning model. It also studies the interpretability of geometric problem solving and constructs an interpretable geometric problem-solving framework based on graph convolutional networks.

The main work and contributions of this thesis are as follows:

(1) A geometric solution framework based on cross-modal learning is proposed, which solves problems in an end-to-end manner. The framework takes single-modal or multimodal geometric problems as input and outputs a readable problem-solving process from which the computational result can be derived. A single framework adaptively solves both single-modal and multimodal geometric problems without modifying the model structure, and it generalizes well.

(2) A shared feature learning model for cross-modal data is constructed to extract the text (and image) features of geometric problems. The cross-modal feature learning architecture adopts a network similar to a self-attention mask, which converts the extracted cross-modal features into a unified feature representation and thereby resolves the modal heterogeneity of cross-modal geometric problems. A contrastive learning model for cross-modal data is proposed to strengthen the semantic correlation between cross-modal features and map them into a unified semantic space, so that the model adapts effectively to both single-modal and multimodal geometric problem-solving tasks.

(3) Starting from the interpretable solution of geometric problems, a geometric problem-solving framework based on graph convolutional neural networks is proposed. This framework extracts the relation sets in geometric problems through a rule-based text parser and a geometric image parser, and constructs a relation graph from those relation sets. A graph convolutional neural network then encodes the graph, preserving its structural information, and the encoding is used for geometric theorem prediction. Finally, the interpretable solution of the geometric problem is completed through theorem reasoning. The geometric image parser is built on an improved RetinaNet method, and two auxiliary tasks are proposed to fine-tune DenseNet for geometric image feature extraction.

In summary, centered on the task of cross-modal geometric problem solving, this thesis proposes a geometric problem-solving framework based on cross-modal learning and a geometric problem-solving framework based on graph convolutional neural networks. Experiments and analysis on single-modal and multimodal geometric problem datasets verify the effectiveness of the proposed methods in terms of solving accuracy, geometric symbol recognition accuracy, and interpretable solutions.
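
The abstract does not specify the contrastive objective or the graph convolution rule, so the following is only an illustrative sketch under stated assumptions, not the thesis's implementation. It assumes a symmetric InfoNCE-style loss over paired text-only and text-plus-image embeddings, and a graph convolution step with self-loops and symmetric degree normalization over the relation graph. All function names, the temperature value, and the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, candidates, temperature=0.1):
    """Mean cross-entropy of matching anchors[i] to candidates[i] among all candidates."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, c) / temperature for c in candidates]
        peak = max(logits)  # subtract the maximum for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)

def cross_modal_contrastive_loss(text_feats, multimodal_feats, temperature=0.1):
    """Symmetric InfoNCE: row i of each list is assumed to describe the same problem."""
    return 0.5 * (info_nce(text_feats, multimodal_feats, temperature)
                  + info_nce(multimodal_feats, text_feats, temperature))

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gcn_layer(adjacency, features, weights):
    """One graph convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adjacency)
    # Add self-loops so every node keeps part of its own feature vector.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    normalized = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
                  for i in range(n)]
    projected = matmul(matmul(normalized, features), weights)
    return [[max(0.0, x) for x in row] for row in projected]

# Toy data (hypothetical): two problems embedded once per modality.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
multimodal_feats = [[0.9, 0.1], [0.1, 0.9]]
aligned_loss = cross_modal_contrastive_loss(text_feats, multimodal_feats)
shuffled_loss = cross_modal_contrastive_loss(text_feats, multimodal_feats[::-1])

# Toy relation graph (hypothetical): two geometric entities joined by one relation.
node_states = gcn_layer([[0.0, 1.0], [1.0, 0.0]], [[1.0], [3.0]], [[1.0]])
```

In this sketch the loss is lower when matching rows are paired than when the pairing is shuffled, which is the behavior a shared semantic space is meant to encourage. The graph convolution step mixes each node's features with those of its neighbors, which is one way the structural information of a relation graph could be preserved before theorem prediction.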
Keywords/Search Tags: Geometry problems, Cross-modal feature learning, Cross-modal contrastive learning, Interpretable solving