The Extraction And Representation Of Entity Information In Plane Geometry Images

Posted on: 2018-09-30 | Degree: Master | Type: Thesis
Country: China | Candidate: S C Lai | Full Text: PDF
GTID: 2348330518983399 | Subject: Computer application technology
Abstract/Summary:
With the rapid development of modern education technology and artificial intelligence, research on methods and techniques for machine solvers for primary subject problems has regained popularity. Since mathematics is a subject built on quantities and relationships, machine solving of mathematical problems is a good starting point for guiding the development of general machine solver techniques.

In this paper, we study the extraction and representation of geometric entity information in plane geometry images. To handle the difficulties caused by overlapping geometric entities and by dashed lines, we tested several detection algorithms that take the features of plane geometry figures into account, and we propose a number of post-optimization strategies and procedures to obtain robust results and reasonable detection precision. The useful geometric entity information can then be extracted. This information can not only be presented consistently to students, helping them understand the problem and work out its answer, but can also be integrated with information extracted from the problem text to form a more complete information set, which will contribute to the implementation of an automated machine solver.

This paper consists of two major parts. The first part covers the detection of geometric entities, including the image preprocessing procedures, the geometric entity detection procedures, and the post-optimization procedures.
Based on the experimental analysis in this paper, we adopt adaptive thresholding with a Gaussian kernel to binarize the image, and then separate the plane geometry figure region from the character regions using an 8-connected component labelling procedure. Within the plane geometry figure region, the RANSAC method is used to detect circle entities, and the pixels belonging to the detected circles are then removed. Next, the progressive probabilistic Hough transform is used to detect line segment entities. Finally, a number of post-optimization procedures are applied to obtain a more robust detection framework, including corrections for the labelling and dashed-line problems.

The second part covers the extraction and representation of geometric entity information, including the OCR procedures, the entity information extraction procedures, and the entity information representation procedures. First, we trained and applied a self-designed backpropagation (BP) neural network for the OCR process, and associated each text label with the point entity closest to the centroid of the label's region. Based on the coordinate system, we then proposed a method to extract the useful information of the geometric entities and represent it in three uniform forms: extended predicates, a general equation system, and natural language.

In conclusion, this paper proposes a general framework for the extraction and representation of geometric entity information. Extensive experiments were carried out on a self-collected plane geometry image dataset, and the results support the soundness and robustness of the framework.
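To make the circle-detection step more concrete, the following is a minimal sketch of RANSAC circle fitting on a set of foreground pixel coordinates, followed by removal of the circle's pixels before line detection. It is an illustration only, not the implementation used in this work: the abstract does not state the sampling scheme, the inlier tolerance, or the stopping rule, so the function names (`circle_from_3_points`, `ransac_circle`, `remove_inliers`) and the parameters `iterations`, `tol`, and `seed` are all assumptions.

```python
import math
import random


def circle_from_3_points(p1, p2, p3):
    """Return (cx, cy, r) of the circle through three points, or None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-9:  # the three points are (nearly) collinear
        return None
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


def ransac_circle(points, iterations=500, tol=1.5, seed=0):
    """Fit one circle by RANSAC; tol is the assumed inlier distance in pixels."""
    if len(points) < 3:
        return None, []
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = circle_from_3_points(*rng.sample(points, 3))
        if model is None:
            continue
        cx, cy, r = model
        # A point is an inlier if it lies within tol of the candidate circle.
        inliers = [p for p in points
                   if abs(math.hypot(p[0] - cx, p[1] - cy) - r) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


def remove_inliers(points, inliers):
    """Drop the circle's pixels so that later line detection ignores them."""
    inlier_set = set(inliers)
    return [p for p in points if p not in inlier_set]
```

In use, `points` would be the list of `(x, y)` foreground pixel coordinates of the figure region after binarization. The points returned by `remove_inliers` are what a subsequent line-segment detector, such as the progressive probabilistic Hough transform, would operate on. Detecting several circles would require repeating the fit on the remaining points, with a minimum inlier count (not shown here) to decide when to stop.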
Keywords/Search Tags:Machine Solver, Plane Geometry Figure, Information Extraction, Uniform Representation, Hough Transform, RANSAC, OCR