
Tagging Vectorized POS For Math Word Problem Based On BERT And NLPIR

Posted on: 2021-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xiang
Full Text: PDF
GTID: 2427330605458606
Subject: Computer application technology
Abstract/Summary:
In recent years, driven by the demand for intelligent education and by advances in machine learning, multimedia technology, and natural language processing, research interest in machine solvers has been increasing, and machine solving of primary and middle school math problems has become a research hotspot. Machine solving of mathematical word problems is generally divided into two steps: problem understanding and problem solving, with problem understanding serving as the basis for problem solving. As the core of problem understanding, part-of-speech tagging plays a key and fundamental role in the study of mathematical word problem solving. Traditional part-of-speech tagging for mathematical word problems uses grammatical marks, which cannot be computed on directly and which make it difficult to introduce deep learning into machine solver technology. These limitations restrict the development of machine solvers. To overcome them, the part of speech for mathematical word problems needs to be vectorized. This thesis is devoted to solving these problems and contributes to the study of part-of-speech tagging methods for mathematical word problems. The main contents are as follows:

1. After analyzing a variety of word vectorization methods, the BERT model is adopted as the vectorization method for mathematical word problems. After comparing various word segmentation systems, the NLPIR system is chosen as the word segmentation tool for mathematical word problem text. By combining BERT and NLPIR, a vectorized part-of-speech tagging method for mathematical word problems is proposed. Unlike traditional part-of-speech tagging methods for mathematical word problems, this method produces vectorized part-of-speech tags, which can be used directly in computations. It therefore makes it convenient to introduce deep learning technology into machine solver research.

2. The effectiveness of the vectorized part-of-speech tagging proposed in this thesis is verified by experiments. The experiments are designed on four kinds of words (noun, numeral, quantifier, and time word) in the semantic model pool of mathematical word problems. Five statistical indexes (expectation, median, maximum, minimum, and variance) of Euclidean distance and cosine similarity are used to measure vector similarity and thereby evaluate the vectorized part-of-speech tagging. In addition, a linear neural network with only one hidden layer is designed for classification; its classification accuracy reaches 96.71%, which verifies that the vectors clearly differentiate parts of speech.
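The similarity evaluation described in the second contribution can be illustrated with a minimal sketch. It computes the five statistical indexes (expectation, median, maximum, minimum, variance) over pairwise Euclidean distances and cosine similarities for a group of vectors. Everything specific here is an assumption rather than the thesis's actual setup: the three-dimensional toy vectors stand in for BERT embeddings, the function names are hypothetical, the pairing of vectors within one part-of-speech group is one possible reading of the experiment, and population variance is used because the abstract does not say which variance was computed.

```python
import math
import statistics
from itertools import combinations


def euclidean_distance(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def summarize(values):
    """The five statistical indexes named in the abstract.

    Assumption: population variance (pvariance); the abstract does not
    specify population versus sample variance.
    """
    return {
        "expectation": statistics.fmean(values),
        "median": statistics.median(values),
        "maximum": max(values),
        "minimum": min(values),
        "variance": statistics.pvariance(values),
    }


def pairwise_summary(vectors, measure):
    """Apply `measure` to every unordered pair of vectors and summarize."""
    values = [measure(u, v) for u, v in combinations(vectors, 2)]
    return summarize(values)


# Toy stand-ins for the vectors of words sharing one part of speech.
# Real BERT vectors would have hundreds of dimensions.
noun_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]

if __name__ == "__main__":
    print("Euclidean:", pairwise_summary(noun_vectors, euclidean_distance))
    print("Cosine:   ", pairwise_summary(noun_vectors, cosine_similarity))
```

Under this reading, a small spread of distances within one part-of-speech group, compared with the spread across groups, would suggest that the vectors separate parts of speech, which is the property the classification experiment then tests directly.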
Keywords/Search Tags:Mathematical word problems, Machine solver, Part of speech tagging, Natural language processing