Semantic Embedding Representation Model Of Multimodal Test Questions For Test Question Duplication Detection

Posted on:2024-03-28

Degree:Master

Type:Thesis

Country:China

Candidate:J Xiong

GTID:2568307112476434

Subject:Computer Science and Technology

Abstract/Summary:

As an important means of selecting talent and measuring educational standards, national education examinations depend on their authority and fairness. The quality of the questions in a national examination determines not only whether the examination can effectively assess candidates’ ability, but also bears on social fairness. Ensuring that new test questions do not duplicate questions that have appeared in the past is therefore a prerequisite for question quality. However, with the development of education and the Internet, the number of test questions from schools at all levels, out-of-school training institutions and teaching materials has grown year by year. The existing stock of questions, together with the new ones that appear every year, makes the question pool in each subject so large that manual checking is too difficult and inefficient. Establishing an effective duplicate-checking system for test questions, so as to ensure the authority and fairness of national examinations, has thus become an urgent problem for China's education examination authorities.

This thesis examines methods for computing test question similarity in the setting of duplicate checking during question setting for national education examinations, and addresses three problems in the current task. First, some test questions are duplicates even though their textual representations differ greatly. Second, test questions differ substantially from ordinary text: they contain not only textual content but also multimodal information such as images. Third, the semantic embedding representation of test questions has to be combined with the task of duplicate checking for national examination questions. Accordingly, knowledge point information is incorporated into the semantic embedding representation model, so that duplicate questions with different surface forms, which existing methods have difficulty retrieving, can be retrieved. For the multimodal information contained in test questions, a multimodal semantic embedding representation model is proposed. Finally, the semantic embedding representation model is applied to the duplicate-checking task for national examination questions.

In summary, the main research contents and contributions of this thesis are:

1. A semantic embedding representation model of test questions fused with knowledge point information is proposed. The model uses a dual-encoder structure to extract the semantic information of the test questions, obtains the semantic embedding representation of each question by mean pooling, and measures the similarity of two questions by cosine similarity.

2. A multimodal semantic embedding representation model of test questions is proposed, which uses the multimodal data in the test questions to learn their semantic information. First, a convolutional neural network learns the feature representation of the image attached to the test question. Then the pre-trained language model BERT produces the semantic representations of the text and the knowledge points. Finally, a Transformer fuses the semantic information of the question text, the question image and the knowledge points to produce the final semantic representation of the test question.

3. Duplicate test questions in physics were collected, and annotation specifications were established by analyzing the relationship between similar test questions and knowledge points. A duplicate test question dataset was constructed through a combination of automatic annotation and manual correction, and is used to train the semantic embedding representation model of test questions.

Finally, the effectiveness of the semantic embedding representation models proposed in this thesis is verified by experiments on the manually annotated test question corpus.
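
The mean pooling and cosine similarity steps described in the abstract can be sketched as follows. This is a minimal illustration only, not the thesis's implementation: the function names `mean_pool` and `cosine_similarity`, the padding mask, and the toy two-dimensional token vectors are all assumptions, standing in for the outputs of a real encoder such as BERT.

```python
import math


def mean_pool(token_vectors, mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, mask) if keep]
    if not kept:
        raise ValueError("at least one unmasked token is required")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Hypothetical token vectors standing in for encoder outputs.
question_a = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]  # last row is padding
mask_a = [1, 1, 0]
question_b = [[2.0, 0.0], [0.0, 2.0]]
mask_b = [1, 1]

emb_a = mean_pool(question_a, mask_a)  # one vector per question
emb_b = mean_pool(question_b, mask_b)
score = cosine_similarity(emb_a, emb_b)
print(score)
```

Because cosine similarity ignores vector length, the two toy questions score as near-identical even though their token vectors differ in scale. In a duplicate-checking setting, a score above some threshold (which would have to be tuned on annotated data) would flag a candidate pair for review.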

Keywords/Search Tags:

Test Questions Duplication Detection, Test Questions Semantic Similarity, Semantic Embedding Representation of Test Questions, Multimodal, Pre-trained Language Models BERT

Related items

1	Building Of Intelligent Item Bank System
2	Research On The Test-questions Similarity Detection And Classification Based On RNN
3	The Basic Research And Production On Web-based Examination-questions Bank For Sports Academic Test
4	Intelligent Indexing System Of Mathematics Test Questions
5	Intelligent Test Paper Generation System On Objective Questions Of C Programming Language
6	Research On Intellectual Evaluation Method Of Chinese Subjective Questions
7	Research On Test Question Classification And Similar Test Questions Detection Based On Domain Knowledge
8	Research And Implementation Of Talent Evaluation System With Questions Of Complicated Features
9	Research On Questions Similarity Computing In Q&A System
10	The Research And Implementation Of Test Questions Quality Control Basing On The Standardization And Word Errors Checking