Research And Design Of The Question Answering System Of The Dean Mailbox

Posted on: 2019-10-23 | Degree: Master | Type: Thesis
Country: China | Candidate: J Xie
GTID: 2428330545988411 | Subject: Engineering
Abstract/Summary:
With the advent of the information age and the rapid development of Internet technology, question answering systems have come into wide use. A question answering system mainly applies Natural Language Processing to analyze the questions raised by users and uses information retrieval technology to obtain the best answer. As informatization in universities advances, question answering systems are also appearing more and more often in educational administration. By analyzing historical data, a Dean mailbox question answering system recommends similar questions and their answers to students, which gives them fast and accurate answers and also reduces the workload of educational management.
Across the whole question answering process, question understanding directly affects the accuracy of the system, and a growing number of experts and scholars are paying attention to extracting information from questions. Question classification and question similarity calculation are effective ways to obtain this information, but because the questions in the Dean mailbox are highly sparse, traditional text classification and similarity algorithms do not achieve good results. This thesis therefore focuses on sentence similarity calculation and question classification.
First, to address the poor performance of traditional question classification algorithms on sparse data, a question classification method that fuses word vectors with the BTM model is proposed. The word vector training tool Word2vec is used to obtain word vectors for the question corpus and the answer corpus, and a BTM topic model is built to obtain the topic vector of each text. The word vector and the topic vector are concatenated to give the final vector representation of the text, and an SVM (support vector machine) is then used for text classification. Tests on six categories of real data, including questions from the Dean mailbox of the Computer College of Chongqing University of Technology, show that the method clearly improves classification accuracy.
Second, for sentence similarity, Chinese word segmentation loses information and the order of terms has a large effect on the meaning of a question. To address this, a question similarity algorithm based on common chunks and the N-gram model is used. It considers both the common words and the word order of the chunks, segments each question with both the unigram and the bigram model, and takes a weighted combination of the similarities from the two segmentations as the final similarity. The results show that the new algorithm is more accurate than the traditional algorithms.
Finally, the thesis completes the design and implementation of the Dean mailbox question answering system, with the system functions built on the algorithms studied. The system can answer students' questions accurately and recommend relevant questions and answers.
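The abstract gives no formulas for the question similarity algorithm, so the following Python sketch only illustrates the general idea: score two questions under a unigram and a bigram segmentation, combine common-token overlap with a word-order term, and take a weighted sum of the two results. Everything specific here is an assumption made for illustration and is not taken from the thesis: the character-level n-grams, the Dice coefficient for common tokens, the longest-common-subsequence ratio for word order, the function names, and the weights `order_weight` and `bigram_weight`.

```python
from collections import Counter


def ngrams(text, n):
    """Split a question into overlapping character n-grams, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def overlap_score(a, b):
    """Dice coefficient over token multisets (assumed stand-in for 'common words')."""
    if not a or not b:
        return 0.0
    common = sum((Counter(a) & Counter(b)).values())
    return 2.0 * common / (len(a) + len(b))


def order_score(a, b):
    """LCS length divided by the longer sequence (assumed stand-in for 'word order')."""
    if not a or not b:
        return 0.0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1] / max(len(a), len(b))


def question_similarity(q1, q2, order_weight=0.3, bigram_weight=0.6):
    """Weighted mix of unigram and bigram similarity; both weights are hypothetical."""
    per_model = []
    for n in (1, 2):  # unigram segmentation, then bigram segmentation
        a, b = ngrams(q1, n), ngrams(q2, n)
        per_model.append((1 - order_weight) * overlap_score(a, b)
                         + order_weight * order_score(a, b))
    unigram_sim, bigram_sim = per_model
    return (1 - bigram_weight) * unigram_sim + bigram_weight * bigram_sim


if __name__ == "__main__":
    query = "how to apply for a course"
    history = ["how do I apply for a course", "library opening hours"]
    # Rank stored questions by similarity to the new question.
    for q in sorted(history, key=lambda h: question_similarity(query, h), reverse=True):
        print(round(question_similarity(query, q), 3), q)
```

Character-level n-grams are used in this sketch because they need no word segmenter, which sidesteps the segmentation loss the abstract mentions. In practice the weights would have to be tuned on labelled question pairs.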
Keywords/Search Tags: QA, question understanding, question classification, question similarity, question representation