
Research On The Recognition Of Focus Word In Chinese Question

Posted on: 2014-10-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Li
GTID: 2268330401988884
Subject: Computer application technology
Abstract/Summary:
Question classification is a key step in automatically understanding natural-language questions in a question answering system. Finding features closely related to a question's category is very important for improving performance and efficiency. A focus word, which is the word or phrase in a question that best expresses what the question asks, carries rich semantic information and is useful for classifying questions.

This thesis studies focus words in Chinese questions. Specifically, it seeks a new focus word recognition method that exploits the characteristics of Chinese questions to achieve better recognition accuracy, with the aim of improving question classification performance.

Our contributions are as follows:

(1) Considering both the rationality of heuristic rules based on surface characteristics such as part-of-speech (POS) and position, and their limitation of being easily affected by the training set, a new recognition method combining conditional random fields (CRF) and transformation-based error-driven learning (TBL) is proposed. It builds on an investigation of the correlation between the focus word and POS, dependency relations, and the interrogative word in the syntactic structure of a question. The method uses TBL to iteratively learn rules that rectify the recognition result of the CRF until the result converges, and finally obtains an ordered list of rules that suppress the CRF's erroneous outputs. In addition, TBL is refined to reduce the time spent training the ordered rules. Empirical results show the validity of the method.

(2) To further overcome the shortcomings of focus word recognition, the semantic relationship between a question's focus word and its category is studied, and a focus word recognition method based on category and semantic similarity is designed. In this method, the semantic relationship between the focus word and the question categories is used as a new training feature for the CRF algorithm to improve focus word recognition accuracy. Empirical results show the validity of the method.
Keywords/Search Tags: Chinese questions, Focus word, Conditional random fields, Transformation-based error-driven learning, Semantic similarity
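The TBL correction step described in contribution (1) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not give the thesis's rule templates, features, or its time-saving refinement. The sketch uses a single hypothetical template ("if a token has POS p and current label a, relabel it b"), toy POS tags, and a hand-written `baseline` list that merely stands in for CRF output; no CRF is actually run.

```python
from collections import Counter

def apply_rule(rule, pos_tags, labels):
    """Relabel every token whose POS and current label match the rule."""
    pos_r, frm, to = rule
    return [to if (p == pos_r and l == frm) else l
            for p, l in zip(pos_tags, labels)]

def learn_tbl_rules(pos_tags, baseline, gold, min_gain=1):
    """Greedily learn an ordered list of (pos, from_label, to_label) rules.

    Each round scores every candidate rule by its net gain (errors fixed
    minus correct labels broken), applies the best one, and stops when no
    rule reaches min_gain.
    """
    current = list(baseline)
    rules = []
    while True:
        gain = Counter()
        # Propose one candidate rule per remaining error.
        for pos, cur, ref in zip(pos_tags, current, gold):
            if cur != ref:
                gain[(pos, cur, ref)] += 1
        # Penalise each candidate for the correct labels it would break.
        for pos_r, frm, to in list(gain):
            for pos, cur, ref in zip(pos_tags, current, gold):
                if pos == pos_r and cur == frm and ref == cur:
                    gain[(pos_r, frm, to)] -= 1
        if not gain:
            break  # no errors left
        best, best_gain = max(gain.items(), key=lambda kv: kv[1])
        if best_gain < min_gain:
            break  # converged: no rule improves the result
        rules.append(best)
        current = apply_rule(best, pos_tags, current)
    return rules

def apply_rules(rules, pos_tags, labels):
    """Apply learned rules in order to new baseline output."""
    for rule in rules:
        labels = apply_rule(rule, pos_tags, labels)
    return labels

# Toy data (hypothetical): "F" marks a focus word, "O" any other token.
pos_tags = ["r", "n", "v", "r", "n", "v", "r"]
baseline = ["F", "O", "O", "F", "O", "O", "F"]  # stand-in for CRF output
gold     = ["O", "F", "O", "O", "F", "O", "O"]

rules = learn_tbl_rules(pos_tags, baseline, gold)
corrected = apply_rules(rules, pos_tags, baseline)
print(rules)
print(corrected == gold)
```

Because every accepted rule must remove at least `min_gain` errors, the loop is guaranteed to terminate, and the order of the learned rules matters: later rules are scored against the output of earlier ones.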