
Research On Active Learning Based Automatic Corpus Annotation

Posted on: 2011-03-27  Degree: Master  Type: Thesis
Country: China  Candidate: H Y Song  Full Text: PDF
GTID: 2178360308952410  Subject: Computer application technology
Abstract/Summary:
Opinion mining aims to automatically extract useful opinion information and knowledge from subjective texts. Research on Chinese opinion mining requires an annotated corpus of Chinese opinionated, subjective texts.
Because such a corpus carries many layers of information, including word segmentation, part-of-speech tags, dependency relations, word senses, and opinions, the finished annotations are usually very complicated. To relieve the burden on annotators, improve the efficiency and accuracy of annotation, and reduce the likelihood of annotation errors, an automatic annotation tool is needed to support annotators' work.
This thesis implements an active learning based annotation tool for Chinese opinion elements. It automatically identifies the topic, sentiment, and opinion holder in a sentence. Compared with classical learning algorithms, active learning requires a smaller training set, is less affected by unbalanced training data, and achieves better classification performance. The thesis experimentally demonstrates the validity of active learning for opinion element identification and proposes a formula for evaluating overall system performance that combines F-measure, training time, and the number of training instances.
Keywords/Search Tags:Opinion Mining, Corpus, Corpus Annotation, Active Learning, Topic Identification
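
The abstract does not say which query strategy, classifier, or features the tool uses, so the following is only an illustrative sketch of the general idea behind pool-based active learning: the learner repeatedly asks the annotator (the "oracle") to label the instance it is least certain about, which is why a small training set can suffice. Everything here is an assumption made for illustration: the toy nearest-centroid classifier on one-dimensional features, the margin-based uncertainty score, and the names active_learn, centroids, predict, and margin. It is not the method implemented in the thesis.

```python
def centroids(labeled):
    """Mean feature value per class, from a list of (x, y) pairs with y in {0, 1}."""
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for x, y in labeled:
        sums[y] += x
        counts[y] += 1
    return {c: sums[c] / counts[c] for c in (0, 1)}


def predict(cents, x):
    """Assign x to the class whose centroid is nearest."""
    return 0 if abs(x - cents[0]) <= abs(x - cents[1]) else 1


def margin(cents, x):
    """Small margin = x is nearly equidistant from both centroids = uncertain."""
    return abs(abs(x - cents[0]) - abs(x - cents[1]))


def active_learn(pool, oracle, seeds, budget):
    """Pool-based active learning with margin (uncertainty) sampling.

    pool   : unlabeled instances (hypothetical 1-D features)
    oracle : function returning the true label, standing in for the annotator
    seeds  : initial instances to label; must cover both classes
    budget : number of additional labels to request
    """
    labeled = [(x, oracle(x)) for x in seeds]
    unlabeled = [x for x in pool if x not in seeds]
    for _ in range(budget):
        if not unlabeled:
            break
        cents = centroids(labeled)
        # Query the instance the current model is least sure about.
        query = min(unlabeled, key=lambda x: margin(cents, x))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return labeled, centroids(labeled)


if __name__ == "__main__":
    pool = [i / 100 for i in range(100)]
    oracle = lambda x: int(x >= 0.5)  # toy ground truth
    labeled, cents = active_learn(pool, oracle, seeds=[0.0, 0.99], budget=10)
    accuracy = sum(predict(cents, x) == oracle(x) for x in pool) / len(pool)
    print(len(labeled), "labels used, accuracy on pool:", accuracy)
```

In this toy setting the queried instances cluster near the class boundary, so only 12 of the 100 pool instances are ever labeled. A real annotation tool would replace the oracle with a human annotator and the toy classifier with a model suited to sentence-level opinion elements.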