Active Learning Research on Chinese Word Meaning Acquisition Based on Visual Information

Posted on: 2013-02-12   Degree: Master   Type: Thesis
Country: China   Candidate: T M Wang
GTID: 2248330371966448   Subject: Computer Science and Technology
Abstract/Summary:
Cognitive science research has shown that visual, auditory, tactile and other perceptual information plays an important supporting role in human language acquisition, with visual information being particularly prominent. To let computers exploit relevant visual information when processing language, research on grounded semantic acquisition based on perceptual information has emerged.

The ViMac system (Visual Information based Meaning Acquisition of Chinese Words) is a natural language generation system based on visual information: it learns lexical semantics from visual information and can generate natural language descriptions of simple geometric pictures. However, the system's performance is limited by the size of its initial training corpus, so new samples need to be added to the training corpus to improve it. Because large-scale corpus annotation costs much time and effort, this thesis introduces active learning into the ViMac system. Active learning selects the most valuable samples from the unlabeled corpus and minimizes the number of samples that must be labeled, without degrading model performance.

This thesis makes two improvements to the uncertainty-based active learning framework. First, to address the severe imbalance in the number of training samples per class, it uses a posterior-probability-weighted entropy to compensate for the uneven distribution. Second, it combines clustering with a weighting method that considers each sample's uncertainty, degree of influence and redundancy, so that multiple samples can be selected in a single selection cycle. The active learning mechanism is then introduced into the ViMac system step by step, adding a small number of new samples that have a positive effect on system performance, and the online active learning system ViMac-Online is finally built.
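The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's formulas, so everything here is an assumption: the inverse-frequency class weights, the form of the weighted entropy, and the function names are hypothetical, and a simple distance threshold stands in for the clustering-based redundancy check. The posteriors would come from whatever classifier the system uses (the keywords mention support vector machines).

```python
import math

def inverse_frequency_weights(class_counts):
    """Assumed weighting: rarer classes get larger weights (mean weight is 1)."""
    inv = [1.0 / c for c in class_counts]
    scale = len(inv) / sum(inv)
    return [v * scale for v in inv]

def weighted_entropy(posterior, class_weights):
    """Entropy of a posterior distribution with a per-class weight on each term."""
    return -sum(w * p * math.log(p)
                for p, w in zip(posterior, class_weights) if p > 0.0)

def select_batch(posteriors, features, class_weights, batch_size, min_distance):
    """Greedily pick the most uncertain samples, skipping near-duplicates.

    The distance test is a stand-in for the clustering step: a candidate is
    rejected if it lies closer than min_distance to an already chosen sample.
    """
    scores = [weighted_entropy(p, class_weights) for p in posteriors]
    ranked = sorted(range(len(posteriors)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in ranked:
        if len(chosen) == batch_size:
            break
        if all(math.dist(features[i], features[j]) >= min_distance for j in chosen):
            chosen.append(i)
    return chosen
```

Under these assumptions, each selection cycle would call `select_batch` on the unlabeled pool, send the returned indices for annotation, and retrain the model on the enlarged corpus.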
Keywords/Search Tags: ViMac, active learning, support vector machine, lexical semantics acquisition