
Design And Application Of Relative Semantic Lib Based On Digital Television Program

Posted on: 2009-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: D F Guo
Full Text: PDF
GTID: 2178360242476737
Subject: Computer Software and Theory
Abstract/Summary:
Judging how closely words are related plays a significant role in human cognitive processing of language. Word relativity, the degree to which words are related, is widely used in applications such as information retrieval, text classification, and machine translation.

There are two main ways to construct a semantic corpus: one is based on models such as WordNet; the other is training on text material. The WordNet-based approach is subjective and cannot update itself, while training on text material costs too much in both time and space. To overcome these problems, this thesis introduces a method for constructing a semantic corpus based on a vector space model.

Guided by the system requirements and intelligent learning theory, the corpus is constructed by considering multiple factors that may affect the semantic relationship between words, including co-occurrence counts, average distance, and window size, and by training on a large amount of text through an iterative learning process. Experiments show that the corpus accurately reflects the relatedness among words in the real world. The corpus is then applied to information retrieval to demonstrate its practical value. Structurally, the model is a four-dimensional vector. Using the relative semantic corpus, the thesis introduces a fuzzy matching algorithm based on an expanded vector space. Finally, an experiment on TV program recommendation shows that the model can recommend a larger number of TV programs matching a given search request.
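The abstract does not give the thesis's formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: it gathers co-occurrence counts and average distances inside a sliding window, combines them with a hypothetical score (count divided by average distance), and uses that score to expand a query before matching TV program descriptions. The function names, the scoring rule, and the threshold are illustrative choices, not the author's method.

```python
from collections import defaultdict


def cooccurrence_stats(docs, window=2):
    """Map each unordered word pair to (co-occurrence count, average distance).

    `docs` is a list of token lists; two words co-occur when they are at most
    `window` positions apart in the same document.
    """
    counts = defaultdict(int)
    distance_sums = defaultdict(int)
    for tokens in docs:
        for i, first in enumerate(tokens):
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                second = tokens[j]
                if first == second:
                    continue
                pair = tuple(sorted((first, second)))
                counts[pair] += 1
                distance_sums[pair] += j - i
    return {pair: (n, distance_sums[pair] / n) for pair, n in counts.items()}


def relatedness(stats, word_a, word_b):
    """Hypothetical score (an assumption): frequent, close pairs score higher."""
    pair = tuple(sorted((word_a, word_b)))
    if pair not in stats:
        return 0.0
    count, average_distance = stats[pair]
    return count / average_distance


def expand_query(stats, query_terms, threshold=1.0):
    """Add every word whose relatedness to some query term reaches `threshold`."""
    expanded = set(query_terms)
    for word_a, word_b in stats:
        if relatedness(stats, word_a, word_b) < threshold:
            continue
        if word_a in query_terms:
            expanded.add(word_b)
        if word_b in query_terms:
            expanded.add(word_a)
    return expanded


def match_programs(programs, terms):
    """Return titles of programs whose description shares a word with `terms`."""
    return [title for title, words in programs.items() if set(terms) & set(words)]


# Toy training texts and program descriptions (made up for illustration).
training_docs = [
    ["football", "match", "live"],
    ["football", "match", "replay"],
    ["news", "weather"],
]
programs = {
    "Cup Final": ["match", "live"],
    "Evening News": ["news", "weather"],
}

stats = cooccurrence_stats(training_docs, window=2)
expanded_terms = expand_query(stats, {"football"}, threshold=1.0)
print(match_programs(programs, {"football"}))   # exact matching only
print(match_programs(programs, expanded_terms))  # matching on the expanded query
```

In this toy data the exact query finds no program, while the expanded query also reaches the program described with a related word, which mirrors the abstract's claim that expansion surfaces more programs for a given search request.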
Keywords/Search Tags: semantic corpus, vector space, word relativity, text training, fuzzy matching