The Semantic Interpretation of Noun-Noun Compounds

Posted on: 2011-11-28  Degree: Master  Type: Thesis
Country: China  Candidate: T W Zhang  Full Text: PDF
GTID: 2178360308452391  Subject: Computer software and theory
Abstract/Summary:
The Internet is becoming a major source of information. When people look for information, they usually turn first to a search engine and often use compound words as queries. Keyword matching is currently the main method used in information retrieval, so the whole query is broken into individual words and a compound is split into pieces, losing its hidden semantic relation. This results in relatively low precision. Compounds are also used heavily in documents. It is therefore essential to uncover the semantic relations hidden in compounds.

Specifically, this thesis focuses on noun-noun compounds. A noun-noun compound consists only of nouns and acts as a new noun. Compared with phrases and sentences, compounds lack explicit clues about how their elements combine, so their semantic relations are difficult to analyze.

The thesis begins by analyzing real noun-noun compounds: the author manually labels the semantic relation between the elements of each compound. The author then looks for combination clues on the web and in corpora, collecting contexts and extracting patterns. Cluster analysis based on these patterns is then applied to the compounds.

The main contributions are as follows:

First, the foundation of compound analysis is the study of the compounds themselves. The author analyzes real examples of noun-noun compounds to summarize several principles of conceptual analysis and to explore ways of uncovering their semantic relations.

Second, the author collects contexts of the compounds from the web and from corpora and extracts patterns from these contexts. The semantic relation of a compound is then represented as a vector. Each dimension corresponds to a pattern, and its score is calculated with a variant of TF-IDF.

Third, the author proposes a way to measure the similarity between the semantic relations of noun-noun compounds. Based on this similarity measure, the compounds are clustered so that compounds with similar semantic relations fall into the same cluster. Analysis can then be carried out per cluster rather than per compound, which saves a great deal of manual work.

The author has carried out exploratory work and experiments aimed at analyzing noun-noun compounds effectively and efficiently, with the goal of contributing new ideas to Chinese natural language processing.
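
The pipeline described in the abstract (pattern vectors, TF-IDF-style weighting, similarity, clustering) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy compounds and patterns are invented, the weighting is plain TF-IDF with a +1 smoothing term rather than the thesis's own variant, cosine similarity stands in for the unspecified similarity measure, and the greedy threshold clustering is a simple substitute for whatever clustering method the thesis actually uses.

```python
import math
from collections import Counter

# Hypothetical toy data: each compound maps to the connecting patterns
# observed in its contexts. Invented for illustration only.
contexts = {
    "steel knife":   ["made of", "made of", "made from"],
    "wood table":    ["made of", "made from", "made from"],
    "kitchen knife": ["used in", "used in", "for"],
    "garden chair":  ["used in", "for", "for"],
}

def tfidf_vectors(contexts):
    """Represent each compound as a sparse pattern vector with TF-IDF weights."""
    n = len(contexts)
    counts = {c: Counter(patterns) for c, patterns in contexts.items()}
    # Document frequency: number of compounds each pattern occurs with.
    df = Counter(p for cnt in counts.values() for p in cnt)
    vectors = {}
    for compound, cnt in counts.items():
        total = sum(cnt.values())
        # Assumed weighting: tf * (log(n / df) + 1); not the thesis's variant.
        vectors[compound] = {
            p: (k / total) * (math.log(n / df[p]) + 1.0) for p, k in cnt.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(p, 0.0) for p, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def cluster(vectors, threshold=0.5):
    """Greedy clustering: join the first cluster containing a similar member."""
    clusters = []
    for compound in vectors:
        for members in clusters:
            if any(cosine(vectors[compound], vectors[m]) >= threshold
                   for m in members):
                members.append(compound)
                break
        else:
            # No sufficiently similar cluster found, so start a new one.
            clusters.append([compound])
    return clusters

vectors = tfidf_vectors(contexts)
clusters = cluster(vectors)
print(clusters)
```

On this toy input the "made of / made from" compounds and the "used in / for" compounds end up in separate clusters, so a single manual label per cluster would cover every compound in it. The threshold of 0.5 is arbitrary and would need tuning on real data.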
Keywords/Search Tags: noun-noun compound, semantic relation, pattern, clustering analysis