The Research On Word Relatedness Based On Concept Network In News Corpus

Posted on:2012-07-04

Degree:Master

Type:Thesis

Country:China

Candidate:J P Liu

Full Text:PDF

GTID:2178330332467456

Subject:Computer application technology

Abstract/Summary:

Word relatedness measures the strength of the relationship between two terms, and its study is a basic research topic in natural language processing (NLP). Determining how related a given pair of words is matters in many NLP applications, such as document clustering, word sense disambiguation, the semantic web, and information retrieval. Existing methods for computing semantic relatedness, however, rely on either a dictionary or a corpus and do not combine the two. In this paper, a new method based on a concept network is proposed to compute the relatedness of terms in a news corpus.

There are two main approaches to computing word relatedness. The first counts co-occurrences in a corpus; its drawback is that raw statistics cannot reveal the inherent relationship between words. The second uses a dictionary or a domain ontology. Such ontologies are usually hand-crafted by experts, so the relationships they encode are subjective, reflecting personal understanding. They are also static: they do not evolve over time and are slow to absorb new words. In addition, current research on word relatedness concentrates on individual word pairs in isolation and ignores the wider relations among words.

In this paper, we propose solutions to the above problems. First, we construct a news corpus and exploit the characteristics of news text to count word co-occurrences. Second, we introduce Wikipedia to avoid the aforementioned drawbacks of purely corpus-based statistics. We then construct a concept network from the results of SWRN-W (a single word relatedness computation algorithm for news corpora based on Wikipedia). Finally, we compute word relatedness from the weights of paths in the network, which overcomes the drawback of treating word pairs in isolation.

The experimental results demonstrate the advantage and validity of the proposed methods, and the approach offers a useful starting point for further research on word relatedness.
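
The pipeline described in the abstract (co-occurrence counts, a weighted concept network, relatedness read off path weights) can be illustrated with a minimal sketch. It is not the thesis's SWRN-W algorithm, whose details are not given here: the Dice coefficient as the edge weight, the maximum product of edge weights as the path score, and all function names are assumptions made for illustration, and the Wikipedia step is omitted.

```python
import heapq
import math
from collections import Counter
from itertools import combinations


def cooccurrence_weights(documents):
    """Dice coefficient for every word pair sharing a document (assumed weighting)."""
    word_df = Counter()   # number of documents containing each word
    pair_df = Counter()   # number of documents containing both words of a pair
    for doc in documents:
        words = sorted(set(doc))
        word_df.update(words)
        pair_df.update(combinations(words, 2))
    return {
        (a, b): 2 * n / (word_df[a] + word_df[b])
        for (a, b), n in pair_df.items()
    }


def build_network(weights):
    """Undirected weighted graph stored as a dict of neighbour dicts."""
    graph = {}
    for (a, b), w in weights.items():
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def path_relatedness(graph, source, target):
    """Largest product of edge weights over any path (Dijkstra on -log weights)."""
    if source not in graph or target not in graph:
        return 0.0
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            return math.exp(-cost)
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, w in graph[node].items():
            new_cost = cost - math.log(w)  # weights lie in (0, 1], so cost >= 0
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return 0.0  # no path connects the two words


# Toy "news" documents, already tokenised (hypothetical data).
docs = [
    ["election", "vote", "president"],
    ["vote", "ballot"],
    ["president", "speech"],
]
network = build_network(cooccurrence_weights(docs))
score = path_relatedness(network, "ballot", "speech")
```

In this toy corpus "ballot" and "speech" never co-occur, so a pairwise count would give them no relatedness at all; the path through "vote" and "president" still yields a non-zero score, which is the kind of isolation problem the network is meant to address.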

Related items

1	Research On Concept And Short Text Semantic Relatedness Calculation Method
2	Research On Computing Concept Relatedness Based On Ontology
3	The Study Of Concept Relatedness Degree Of Process Service Attributes
4	Research On Chinese New Word Discovery Technology Based On Large Scale Network Corpus
5	Research On Statistical Word-level Semantic Relatedness Computation
6	The Description Of Text's Feature Based On Semanteme Concept
7	Scientific Paper Discrimination Method Research Based-on Word Co-Occurrence Network And Support Vector Machine
8	Domain Concepts Automatically Extracted
9	Bilingual Term Extraction Based On Parallel Corpus
10	Hot Topics Detected From Micro-bloggings Based On Word Co-occurrence Model