
Research On Semantic Technologies In Natural Language Processing

Posted on: 2015-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Liu
Full Text: PDF
GTID: 2268330428976484
Subject: Signal and Information Processing
Abstract/Summary:
Natural language processing is an active research topic in computer science and artificial intelligence. It focuses on the theories and methods of communication between people and computers using natural language. Because natural language processing involves complex vocabulary and requires large corpora as its research foundation, many technical problems remain unsolved, such as language behavior and planning, semantic analysis, word sense disambiguation, and syntactic ambiguity.

Language behavior and planning covers the meaning of language and the reactions of human beings. The difficulty lies in understanding the meaning that language expresses and then making a reasonable reaction. This is also a pressing problem in sentence similarity calculation. Existing methods of sentence similarity calculation perform well, but most of them ignore the analysis of the principal components of a sentence, so their similarity results are not accurate. In this thesis, we first analyze the composition of a sentence in detail, based on sentence analysis and syntactic analysis. We then assign different weights to the different components of the sentence according to that analysis. We also apply the calculation of the weighted path length of an optimal binary tree to sentence similarity calculation. Finally, we apply the proposed method to information retrieval, where it improves retrieval precision. Experimental results show that the proposed method is more reasonable and effective than other methods.

Semantic relevancy calculation is one task of semantic analysis. Its main purpose is to calculate the degree of relevancy between different words in human thinking. There are many semantic relevancy calculation methods, and those based on online encyclopedias are increasingly favored. However, these methods do not analyze the content of entries comprehensively, so their results contain deviations and errors. We download Baidu encyclopedia as the data set and form entry pairs from two entries each. For each entry pair, we analyze the contents and hyperlinks of the two entry pages. The relevancies of the different parts of the pages are calculated and given different weights, and the relevancy between two entries combines the relevancies of these parts. After calculating the relevancies of the entry pairs, we obtain the word pairs that are semantically relevant and build relevant-word lists, which are applied to expand questions for a semantic search system. Experimental results show that the proposed method is closer to people's way of thinking.
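
The weighted combination of per-part relevancies described above can be illustrated with a minimal Python sketch. The abstract does not say how each part's relevancy is measured or which weights are used, so everything specific here is an assumption for illustration only: Jaccard overlap as the per-part measure, the part names `summary`, `body`, and `hyperlinks`, the weight values, and the function names `jaccard` and `entry_relevancy`.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections, treated as sets (assumed measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def entry_relevancy(page_a, page_b, weights):
    """Combine per-part relevancies of two entry pages into one score.

    Each page is a dict mapping a part name to a list of items
    (tokens or hyperlink targets). The result is the weighted
    average of the per-part Jaccard overlaps.
    """
    total_weight = sum(weights.values())
    score = 0.0
    for part, weight in weights.items():
        score += weight * jaccard(page_a.get(part, []), page_b.get(part, []))
    return score / total_weight


# Hypothetical part names and weights; the thesis does not give these values.
WEIGHTS = {"summary": 0.5, "body": 0.2, "hyperlinks": 0.3}

# Toy entry pages, invented purely to exercise the function.
page_a = {
    "summary": ["fruit", "tree", "sweet"],
    "body": ["orchard", "harvest"],
    "hyperlinks": ["Pear", "Fruit"],
}
page_b = {
    "summary": ["fruit", "tree", "juicy"],
    "body": ["orchard", "market"],
    "hyperlinks": ["Fruit", "Apple"],
}

relevancy = entry_relevancy(page_a, page_b, WEIGHTS)
```

Entry pairs whose score exceeds a chosen threshold could then be collected into a relevant-word list for question expansion. The threshold, like the weights, would have to be tuned on real data.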
Keywords/Search Tags: Sentence similarity, Syntactic analysis, Binary tree, Word relevance, Baidu encyclopedia, Question expansion