
Research on a Concept-Based Information Retrieval Model

Posted on: 2007-05-25 | Degree: Master | Type: Thesis
Country: China | Candidate: B W Yang | Full Text: PDF
GTID: 2208360185956669 | Subject: Computer software and theory
Abstract/Summary:
Natural Language Processing has long been divided between two fundamentally different schools of thought: rationalism and empiricism. Rationalism emphasizes rule-based semantic analysis and attempts to find a fully effective solution to each problem. Empiricism, by contrast, relies on existing language data and obtains the best achievable result through statistical methods.

This thesis starts from rationalist Natural Language Processing and, drawing on the theories of Conceptual Dependency and the Complex Attribute Set, proposes a method of expressing concepts with a Dynamic Attribute Set. It then briefly analyzes how the Dynamic Attribute Set is constructed via unification, proposes a matching theory under which the Conceptual Attribute Set can be applied to information retrieval, and outlines how information retrieval can be realized on the basis of this theory. It concludes that the necessary and sufficient condition for a document to match a query is that the document contains all the Conceptual Bases that appear in the query and is consistent with the relationships among those Conceptual Bases in the query.

However, like many other rationalist methods, the matching theory of the Conceptual Dynamic Attribute Set faces many problems, such as the maintenance of a huge rule set. The thesis therefore turns to statistical Natural Language Processing and seeks the best way to implement the above conclusion. As a result, it proposes a new language model for information retrieval, the Section Language Model.

The Section Language Model improves on the traditional statistical language model in two respects. First, a Conceptual Base may correspond to many words in the language, while the words in a query are merely particular instances of it; to address this, the thesis introduces the Correlation Vocabulary Table, which contains all the words that may correspond to each Conceptual Base. When the language model is constructed, not only the query's words are considered, but also all the words corresponding...
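
The matching condition stated in the abstract, together with the idea of a Correlation Vocabulary Table, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the table contents, the relation triples, and the names `CORRELATION_VOCABULARY`, `words_to_bases`, and `matches` are all hypothetical, and "consistent with the relationships" is interpreted here, as an assumption, to mean that every relation in the query also appears in the document. Relations are supplied by hand, since the abstract does not describe how they are extracted.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical Correlation Vocabulary Table:
# each Conceptual Base maps to the surface words that may express it.
CORRELATION_VOCABULARY: Dict[str, Set[str]] = {
    "VEHICLE": {"car", "automobile", "truck"},
    "PURCHASE": {"buy", "purchase", "acquire"},
}

# Hypothetical relation format: (head base, relation label, dependent base).
Relation = Tuple[str, str, str]


def words_to_bases(words: Iterable[str], table: Dict[str, Set[str]]) -> Set[str]:
    """Return every Conceptual Base that has at least one of its words in `words`."""
    word_set = set(words)
    return {base for base, vocab in table.items() if vocab & word_set}


def matches(
    query_bases: Set[str],
    query_relations: Set[Relation],
    doc_bases: Set[str],
    doc_relations: Set[Relation],
) -> bool:
    """Assumed reading of the matching condition: the document contains all
    query bases, and every query relation also holds in the document."""
    return query_bases <= doc_bases and query_relations <= doc_relations


query_bases = words_to_bases(["buy", "car"], CORRELATION_VOCABULARY)
doc_bases = words_to_bases(
    ["she", "decided", "to", "acquire", "a", "truck"], CORRELATION_VOCABULARY
)
relation: Relation = ("PURCHASE", "object", "VEHICLE")

# Different surface words, same Conceptual Bases and relation: a match.
print(matches(query_bases, {relation}, doc_bases, {relation}))
```

Because the query and the document are compared at the level of Conceptual Bases, "buy car" matches a document that says "acquire a truck" even though the two share no surface words. This is the situation the Correlation Vocabulary Table is introduced to handle.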
Keywords/Search Tags:conceptual base, statistical language model, information retrieval, Section Language Model