Research Of Mongolian Information Retrieval Model

Posted on:2010-01-23

Degree:Master

Type:Thesis

Country:China

Candidate:W Jin

Full Text:PDF

GTID:2178360278467595

Subject:Computer application technology

Abstract/Summary:

The Web is becoming a universal repository of human knowledge and culture, allowing ideas and information to be shared on an unprecedented scale. Because languages differ, however, research on minority languages is still lacking, and this is a serious obstacle to their spread. Mongolian is one of the most important languages in the world, so research on Mongolian information retrieval (IR) is becoming more and more important.

To build a search engine suited to Mongolian, we analyze the morphological and syntactic characteristics of the language and design a scheme of indexing units for Mongolian IR, including the segmentation of Mongolian terms and rules for Mongolian stemming. We use three methods to determine the Mongolian stop list. After analyzing existing information retrieval models, we identify the model best suited to Mongolian and, through experiments, compare the effects of the smoothing methods, Mongolian stemming, and structured queries.

We collected a corpus of 27,345 Mongolian texts and built from it a Mongolian test collection consisting of a document set, 11 topics, and a set of relevance judgments. We ran this test collection with Indri, on a model that combines the language modeling and inference network approaches to information retrieval. The experiments show that Mongolian stemming reduces the size of the index and improves precision, and that the EC Mongolian stop list performs best. The Mongolian stemming rules eliminate a large number of terms and improve recall. Compared with the other information retrieval models, the model that combines the language modeling and inference network approaches performs best. In determining the best smoothing parameters, all three smoothing methods prove suitable for the model, but Jelinek-Mercer smoothing performs better than the others.
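The abstract names Jelinek-Mercer smoothing without defining it. As a rough illustration of how it works in a query-likelihood language model, here is a minimal Python sketch. It is not the thesis's or Indri's implementation: the function name, the interpolation weight `lam=0.4`, and the placeholder tokens are assumptions made for this example, and we follow the convention in which `lam` is the weight given to the collection model.

```python
import math
from collections import Counter

def jm_query_log_likelihood(query, doc, collection, lam=0.4):
    """Return log P(query | doc) under Jelinek-Mercer smoothing.

    P(t | doc) = (1 - lam) * tf(t, doc) / |doc| + lam * cf(t) / |collection|
    `lam` is an illustrative value, not a parameter reported in the thesis.
    """
    doc_tf = Counter(doc)
    coll_tf = Counter(collection)
    doc_len = len(doc)
    coll_len = len(collection)
    score = 0.0
    for term in query:
        p_doc = doc_tf[term] / doc_len if doc_len else 0.0
        p_coll = coll_tf[term] / coll_len if coll_len else 0.0
        p = (1.0 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # The term occurs nowhere in the collection, so smoothing cannot help.
            return float("-inf")
        score += math.log(p)
    return score

# Placeholder tokens stand in for stemmed Mongolian index terms.
docs = [["a", "b", "a"], ["b", "c"]]
collection = [t for d in docs for t in d]
scores = [jm_query_log_likelihood(["a"], d, collection) for d in docs]
```

The second document does not contain the query term, yet it still receives a finite score because the collection model contributes probability mass; the first document, which does contain the term, ranks higher.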

Keywords/Search Tags:

Mongolian IR, Inference Network, Language Model, Structured Queries, Mongolian Stemming

Related items

1	Research On Mongolian Lexical Analysis Based On Combination Of Statistical And Rule Approaches
2	Research On The Conversion Approach Between Cyrillic Mongolian And Traditional Mongolian Based On Rules And Statistics
3	Research On Translation Methods Of Query Items In Chinese-Mongolian Cross-Language Information Retrieval
4	Design And Implementation Of Electronic Medical Records Retrieval System
5	Research On Chinese-Mongolian Cross-Language Information Retrieval Based Language Model
6	The Research On Constructing Layered Mongolian Language Model
7	Research On Mongolian-Chinese Cross-language Information Retrieval Model
8	Design And Implementation Of Intelligent Mongolian Input Method
9	The Design And Initial Construction Of Experimental Mongolian Language Q/A System Based On Network
10	Research And Implementation Of Mongolian-Chinese Mixed Language Speech Recognition System Based On Deep Learning