Research Of XML Information Retrieval System Based On Element Links

Posted on:2011-05-11

Degree:Master

Type:Thesis

Country:China

Candidate:J B Yu

Full Text:PDF

GTID:2178330338976297

Subject:Computer software and theory

Abstract/Summary:

XML information retrieval is a technology that developed from traditional information retrieval and integrates the database field with the information retrieval field. Research indicates that element links in XML documents influence not only the content of an element but also its structure, and therefore influence the results of XML information retrieval. Based on element links, this thesis studies XML indexing, an XML information retrieval model, and a technique for pruning redundant information. First, we propose a new XML index based on element links. It consists of two parts: an external links index and an inner elements index based on Pseudo Dewey coding. Pseudo Dewey coding is schema-based: the code of an element depends on, among other things, the position of its element type in the schema and the element order. The inner elements index organizes its structure according to criteria such as keyword type and the logical size of the codes. Experimental results show that this index supports element links, retrieves efficiently, and has lower update time costs. Second, we introduce a new XML information retrieval model based on a graphic model that takes the influence of element links into account. We calculate the relevance between contexts according to the size, location, and proportion of their common descendant sequences, and derive the context relevance matrix of the model. We then extend the traditional vector space algorithm to calculate the relevance between elements and user queries, which improves the precision and recall of the retrieval results. Finally, we establish a Markov chain user navigation model based on user queries, and derive its transition probability matrix from users' browsing history records and the context of elements.
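As a rough illustration of how a transition probability matrix can be derived from browsing history, the sketch below counts consecutive element visits in recorded sessions and normalizes each row. It is a minimal first-order estimate under stated assumptions, not the thesis's method: the function name `transition_matrix`, the session format, and the element labels are hypothetical, and the element-context term mentioned in the abstract is not modeled.

```python
from collections import defaultdict


def transition_matrix(sessions):
    """Estimate first-order Markov transition probabilities.

    sessions: list of browse sessions, each a list of element labels
    in the order the user visited them (hypothetical input format).
    Returns a nested dict P where P[src][dst] is the estimated
    probability of moving from element src to element dst.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # Count each consecutive pair (src -> dst) in the session.
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1

    matrix = {}
    for src, row in counts.items():
        total = sum(row.values())
        # Normalize so that every row sums to 1.
        matrix[src] = {dst: n / total for dst, n in row.items()}
    return matrix


# Illustrative browse history (invented for this example only).
sessions = [
    ["article", "section", "paragraph"],
    ["article", "section", "figure"],
    ["article", "abstract"],
]
P = transition_matrix(sessions)
```

Here `P["article"]["section"]` is 2/3, because two of the three recorded moves away from `article` go to `section`. Elements that are never left, such as `paragraph`, have no row in this sketch.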
We then introduce a redundant-information pruning technique based on the ideal relevance of the result set, together with a greedy optimization approach to it. Experimental results demonstrate that the greedy optimization approach has lower time costs and good execution efficiency, and is therefore of greater practical value.
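The abstract does not give the "ideal relevance" objective, so the following is only a sketch of one common greedy strategy for removing redundant XML results, swapped in for illustration. It takes elements in descending score order and skips any element that is an ancestor or descendant of one already kept. The Dewey-style path tuples, the scores, and the names `overlaps` and `greedy_prune` are all assumptions.

```python
def overlaps(a, b):
    """Return True if paths a and b are equal or in an
    ancestor/descendant relation, i.e. one is a prefix of the other."""
    n = min(len(a), len(b))
    return a[:n] == b[:n]


def greedy_prune(results, k):
    """Greedily keep up to k high-scoring, non-overlapping elements.

    results: list of (path, score) pairs, where path is a tuple of
    integers in Dewey style (hypothetical input format).
    """
    kept = []
    # Visit candidates from the highest score to the lowest.
    for path, score in sorted(results, key=lambda r: r[1], reverse=True):
        if len(kept) == k:
            break
        # Skip a candidate that contains, or is contained in, a kept element.
        if not any(overlaps(path, other) for other, _ in kept):
            kept.append((path, score))
    return kept


# Illustrative result list (paths and scores invented for this example).
results = [
    ((1, 2), 0.90),     # a section
    ((1, 2, 1), 0.85),  # a paragraph inside that section
    ((1, 3), 0.60),     # a sibling section
    ((1,), 0.50),       # the whole document
    ((2, 1), 0.40),     # an element of another document
]
pruned = greedy_prune(results, k=3)
```

With this input the kept paths are `(1, 2)`, `(1, 3)`, and `(2, 1)`: the nested paragraph and the enclosing document are dropped because they overlap the top-scoring section. A single pass over the sorted list is what keeps the time cost low, at the price of not guaranteeing an optimal result set.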

Keywords/Search Tags:

XML Information Retrieval, Element Links, XML Index, Pseudo Dewey Coding, Graphic Model, Markov Chain, User Navigation Model

Related items

1	Research On XML Retrieval And Indexing Methods
2	Research On Personalized Search Method Based On Language Model
3	Study On The Prediction Of Shanghai Composite Index Based On GA-Markov Chain Model
4	Research On Requirements Trace Links Generation Method Based On Hybrid Information Retrieval Model
5	Research On The Web Personalization On Markov Model
6	Research On Markov Graph Model In Information Retrieval
7	Research On Information Retrieval Models Based On Reference Document
8	On The Information Mining Algorithms Of Networked Collective Intelligence Based On Causality
9	Information Retrieval Model Based On Markov Network
10	The Research On The Information Retrieval Based On The Markov Network