Research On XML Information Retrieval

Posted on:2013-03-22

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y L Wen

Full Text:PDF

GTID:1268330395987572

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid spread of XML technology, XML has become the standard format for data representation and data exchange on the Web, and a huge number of XML documents now exist in many domains. How to retrieve XML data efficiently and effectively has become a hot research topic in both the database and the information retrieval communities. Traditional information retrieval techniques offer rich solutions for unstructured data, but XML data is semi-structured, carrying both content and structure, and this brings new challenges to information retrieval research. Combining database and information retrieval techniques to retrieve XML data is therefore a novel research direction.

This paper analyzes the research status of XML information retrieval, considers solutions that draw on both database and information retrieval techniques, and addresses several crucial problems in XML data retrieval: XML keyword search, XML content and structure search with vague structural context, and XML full-text search based on a relational database. The main contributions and innovations are as follows.

This paper proposes an approach to keyword search over XML documents based on Candidate Fragment semantics. The method first filters candidate nodes according to the number of descendants and the number of attribute types of the XML tree nodes, and then constructs candidate fragments centered on the candidate nodes. After indexing these candidate fragments with an inverted list, the method answers user queries with candidate fragments, or with candidate fragments in an ancestor-descendant relationship, that together satisfy all keywords and suit the characteristics of the XML dataset. Experiments show that Candidate Fragment semantics provides users with compact, meaningful results of appropriate size and performs well on XML keyword search.

This paper proposes an approach to retrieving XML data with vague structural context. User queries and XML documents are both treated as sets of structural terms. Context resemblance is computed from the level weight of each element in the context, the level similarity between elements of the longest matched context, and other factors. The Vector Space Model is extended to answer XML content and structure searches. Experiments show that the method performs well on XML content and structure search.

This paper proposes an XML full-text search method based on a relational database, named ReXFT. ReXFT maps XML data into relational storage based on NXRel and naturally reflects the logical model of XML data. ReXFT allows users to create XML full-text indexes on user-defined paths based on full-text element nodes. To conform to international standards, ReXFT adopts the W3C Recommendation for submitting XML full-text searches. ReXFT scores search results with a cover density ranking scheme that takes into account the logical relationship between search terms, their distance, their frequency, and other factors. Experimental results show that ReXFT performs well in processing XML full-text searches.
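
As an illustration of the structural-term idea in the second contribution, the sketch below scores a document against a query using a simpler, widely taught context resemblance function: (1 + |cq|) / (1 + |cd|) when the query path can be obtained from the document path by deleting elements, and 0 otherwise. This is an assumption chosen for illustration and is not the formula of the dissertation, which additionally uses level weights and level similarity over the longest matched context. The function names and the dictionary layout are hypothetical.

```python
def is_subsequence(query_path, doc_path):
    """True if query_path can be obtained from doc_path by deleting elements."""
    remaining = iter(doc_path)
    # `tag in remaining` advances the iterator, so order is respected.
    return all(tag in remaining for tag in query_path)


def context_resemblance(query_path, doc_path):
    """Simplified stand-in: (1 + |cq|) / (1 + |cd|) on a match, else 0."""
    if not is_subsequence(query_path, doc_path):
        return 0.0
    return (1 + len(query_path)) / (1 + len(doc_path))


def score(query_terms, doc_terms):
    """Vector-space style score over structural terms.

    Both arguments map (path_tuple, word) -> weight (hypothetical layout).
    Matching words contribute their weight product scaled by the
    resemblance of their structural contexts.
    """
    total = 0.0
    for (q_path, q_word), q_weight in query_terms.items():
        for (d_path, d_word), d_weight in doc_terms.items():
            if q_word == d_word:
                total += q_weight * context_resemblance(q_path, d_path) * d_weight
    return total


if __name__ == "__main__":
    query = {(("book", "title"), "xml"): 1.0}
    document = {
        (("book", "chapter", "title"), "xml"): 2.0,  # vague match on context
        (("book", "author"), "xml"): 1.0,            # context does not match
    }
    print(score(query, document))
```

Here the query context book/title matches book/chapter/title only approximately, so the term still contributes but with a reduced weight, while the term under book/author contributes nothing.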

Keywords/Search Tags:

Keyword Search, Content and Structure Search, Structural Context, Full Text Search, Cover Density

Related items

1	Full-text keyword search in meta-search and P2P networks
2	Research And Realization Of Full-Text Search Technology
3	The Design Of Materials Based On B / S Structure Search Platform
4	The Research On Method Of Database Search Based On P2P Search Engine
5	Design And Implementation Of Content Based Shape Search Platform
6	The Research Of Auto Complete Box In Silverlight Based On Call Center System
7	Research On The Encrypted Data Supporting Synonymous Multi-keyword Fuzzy Search In The Cloud Computing
8	Research On Conjunctive Keyword Search Over Encrypted Data In Cloud Computing
9	Design And Implementation Of Hotel-Ordering Platform Search System
10	The Design And Implementation Of Cross-language Navigational Search Engine