Research On XML Keywords Retrieval By Integrating Semantics Of Document And User Inquiries

Posted on: 2012-11-20
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
Full Text: PDF
GTID: 2178330335956057
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the Internet, the number of XML documents has grown exponentially, and XML keyword retrieval has become a research hotspot in recent years. To address the loss of semantic information in XML keyword retrieval, this thesis proposes a schema-free XML keyword retrieval method that integrates the semantics of the documents with the semantics of the users' queries. The method improves query accuracy by increasing the semantic relevance of the query results. The thesis covers the following five parts.

First, the structure of the XML document tree is analyzed together with the corresponding schema information, and the implied relationships between the nodes of the document are mined. On this basis the nodes are divided into entity nodes, attribute nodes and value nodes. This node classification preserves the information implied in the nodes, and the semantic information of the queried documents is obtained through Entity Sub Trees, which express the basic semantics of a document.

Second, the users' keyword query expression is specified, and the implied semantic relationships among the keywords are clarified by analyzing the query keywords. The query keywords are divided into predicate keywords and result keywords: predicate keywords are used for retrieval, while result keywords are mainly used for generating the query results.

Third, an improved algorithm based on WordNet concept similarity calculation is proposed. By taking the asymmetry between different concepts into account and combining it with the similarity calculation method, the algorithm makes schema-free XML querying possible.

Fourth, the query results are studied on the basis of both the document semantics and the users' query semantics, and a new form of query result is proposed: the Semantically Related Entity Sub Tree Set (SRESTS). Building on this, the Smallest Lowest Common Ancestor (SLCA) algorithm is improved to obtain an algorithm for computing the Semantically Related Entity Sub Tree Set.

Finally, the thesis shows that, compared with the traditional keyword query algorithm, the proposed schema-free XML keyword retrieval method captures the user's query intent more accurately and is more satisfactory in both query effectiveness and efficiency.
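
For orientation, the Smallest Lowest Common Ancestor baseline named in the abstract can be sketched as follows. This is only a minimal brute-force illustration of the standard SLCA semantics, not the thesis's improved algorithm and not the SRESTS computation. It assumes each XML node is identified by a Dewey label written as a tuple of integers (an assumption, since the abstract does not state the labeling scheme), and the function names `lca`, `is_proper_ancestor` and `slca` are hypothetical.

```python
from itertools import product


def lca(labels):
    """Return the lowest common ancestor of Dewey labels (their longest common prefix)."""
    prefix = []
    for parts in zip(*labels):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)


def is_proper_ancestor(a, b):
    """True if the node labeled a is a proper ancestor of the node labeled b."""
    return len(a) < len(b) and b[:len(a)] == a


def slca(match_lists):
    """Brute-force SLCA.

    match_lists holds, for each query keyword, the Dewey labels of the nodes
    that contain it. A node is an SLCA if its subtree contains every keyword
    and none of its proper descendants does.
    """
    if not match_lists or any(not matches for matches in match_lists):
        return []
    # Every node covering all keywords is an ancestor-or-self of the LCA of
    # some combination that picks one match per keyword.
    candidates = {lca(combo) for combo in product(*match_lists)}
    # Keep only candidates that have no other candidate below them.
    return sorted(
        c for c in candidates
        if not any(is_proper_ancestor(c, d) for d in candidates)
    )


# Hypothetical example: keyword 1 matches nodes 0.0.0 and 0.1.0,
# keyword 2 matches nodes 0.0.1 and 0.2.
result = slca([[(0, 0, 0), (0, 1, 0)], [(0, 0, 1), (0, 2)]])
print(result)  # [(0, 0)]
```

The sketch enumerates every combination of matches, so its cost grows with the product of the match-list sizes. Practical SLCA algorithms avoid this enumeration, and the thesis's SRESTS approach additionally takes entity sub trees and query semantics into account.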
Keywords/Search Tags:Conceptual Similarity, Keywords Retrieval, Smallest Lowest Common Ancestor (SLCA), Semantically Related Entity Sub Tree Set (SRESTS)