Research On Content Reorganisation Based On XML Semantic Structure

Posted on: 2019-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Ye
GTID: 2348330542965517
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the development of electronic information technology, the ways in which people acquire information have gradually diversified. Against this background, the traditional publishing industry has merged digital technology with the publishing process to seek broader development, and has proposed the concept of digital publication. Unlike traditional publishing, digital publishing applies computer technology throughout the publishing process, and implementing this technology and innovating in product delivery have become essential issues in current research. In digital publishing, content reorganization technology can effectively improve the reusability of publication content, improve editing efficiency and reduce the waste of resources, which gives it great research significance. This thesis focuses on key technologies in content reorganization, including the optimization of keyword-based retrieval of XML documents and the optimization of the query results returned for XML documents. It also proposes a multi-style reorganization model based on XML.

The most commonly used semantics for keyword search over XML documents is the Smallest Lowest Common Ancestor (SLCA) semantics. Based on a study of SLCA semantics, this thesis identifies two of its shortcomings: its handling of single-keyword queries and the poor granularity of the results it returns. To address these shortcomings, this thesis proposes the concept of a meaningful node based on the semantic structure of the XML document, puts forward an algorithm that improves SLCA semantics according to this concept, and adds screening and processing of SLCA nodes. In the experimental part, the accuracy of the results returned by SLCA semantics and by the improved semantics is compared, which verifies that the results returned by the improved algorithm are more reasonable and more in line with the needs of users.

On the ranking of keyword query results for XML documents, this thesis first analyzes the existing models and methods for ranking query results, their shortcomings, and the semantic features of XML query results. Based on this analysis, it proposes a ranking method based on the semantic structure of keyword query results. The method takes into account the attributes of the nodes in the returned results and their degree of correlation to evaluate the relevance between the returned results and the keywords. Experiments show that the ranking method is superior to SLCA semantics in terms of precision: it improves the position of keyword matches in the ranking and returns results that are more accurate and more in line with user needs.

In the multi-style reorganization model for XML documents, the semantic structure mapping table of the delivery document is generated by analyzing the structure of the XML content fragments, and the final delivery publication is generated by rendering through the mapping table. The hierarchy of the final delivery document is determined while the semantic structure mapping table is generated. When the final deliverable document is rendered via the mapping table, the XML document fragment is first converted into a fixed-format XML document through preprocessing, then the style of the deliverable publication is selected as needed, and the final deliverable publication is generated by an XSLT transformation.
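
To make the baseline concrete, the following is a minimal sketch of plain SLCA semantics, not the improved algorithm of the thesis, which the abstract does not specify. It assumes that a keyword matches an element when it equals the element's tag or a whitespace-separated token of its text (case-insensitive); the sample document, the function name `slca` and this matching rule are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = """
<library>
  <book><title>XML retrieval</title><author>Ye</author></book>
  <book><title>Digital publishing</title><author>Li</author></book>
  <article><title>XML ranking</title><author>Li</author></article>
</library>
"""


def slca(root, keywords):
    """Return the elements whose subtree contains every keyword
    while no descendant subtree does (the SLCA nodes)."""
    wanted = {k.lower() for k in keywords}
    results = []

    def visit(node):
        # Assumed matching rule: the tag name and the tokens of the
        # element's own text (tail text is ignored in this sketch).
        own = (node.tag + " " + (node.text or "")).lower().split()
        found = wanted & set(own)
        child_complete = False
        for child in node:
            child_found, complete = visit(child)
            found |= child_found
            child_complete = child_complete or complete
        complete = found == wanted
        # A node is an SLCA only if no child subtree already covers
        # all keywords on its own.
        if complete and not child_complete:
            results.append(node)
        return found, complete

    visit(root)
    return results


root = ET.fromstring(DOC)
print([e.tag for e in slca(root, ["XML", "Li"])])  # ['article']
print([e.tag for e in slca(root, ["XML"])])        # ['title', 'title']
```

In this sketch the single-keyword query returns only the bare `title` elements rather than the enclosing `book` or `article`, which illustrates the kind of granularity problem the abstract attributes to SLCA semantics.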
Keywords/Search Tags: XML, retrieval, ranking, content reorganization