A Unified Document Model And Its Application On Document Format Converting

Posted on: 2011-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: Z W Guan
Full Text: PDF
GTID: 2178360305461915
Subject: Computer system architecture
Abstract/Summary:
This thesis summarizes the significance of documents in the information domain and the nature of documents. It clarifies the definition of a document and its meaning by comparing several commonly used definitions. It analyzes context and its definition, and investigates the critical role that context plays in the meaning of a document. A document is analyzed along its logical dimension, granularity, temporal dimension and spatial dimension. The logical structures of documents are parsed, and the relationship and differences between the primary structure and the secondary structure are identified. The relationship between the granularity of contents and the structure of documents is analyzed. The temporal and spatial dimensions are also described. Together, these analyses help to establish a unified document model, which provides a theoretical framework for document processing and representation.

Based on the analysis of document structure, the thesis describes the establishment of a unified document model, focusing on the construction of the model's encoding format. The format is based on XML, an open format, and contains three main parts: the metadata, the contents and the style of the document. The thesis illustrates how text data encoded in the unified document format is stored, and how a unified document is compressed for storage.

As an application, the thesis presents conversion between the unified document model and ODF, PDF, HTML and TXT documents, and discusses the techniques used and the implementation method of the conversion. The results of actual conversion processes illustrate the problems that need more attention and the details of the model that need to be improved.
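The abstract does not reproduce the thesis's schema, so the following is only a minimal sketch of what an XML document with the three parts named above (metadata, contents, style) might look like, together with compressed storage and conversion to TXT and HTML. All element names, attribute names and function names are hypothetical, and `zlib` is used here as a stand-in for whatever compression method the thesis actually adopts.

```python
import xml.etree.ElementTree as ET
import zlib
from html import escape


def build_document(title, paragraphs):
    """Build a hypothetical unified document with three parts:
    metadata, contents and styles (names are assumptions)."""
    root = ET.Element("document")

    # Part 1: metadata describing the document as a whole.
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "title").text = title

    # Part 2: contents, here a flat list of paragraphs that
    # each refer to a named style.
    contents = ET.SubElement(root, "contents")
    for i, text in enumerate(paragraphs):
        p = ET.SubElement(contents, "paragraph", {"id": f"p{i}", "style": "body"})
        p.text = text

    # Part 3: style definitions, kept separate from the contents.
    styles = ET.SubElement(root, "styles")
    ET.SubElement(styles, "style", {"name": "body", "font-size": "12pt"})
    return root


def pack(root):
    """Serialize the XML tree and compress it for storage."""
    return zlib.compress(ET.tostring(root, encoding="utf-8"))


def unpack(blob):
    """Decompress stored bytes and parse them back into an XML tree."""
    return ET.fromstring(zlib.decompress(blob))


def to_txt(root):
    """Convert to plain text: one line per paragraph, styles dropped."""
    return "\n".join(p.text or "" for p in root.iter("paragraph"))


def to_html(root):
    """Convert to a bare HTML page, escaping text content."""
    title = root.findtext("metadata/title", default="")
    body = "".join(f"<p>{escape(p.text or '')}</p>" for p in root.iter("paragraph"))
    return f"<html><head><title>{escape(title)}</title></head><body>{body}</body></html>"


if __name__ == "__main__":
    doc = build_document("Demo", ["First paragraph.", "Second paragraph."])
    restored = unpack(pack(doc))
    print(to_txt(restored))
    print(to_html(restored))
```

Keeping style definitions apart from the contents means a target format that cannot express styling, such as TXT, can simply ignore the third part, which hints at why conversions to less expressive formats lose information.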
Keywords/Search Tags: Document Model, Document Structure, Context, Format Converting