Research And Realization Of Document Process Technology Based On XML

Posted on: 2007-06-07  Degree: Master  Type: Thesis
Country: China  Candidate: Z Z Wu  Full Text: PDF
GTID: 2178360182484225  Subject: Management Science and Engineering
Abstract/Summary:
In recent years, many corporations and organizations have grown larger and larger. At the same time, they need to mine their internal documents to share information and knowledge. Documents within a corporation often come from many kinds of heterogeneous data sources, so how to extract and protect the information in these documents has become a research focus. Research on document processing has produced broadly effective results, but several problems remain: how to extract semantic information from a document, how to separate the abstract data alone from an HTML document, and how to protect precisely the sensitive data in an XML document.

To address these problems, after studying the mapping between various kinds of text and XML, this thesis proposes a flow processing model based on an analysis of document structure, together with a method that parses Word, Excel, and HTML documents and converts them into well-formed XML documents. Rather than improving the mining algorithms, the thesis investigates document processing as a way to improve the efficiency of text mining. Finally, it implements encryption of XML documents.

The work uses Java programming technology, XML programming technology, XML security standards, and security access policies, and it builds on and extends many open-source projects to implement the system's two functions: document exchange and document security. The system's analysis, design, and implementation are described in detail in the thesis.

The aim of the thesis is to solve several practical problems and to apply the results to a project named knowledge mining in macro-layer. The work also builds experience for subsequent work. Once the whole project is integrated, the main package of this system will serve as JavaBeans in the final web application.
Keywords/Search Tags: Document Exchange, Data Extraction, XML Security