
Study On The Transformation Technologies Of Unstructured Data In Enterprise Content Management

Posted on: 2011-07-21
Degree: Master
Type: Thesis
Country: China
Candidate: L M Ding
Full Text: PDF
GTID: 2178360308950285
Subject: Software engineering
Abstract/Summary:
Enterprise Content Management (ECM), the set of technologies for managing unstructured data, is becoming an important part of information management as the volume of unstructured and semi-structured data and the corresponding customer demand grow rapidly. Most major ECM providers include their own transformation technologies for unstructured data in their ECM products. However, these technologies support only a limited number of document types and lack flexibility and customizability in the transformation process, so they are not widely applied in ECM. In addition, other systems and software generate a large number of Page Description Language (PDL) documents. To manage this unstructured data effectively with ECM, a highly automated, high-performance transformation tool is needed.
To address the problems of existing transformation technologies and products, and to meet outstanding customer needs, this paper proposes a general document model that facilitates transformation from one type of unstructured data to another. The model is derived from a study of many types of PDL document, the most commonly used class of unstructured data. First, the paper introduces common types of PDL document and compares their merits and demerits. Next, it identifies the major common elements of PDL documents by analyzing them, and designs the general document model with a uniform interface, presented as UML (Unified Modeling Language) class diagrams. The general document model is intended to be able to hold all content of the different types of PDL document. The paper then designs a transformation system framework based on this intermediate data model, using workflow and multithreading technologies, which give the framework high flexibility and higher performance.
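The idea of a general document model with a uniform interface can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the class names (`Document`, `Page`, `TextRun`, `Reader`, `Writer`), the toy input format, and the fixed page size are all hypothetical. It only shows how one format-neutral model lets any reader be paired with any writer.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class TextRun:
    """A positioned run of text on a page (hypothetical element type)."""
    x: float
    y: float
    text: str


@dataclass
class Page:
    width: float
    height: float
    elements: list = field(default_factory=list)


@dataclass
class Document:
    """Format-neutral intermediate model shared by all readers and writers."""
    pages: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


class Reader(ABC):
    """Uniform interface: parse one input format into the general model."""
    @abstractmethod
    def read(self, data: str) -> Document: ...


class Writer(ABC):
    """Uniform interface: serialize the general model to one output format."""
    @abstractmethod
    def write(self, doc: Document) -> str: ...


class ToyLineReader(Reader):
    """Toy input format: one text run per line, form feed starts a new page."""
    def read(self, data: str) -> Document:
        doc = Document()
        for raw_page in data.split("\f"):
            page = Page(width=612.0, height=792.0)  # assumed page size
            for i, line in enumerate(raw_page.splitlines()):
                page.elements.append(TextRun(x=72.0, y=720.0 - 14.0 * i, text=line))
            doc.pages.append(page)
        return doc


class PlainTextWriter(Writer):
    """Toy output format: page texts separated by a '---' line."""
    def write(self, doc: Document) -> str:
        pages = ["\n".join(e.text for e in p.elements) for p in doc.pages]
        return "\n---\n".join(pages)


def transform(data: str, reader: Reader, writer: Writer) -> str:
    """Convert between formats by going through the intermediate model."""
    return writer.write(reader.read(data))
```

With N input formats and M output formats, this arrangement needs N readers and M writers rather than N×M direct converters, which is one plausible reason to route every transformation through a single intermediate model.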
The paper then describes the core modules of the transformation system framework and presents its working principle: preparing the system configuration parameters and transformation job project files; starting the system; preparing, starting, running, and ending jobs; and, as the final step, shutting down the system. It also briefly introduces the design of system tracking and reporting, including its XML (eXtensible Markup Language) parameters and its communication with the GUI (Graphical User Interface). Next, the paper describes the component design, whose goal is to give the system high flexibility and extensibility. To achieve this goal, the paper defines the major internal data structures and methods of the basic component class and explains in detail how a component works, including starting, running in single-threaded and multithreaded modes, stopping, and releasing. It then classifies components into three major types (input, process, and output) and shows the different characteristics of each type. In addition, the paper introduces the transformation system prototype based on the above designs, including the prototype's objectives, the design outcomes, and the development standards, tools, and phases, and it presents the prototype's simple GUIs with brief descriptions. It then evaluates the prototype by comparing it with an existing similar product and by considering its new benefits. From this evaluation, the paper concludes that the prototype solves the major demerits of the existing product very well but still needs improvement in transformation performance and output quality, and it lists some remaining problems and improvement issues. Finally, besides summarizing the whole paper, it discusses applying new technologies to the transformation system in the future.
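The component life cycle described above (starting, running in single-threaded or multithreaded mode, stopping, releasing) and the input, process, and output roles can be sketched as follows. This is a hedged illustration only: the `Component` class, its `workers` parameter, the queue-based hand-off, and the `run_job` helper are assumptions of this sketch, not the thesis's actual design.

```python
import queue
import threading

_STOP = object()  # sentinel telling one worker thread to exit


class Component:
    """Hypothetical base component: start, run, stop, release."""

    def __init__(self, name, workers=1):
        self.name = name
        self.workers = workers      # 1 = single-threaded, >1 = multithreaded
        self.inbox = queue.Queue()  # items waiting to be handled
        self.downstream = None      # next component in the workflow, if any
        self._threads = []

    def handle(self, item):
        """Override in subclasses; the default passes the item through."""
        return item

    def start(self):
        for _ in range(self.workers):
            t = threading.Thread(target=self._run)
            t.start()
            self._threads.append(t)

    def _run(self):
        while True:
            item = self.inbox.get()
            if item is _STOP:
                return
            result = self.handle(item)
            if self.downstream is not None:
                self.downstream.inbox.put(result)

    def stop(self):
        # One sentinel per worker; queued items are handled first (FIFO).
        for _ in self._threads:
            self.inbox.put(_STOP)
        for t in self._threads:
            t.join()

    def release(self):
        self._threads.clear()


class UpperCase(Component):
    """Example process component."""
    def handle(self, item):
        return item.upper()


class Collect(Component):
    """Example output component; single-threaded so results need no lock."""
    def __init__(self, name):
        super().__init__(name, workers=1)
        self.results = []

    def handle(self, item):
        self.results.append(item)
        return item


def run_job(items, stages):
    """Wire the stages into a chain, feed the items, then shut down in order."""
    for upstream, downstream in zip(stages, stages[1:]):
        upstream.downstream = downstream
    for stage in stages:
        stage.start()
    for item in items:
        stages[0].inbox.put(item)
    for stage in stages:  # stopping upstream first drains the chain in order
        stage.stop()
    for stage in stages:
        stage.release()
```

A job could then be run as `run_job(["a", "b"], [Component("input"), UpperCase("process", workers=3), Collect("output")])`. Because the process stage uses several threads, the order in which results reach the output stage is not guaranteed.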
Keywords/Search Tags: ECM, unstructured data, transformation, PDL document, workflow, multithreading