A Study and Application of the Management of Unstructured Data (MUD) Based on XML

Posted on: 2010-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: L Wen
Full Text: PDF
GTID: 2198330338482174
Subject: Software engineering
Abstract/Summary:
With social progress and the development of science and technology, and especially with the widespread use of the Internet, people have to deal with more and more information. Statistics from Forrester Research show that only 20% of information is effectively stored in structured databases; the remaining 80% exists as unstructured data scattered throughout business processes and the external environment. The wide use of computers has enabled enterprises and organizations to accumulate a massive amount of critical business data, but the capacity to process it lags far behind. Unstructured information is an important source for decision-making, so decision-makers in government and enterprises need this data integrated and want to make the best use of it. How to manage unstructured data effectively and mine the value of its internal relationships is therefore the issue that needs to be solved.
Traditional data management, especially that based on relational databases, only scratches the surface of this problem. Current methods for processing unstructured information focus only on electronic documents and lack management of the unstructured information itself. Moreover, companies that run several computer information systems, such as a personnel supervision system based on a relational database, a payroll system, and Web information management systems, face further difficulties: these systems are supported by different technologies, deal with different objects, and have different methods of operation, which poses a major problem for the integration of enterprise information. Several approaches to unstructured data management are analyzed in this thesis. However, these methods are too complex, too inefficient, or too expensive to use. As a result, an economical, simple, and feasible way of managing unstructured data is needed.
The emergence of XML offers a solution to this problem. By analyzing the structural characteristics of unstructured data such as Word documents, Excel documents, and web pages, and reading the data with a series of conversion tools, unstructured data is converted into XML documents, so that the management of unstructured data becomes the management of semi-structured data. A DTD or Schema is then used to convert the XML documents into an object or relational database, which achieves structured management of the unstructured data. On the basis of the theory of unstructured data management, this thesis studies in depth the XML files and the relational database generated by the conversion, with the purpose of extracting knowledge from the unstructured data. Experiments show that this method is economical, practical, and effective, as demonstrated by its application in the System of Comprehensive Quality Evaluation for middle school students of Changsha.
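The two-stage conversion described in the abstract (source data to XML, then XML to a relational table) can be illustrated with a minimal sketch. It is not the thesis's implementation: the sample rows, the `records`/`record` element names, and the functions `rows_to_xml`, `xml_to_table`, and `convert` are all hypothetical, and a fixed column list stands in for the DTD or Schema that the thesis uses to drive the mapping. The rows are assumed to have already been read from a spreadsheet by some conversion tool.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical rows, standing in for cells already read from a spreadsheet.
HEADER = ["student", "subject", "score"]
ROWS = [
    ["Li Lei", "math", "92"],
    ["Han Mei", "physics", "88"],
]


def rows_to_xml(header, rows):
    """Stage 1: wrap each row in a <record> with one child element per column."""
    root = ET.Element("records")
    for row in rows:
        record = ET.SubElement(root, "record")
        for name, value in zip(header, row):
            ET.SubElement(record, name).text = value
    return root


def xml_to_table(root, conn, columns):
    """Stage 2: shred <record> elements into a relational table.

    Assumption: `columns` is a trusted, fixed list (a stand-in for a schema),
    so it is safe to interpolate the names into the SQL text.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS records ({column_list})")
    for record in root.findall("record"):
        values = [record.findtext(name) for name in columns]
        conn.execute(
            f"INSERT INTO records ({column_list}) VALUES ({placeholders})",
            values,
        )
    conn.commit()


def convert(header, rows):
    """Run both stages; return the XML text and the rows stored in the table."""
    root = rows_to_xml(header, rows)
    xml_text = ET.tostring(root, encoding="unicode")
    conn = sqlite3.connect(":memory:")
    xml_to_table(root, conn, header)
    query = f"SELECT {', '.join(header)} FROM records ORDER BY rowid"
    stored = conn.execute(query).fetchall()
    conn.close()
    return xml_text, stored


if __name__ == "__main__":
    xml_text, stored = convert(HEADER, ROWS)
    print(xml_text)
    print(stored)
```

Once the data is in a table, it can be queried with ordinary SQL, which is the sense in which the XML intermediate turns an unstructured-data problem into a structured one. A fuller version would validate the XML against a DTD or Schema before loading it, rather than trusting a hard-coded column list.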
Keywords/Search Tags: Unstructured Data, XML, Relational Database, Data Storage