
XML Search Engine On The Basis Of The CWM Data Source

Posted on: 2005-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: S Ni
Full Text: PDF
GTID: 2168360155471803
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid and continuous growth of information on the network, the demands on the intelligence of search engines keep rising. Research on intelligent search engines has split into several main directions, and structured search over semi-structured languages such as XML is one of them. Because XML tag vocabularies still lack a unified standard across applications, research on general-purpose XML search engines has run into difficulty. This thesis therefore focuses on developing an XML search engine targeted at a domain-specific standard DTD (Document Type Definition), and attempts to establish a general method for improving search engine performance by using DTD information.

CWM (Common Warehouse Metamodel) was proposed to solve the problems of exchanging and integrating metadata among data warehouse tools. CWM metadata is exchanged as XML files, and the DTD of those files is standardized and unique because it is defined by the OMG. This thesis takes CWM metadata as its concrete target domain and uses it to study, in a general way, XML search engines based on a domain-specific standard DTD.

First, after a detailed analysis of the CWM DTD, the thesis proposes a five-sector coding method for the DTD and a corresponding coding method for XML. The coding ties each XML file closely to its DTD, so that a structural query on an XML file is first transformed into a query on the DTD, which improves search efficiency.

Second, based on the structural information in the DTD, the thesis improves the traditional information retrieval techniques for calculating document relevance, so that the search engine can make better use of the structural information in XML files during content search, which improves the search engine's intelligence.

Finally, the thesis redesigns how files are indexed and searched on the basis of the five-sector DTD coding method, and the four main components of the search engine (data collector, indexer, searcher, and user interface) have all been improved accordingly. The result is a new prototype XML search engine. The tool also serves as the query subsystem of a CWM metadata management system, completing that overall management system.
Keywords/Search Tags: Search engine, XML, CWM, DTD, coding method, relevance calculation