
Research on Ontology-Based Web Information Integration in the Manufacturing Domain

Posted on: 2009-10-25
Degree: Master
Type: Thesis
Country: China
Candidate: F R Kong
GTID: 2178360245959620
Subject: Computer software and theory
Abstract/Summary:
With the development of Internet and Web technology, the WWW has grown into a giant dynamic information service network comprising many information sources and websites, and it provides valuable information to users. People can easily access all kinds of information over the Internet. However, Web information is unstructured and lacks clear semantics, which makes it difficult for users to find useful information. Current search engines rely on keyword matching, so users cannot find information quickly and precisely. To use the information sources on the Web effectively, researchers proposed the concept of Web information integration, which turns the Web into a knowledge base available at any time and offers users a completely transparent, intelligent and uniform information access interface.

The term ontology originally referred to describing the essence of things; with the development of artificial intelligence, it has been redefined. In Web information integration, an ontology is often used to standardize the concepts and terms of one or more domains, giving heterogeneous Web data sources a uniform vocabulary. This reduces the semantic conflicts that arise when Web data sources use different names, and in some cases resolves the semantic heterogeneity altogether. It also improves the accuracy of the system and provides more valuable information to users.

This thesis is part of the Guangxi Science Research and Technology Development Project (No. 0719001-11). Taking automobile Web information in the manufacturing domain as an example, the thesis researches and develops an ontology-based automobile Web information integration system. The research follows the thread of ontology-based Web information integration, covering the construction of the domain ontology and semantics-based Web information extraction and search.
The main tasks and contributions of the thesis are as follows.

First, we construct an ontology model of the automobile information domain using the OWL DL language. After analyzing the features of the Web sites, and based on how a Web page maps to a DOM tree, we locate the relevant regions of each page and extract their text content using the column keywords of the ontology.

Second, the extracted page text is converted into semantic information. Building on the traditional vector space model and the domain ontology, a concept eigenvector is generated, with concept weights computed in combination with the hierarchical structure of the ontology. The instances of the ontology knowledge base are thus created semi-automatically. Because it draws on the ontology, the concept eigenvector carries more exact semantics; as a result, the dimension of the vector model is reduced and the computational complexity decreases. Moreover, the text of unstructured Web pages is turned into semantically structured, machine-understandable information.

Third, on the basis of the established domain ontology, this thesis designs an ontology-based query reasoning algorithm. Working on the OWL ontology, the algorithm uses logical reasoning to expand the concepts of the query keywords and then matches instances. To show users the most satisfactory results, the thesis weights the expanded concepts and designs an ontology-based similarity ranking algorithm, which performs better than the traditional Vector Space Model.

Finally, building on these key techniques, this thesis implements a prototype Web information integration platform for the ontology-based automobile information domain.
The platform uses the reasoning service provided by a DL reasoning engine to realize semantics-based Web information extraction and query reasoning. The system is then evaluated and analyzed. The results show that these methods are technically feasible and have prospects for practical application.
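
The abstract does not give the weighting or ranking formulas, so the following is only a minimal sketch of the general idea: terms are mapped to ontology concepts, concept weights take the hierarchy level into account, a query concept is expanded to its subclasses, and documents are ranked by cosine similarity. The toy ontology (`PARENT`), the term mapping (`TERM_TO_CONCEPT`), the depth-based weight `1 + depth`, and the `decay` factor are all assumptions made for illustration, not the thesis's actual method.

```python
import math
from collections import Counter

# Hypothetical toy ontology: each concept maps to its parent concept.
PARENT = {
    "Sedan": "Car",
    "SUV": "Car",
    "Car": "Vehicle",
    "Truck": "Vehicle",
    "Vehicle": None,
}

# Hypothetical mapping from surface terms to ontology concepts.
# Synonyms collapse onto one concept, which shrinks the vector dimension.
TERM_TO_CONCEPT = {
    "sedan": "Sedan", "saloon": "Sedan", "suv": "SUV",
    "car": "Car", "automobile": "Car",
    "truck": "Truck", "lorry": "Truck",
}


def depth(concept):
    """Number of edges between a concept and the ontology root."""
    d = 0
    while PARENT[concept] is not None:
        concept = PARENT[concept]
        d += 1
    return d


def concept_vector(text):
    """Build a concept vector; deeper (more specific) concepts weigh more."""
    tokens = text.lower().split()  # naive whitespace tokenization
    counts = Counter(TERM_TO_CONCEPT[t] for t in tokens if t in TERM_TO_CONCEPT)
    return {c: n * (1 + depth(c)) for c, n in counts.items()}


def expand_query(concept, decay=0.5):
    """Expand a query concept to all subclasses, halving the weight per level."""
    weights = {concept: 1.0}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for child, parent in PARENT.items():
            if parent == current:
                weights[child] = weights[current] * decay
                frontier.append(child)
    return weights


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_concept, docs):
    """Return names of matching documents, most similar first."""
    query = expand_query(query_concept)
    scored = [(cosine(query, concept_vector(text)), name)
              for name, text in docs.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]


docs = {
    "a": "new sedan saloon review",
    "b": "truck and lorry prices",
    "c": "automobile car news",
}
print(rank("Car", docs))
```

In this sketch a query for `Car` also retrieves the sedan page through subclass expansion, while the truck page is excluded. A real system would obtain the subclass closure from a DL reasoner rather than a hand-written parent table.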
Keywords/Search Tags: Ontology, OWL, Web Information Integration, Information Extraction, Query Expansion