Restricted Ontology Similarity

Posted on: 2009-03-16

Degree: Doctor

Type: Dissertation

Country: China

Candidate: Z W He

Full Text: PDF

GTID: 1118360245970111

Subject: Computer application technology

Abstract/Summary:

With the growth of the Internet, more and more regions are connected and the amount of information online has increased explosively. Retrieving the correct information from this volume is difficult, even with current technologies such as Web directories and search engines. The aim of the Semantic Web is to find a way for computers to understand human intent, analyze Web information, and find the correct information for people.

The Semantic Web is not an independent concept but an extension of the current Web; it comprises knowledge description, ontologies, and agents. An ontology is a collection of concepts together with the properties of those concepts and the relations among them. Ontologies in the Semantic Web are founded on Description Logic, so they support simple inference, which gives the Semantic Web stronger descriptive capability than the traditional Web. In addition, computers can exchange knowledge through ontologies, and people, including domain experts, can communicate with computers through them.

Ontology is the core of the Semantic Web, so applying ontologies to Internet information is currently an important research area, covering ontology annotation, ontology collaboration, ontology construction, and ontology-based machine learning. Several problems have not yet been addressed:
1) Ontology applications are scarce, because most work is still at the stage of theoretical research.
2) Current ontology applications cannot describe complex information.
3) Ontology applications are founded on the computer's comprehension of ontologies.
However, research on how computers comprehend ontologies is lacking.

Focusing on the computer's comprehension of ontologies, this work proposes a new theory for describing information with ontologies and applies it to an ontology similarity matching algorithm and to ontology applications.

Ontologies are used for communication between computers not only within a restricted domain but also across different domains, so they play an important role in knowledge representation. An ontology-based system contains many ontologies, so understanding ontologies is central to a computer's understanding of the Semantic Web. Operations between ontologies are based on the similarity between whole ontologies or parts of them. Ontology similarity algorithms focus mainly on the entities in an ontology and compare them from the perspective of graph theory, character strings, or syntax. Applications of ontology similarity include ontology mapping, ontology integration, ontology comparison, ontology extension, ontology modularization, and the discovery and composition of Web services.

Existing ontology similarity algorithms compute similarity mainly from the following eight aspects:
1) Character strings;
2) Meaning or natural language;
3) The comparable properties of atomic concepts;
4) The types of atomic concepts and the relations between them;
5) Ontology structure or the ontology graph;
6) Ontology inference;
7) Machine learning;
8) Application view.

Research on ontology similarity is at an early stage, and many aspects still need to be defined and studied. Only when ontology similarity is fully defined can computers understand information and communicate with each other on the basis of ontology descriptions of that information. In addition, the performance and Quality of Service (QoS) of ontology similarity remain unexplored research areas.
Development and efficiency are the key points in ontology similarity.

Each document in the Semantic Web is an ontology, and these ontologies can be combined into larger ontologies. Likewise, each ontology can be divided into several sub-ontologies. An ontology, or a series of ontologies, can describe the knowledge of an independent domain in the Semantic Web; we call these Domain Ontologies. A domain ontology defines all the basic concepts, the attributes of those concepts, and the relations among them. A Restricted Ontology, by contrast, inherits from a domain ontology and is used to describe detailed information; it cannot create new classes, properties, or relations, because those have already been defined in the domain ontology.

After information has been represented with ontologies, we need to identify positive and negative ontologies through ontology similarity. Existing algorithms usually obtain similarity by comparing ontology syntax. However, for restricted ontologies that share the same domain ontology, a new and more appropriate similarity algorithm is needed to assess them. Moreover, current ontology similarity algorithms do not exploit the inference capability of ontologies, because applying inference may lead the algorithm into an infinite loop. The ontology similarity algorithm that we propose makes full use of ontology inference while avoiding infinite loops, because it uses only first-layer and second-layer inference.

Ontology similarity ultimately serves an application or service, so the comparable properties of the concepts in an ontology, and their weights, should be determined by the application. The two-stage definition not only exploits ontology inference but also avoids circular computation.

Information extraction combines Natural Language Processing and Artificial Intelligence. An information extraction system extracts specific information and saves it to a database.
The key technologies in information extraction include natural language processing, named entity recognition, document analysis and inference, and knowledge acquisition. Information extraction consists of a learning process and an application process: the learning process learns from a document set in a specific domain, and the application (testing) process applies the learned knowledge to unseen documents.

Existing information extraction algorithms all operate on documents annotated by natural language processing tools, and by their learning process they can be divided into rule learning, classification learning, and statistical learning. The three classes are interdependent; for example, rule learning may also use statistics to find the best rule.

A new information extraction algorithm called OERM is proposed. It re-describes documents annotated by natural language processing tools with ontologies, thereby extending the relations in the annotated documents. Machine learning is then used to learn from these ontologies and apply the knowledge to new documents. In this work, Support Vector Machines (SVM) and Artificial Neural Networks (ANN) are used as the machine learning tools.

The key elements of the algorithm are ontology inference and machine learning, and its main parts are ontology representation and ontology comparison. Traditional rule-based and statistical algorithms do not extend the source document with ontologies. Moreover, simple ontology inference can derive new relations from the restricted ontology. Finally, SVM and ANN are well suited to applications with sparse data and considerable noise.

The experimental results show that our system achieves competitive performance and outperforms recent information extraction systems. Moreover, OERM learns sufficient knowledge rapidly and provides good performance and a sharp learning curve on small training sets.
This means that OERM mines more internal relations to overcome data sparseness in small training sets, while machine learning filters out the data noise. However, the recall of the algorithm is relatively low, so extension to unseen information is a key point for improvement. We also apply restricted ontology theory to Web service search, and we built a Chinese information extraction experimental system to verify our algorithm.
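
As an illustration only, the idea of comparing two restricted ontologies that share one domain ontology can be sketched in Python. Everything named here is an assumption made for the sketch and is not taken from the dissertation: the toy `SUPERCLASS` hierarchy, the functions `expand`, `value_similarity` and `restricted_similarity`, and the use of Jaccard overlap as the per-property measure. The sketch shows the two points the abstract does state: reasoning is cut off after two layers, so it cannot loop forever, and the comparable properties and their weights are supplied by the application.

```python
from typing import Dict, Set

# Hypothetical domain ontology: each class maps to its direct superclass.
SUPERCLASS: Dict[str, str] = {
    "Laptop": "Computer",
    "Desktop": "Computer",
    "Computer": "Device",
    "Phone": "Device",
    "Device": "Thing",
}


def expand(value: str, max_depth: int = 2) -> Set[str]:
    """Return value plus the superclasses reachable in at most max_depth steps.

    The depth bound (two layers by default) and the seen-set both guarantee
    termination, even if the hierarchy contained a cycle.
    """
    seen = {value}
    current = value
    for _ in range(max_depth):
        parent = SUPERCLASS.get(current)
        if parent is None or parent in seen:
            break
        seen.add(parent)
        current = parent
    return seen


def value_similarity(a: str, b: str, max_depth: int = 2) -> float:
    """Jaccard overlap of the depth-limited expansions (an assumed measure)."""
    ea, eb = expand(a, max_depth), expand(b, max_depth)
    return len(ea & eb) / len(ea | eb)


def restricted_similarity(
    x: Dict[str, str],
    y: Dict[str, str],
    weights: Dict[str, float],
    max_depth: int = 2,
) -> float:
    """Weighted average of per-property similarities.

    x and y map property names (defined by the shared domain ontology) to
    values; weights lists the comparable properties chosen by the application.
    A property missing from either side contributes zero.
    """
    total = sum(weights.values())
    if total == 0:
        return 0.0
    score = 0.0
    for prop, weight in weights.items():
        if prop in x and prop in y:
            score += weight * value_similarity(x[prop], y[prop], max_depth)
    return score / total


if __name__ == "__main__":
    a = {"type": "Laptop", "brand": "Acme"}
    b = {"type": "Desktop", "brand": "Acme"}
    print(restricted_similarity(a, b, {"type": 2.0, "brand": 1.0}))
```

In this toy setup, "Laptop" and "Desktop" share two of the four classes reachable within two layers, so the `type` property scores 0.5, while the identical `brand` values score 1.0. Changing the weights changes the overall result without touching the ontology, which is the sense in which the application decides what counts as similar.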

Keywords/Search Tags:

Ontology, Ontology Similarity, Restricted Ontology, Ontology Combination, Information Extraction, Machine Learning

Related items

1	Research On Some Issues Of Ontologies
2	A Study On Web-Based Domain Independent Ontology Learning
3	Study On The Theory And Practice Of Ontology And Ontology-based Agricultural Document Retrieval System--Floricultural Ontology Modeling
4	Research Of Ontology Approaches And Their Applications In Spatio-temporal Reasoning
5	Research On Product Design Ontology Management And Application Based On Semantics
6	An Approach For Measuring And Comparing Structural Semantics Of Ontologies Based On Graph Derivation
7	Research On Automatic Ontology Construction Based On Relational Database
8	Ontology Learning Similarity Algorithm In Sparse Vector And Multi-dividing Setting
9	Imprecise Ontology Merging Research
10	Ontology-Based Structured Information Extraction From Web Pages