Research On Ontology-based Semantic Information System

Posted on: 2006-09-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C M Yu
Full Text: PDF
GTID: 1118360182965721
Subject: Information Science
Abstract/Summary:
For researchers in Information Science, the Semantic Information System (SIS) is an important subject of study, for four reasons. First, it can meet the growing demand for knowledge: as the knowledge economy develops, the desire for knowledge becomes ever broader. Second, information systems tend to merge with the Semantic Web. The present Internet is weak in information expression and retrieval because it was designed for humans to read, not for machines. Making the Internet machine-readable would therefore give a strong impetus to the traditional Internet and could transform it. Third, it can make up for the shortcomings of traditional information retrieval. Traditional keyword-based retrieval has met users' needs to some degree, but its results match only at the level of words, not of concepts, because words and concepts are not on the same level. Semantic information retrieval can satisfy the demand for concept-level retrieval. Fourth, it conforms to the trend of information systems shifting from syntax-oriented to semantics-oriented. In traditional information system research, isomorphic information processing and distributed information processing are becoming hot topics. The key to solving these problems is to improve the interoperability of information systems. Semantic interoperability is the core problem of information system interoperability, as "the core of the interoperability of information system will shift from systematic, syntactic and structural to semantic". For these reasons it is very important to develop the Semantic Information System.

Based on the needs listed above, the dissertation defined and developed a new kind of information system: the Semantic Information System.
Based on ontology technology, the dissertation analyzed the composition of information from four aspects: semantic information description, semantic information acquisition, semantic information retrieval and semantic information output. The author also discussed the process, problems and experience of building a practical Semantic Information System: the Guomingdang Gongchandang He Zuo Semantic Information System (GGHZ-SIS).

1. Introduction to Semantic Information System

As the Semantic Information System is a totally new concept, this chapter analyzed its definition, constitution and characteristics, and compared it with the traditional Management Information System (MIS), Competition Intelligence System (CIS), Decision Support System (DSS), Expert System (ES) and so on. It also presented a prototype of SIS with five components: semantic information description, semantic information acquisition, semantic storage, semantic information retrieval and semantic information output. It should be emphasized that the advantages of SIS will fully unfold only when it is fused with the Semantic Web and only when the Semantic Web becomes practical, so there is still a long way to go before the potential of SIS can be fully harvested.

2. The Foundation of Semantic Information System: Ontology

Ontology is the foundation of semantic information description. Semantic information is mainly composed of semantic classes, semantic properties, semantic relations, semantic rules and semantic instances, which can be mapped to the concept, the concept attribute, the concept relations, the rule and the axiom in the ontology. Ontology is also the reference in semantic information extraction, as it can help weigh the importance of semantic information. Ontology is also an auxiliary method in the semantic retrieval process.
As ontology itself has a certain degree of inference ability, it can be used to expand a query and thus make the results more comprehensive. Ontology is also the main form of semantic output. For these four reasons, ontology can be considered the foundation of the Semantic Information System; in this chapter the author therefore analyzed the definition, classification, building methodology and especially the acquisition methods of ontology.

3. Semantic Information Description

The description of semantic information does not start from zero. Over the last ten years of rapid Internet development, people have gained much experience in describing metadata and Internet data, and RDF (Resource Description Framework) is part of that experience. The RDF metadata model is based upon the idea of making statements about resources in the form of a subject-predicate-object expression, called a triple in RDF terminology. The subject is the resource, the "thing" being described. The predicate is the trait or aspect of that resource being described, and often expresses a relationship between the subject and the object. The object is the object of the relationship or the value of that trait. RDF Schema (RDFS) is an extension to RDF that describes how to define RDF vocabularies using RDF itself. It defines, among other things, two important properties, rdfs:subClassOf and rdfs:subPropertyOf. Then comes OWL, the Web Ontology Language, a markup language for publishing and sharing data using ontologies on the Internet. OWL is a vocabulary extension of RDF and is derived from the DAML+OIL Web Ontology Language. After analyzing the current description languages, the author recommended OWL as the best choice.

4. Semantic Information Acquisition

The main task of semantic information acquisition is to extract semantic instances and semantic relationships from unstructured information (text, pictures, audio and video), semi-structured information and structured information. For structured and semi-structured information, it is easy to build a mapping from the formal structure to semantic classes and semantic relations and then do the transformation. The most difficult part of semantic information extraction is therefore natural language processing, especially for the Chinese language. In this chapter, the author described several difficulties in Chinese semantic information acquisition and provided a semantic information acquisition method based on shallow parsing.

5. Semantic Information Retrieval

In this chapter, the author defined semantic information retrieval as "on contrary of traditional information retrieval, it is a new kind of information method, in which the information input, information organization and searching result all have semantic meaning". Based on this definition, the author detailed how to give semantic meaning to the input, organization and output processes.

6. Semantic Information Visualization

The main task of semantic visualized output is to show semantic objects and their relationships to the user. Wehrend has summarized the ways to do the visualization, which include orientation, identify, distinguish, categorize, cluster, distribute, order, compare, associate and relate. For the SIS, however, the author reduced these to three key techniques: Zoom/Pan, Focus/Context and Incremental Navigation. On this basis, the author analyzed several visualization components: TGVizTab, Jambalaya, OntoViz and OntoRama.

7. Design and Realization of GGHZ-SIS

In this chapter, the author gave the details of designing and realizing GGHZ-SIS.
The author described the process and result of using OWL to describe the events, persons, locations, organizations and so on in the field of GGHZ. The author realized the GGHZ semantic information acquisition component in several steps: splitting the paragraph into sentences; word tokenization and part-of-speech tagging; selecting the semantic predicate; selecting the semantic subject based on the semantic predicate; selecting the semantic object based on the semantic predicate; pronoun resolution; time correction and location correction; and finally updating the semantic extraction context. The author provided the technical details of realizing the semantic retrieval component in GGHZ-SIS. Finally, the author provided the details of developing the visualization part of GGHZ-SIS based on TouchGraph.

8. Summarization

In this chapter, the author gave the summarization. (Diagram 30 Table 19)...
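
As a rough illustration of the acquisition steps listed for chapter 7 and of the subject-predicate-object triple described in chapter 3, the following Python sketch turns a short paragraph into triples. It is not the GGHZ-SIS implementation: it works on English text with whitespace tokenization, skips part-of-speech tagging and time and location correction, and the names PREDICATES, PRONOUNS, split_sentences, extract_triple and extract_triples, as well as the sample sentences, are assumptions made for this example.

```python
import re
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical lexicons; the abstract does not list the predicates it uses.
PREDICATES = {"met", "founded", "joined"}
PRONOUNS = {"he", "she", "they", "it"}


def split_sentences(paragraph: str) -> List[str]:
    """Step 1: split a paragraph into sentences on '.', '!' or '?'."""
    parts = re.split(r"[.!?]+", paragraph)
    return [p.strip() for p in parts if p.strip()]


def extract_triple(sentence: str, last_subject: Optional[str]) -> Optional[Triple]:
    """Select the predicate first, then the subject and object around it."""
    tokens = sentence.split()  # naive whitespace tokenization
    for i, tok in enumerate(tokens):
        if tok.lower() in PREDICATES:
            subject = " ".join(tokens[:i])
            obj = " ".join(tokens[i + 1:])
            if not subject or not obj:
                return None
            # Naive pronoun resolution: reuse the previous subject.
            if subject.lower() in PRONOUNS and last_subject:
                subject = last_subject
            return (subject, tok.lower(), obj)
    return None


def extract_triples(paragraph: str) -> List[Triple]:
    """Run the steps over each sentence, carrying a small extraction context."""
    triples: List[Triple] = []
    last_subject: Optional[str] = None  # the context updated after each sentence
    for sentence in split_sentences(paragraph):
        triple = extract_triple(sentence, last_subject)
        if triple:
            triples.append(triple)
            last_subject = triple[0]
    return triples


if __name__ == "__main__":
    # Invented sample text, used only to exercise the sketch.
    for t in extract_triples("Alice founded the club. She met Bob."):
        print(t)
```

Selecting the predicate before the subject and object mirrors the order of steps in the abstract; a real system for Chinese text would need word segmentation and tagging in place of the whitespace split.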
Keywords/Search Tags: Semantic Web, Semantic Information System, Information Extraction, Information Retrieval, Information Visualization