Research and Implementation of Semantic Web-Based Automatic Semantic Annotation for Junior High School Mathematics

Posted on: 2015-07-24  Degree: Master  Type: Thesis
Country: China  Candidate: Z S He  Full Text: PDF
GTID: 2308330473951815  Subject: Computer software and theory
Abstract/Summary:
The internet has become an integral part of daily life and has grown faster than anyone anticipated. The Web's exponentially growing data contains valuable information, but that information is hard to find, use, and manage. Letting computers extract the semantic information in these documents, so that web applications can provide more accurate services, is a practical need that coincides with the goals of the Semantic Web. Semantic annotation, a key technology in the development of the Semantic Web, attaches semantic tags to the massive body of Web documents: using a domain-specific ontology, semantic information is added to Web documents. In other words, the semantic information published with a Web document is its semantic annotation.

This thesis annotates Chinese Web documents in the field of junior high school mathematics. First, it reviews the state of development of semantic annotation technology, including Semantic Web technology, Chinese word segmentation, semantic annotation, and other related theories and technologies.

Second, to filter the documents to be annotated, the thesis proposes an ontology-based distance classification method that calculates a document's semantic degree of polymerization. The web page text is first transformed into a structured document, and Chinese word segmentation then turns the structured document into a vocabulary document. Using the network graph of the junior high school mathematics ontology, the ontology distance between the professional vocabulary items in a vocabulary document is calculated. (Here and below, "professional vocabulary" means junior high school mathematics terminology.) When the ontology distance between two words is within a certain threshold, the two words are assigned to the same cluster.
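The distance-based clustering step can be illustrated with a minimal sketch. The abstract does not define the ontology distance or the clustering rule in detail, so everything here is an assumption: distance is taken to be the shortest path length in an undirected concept graph, clusters are formed transitively (single-link), and the toy ontology, `build_graph`, `ontology_distance`, and `cluster_terms` are hypothetical names that do not come from the thesis.

```python
from collections import deque

# Hypothetical toy ontology: undirected links between mathematics concepts.
ONTOLOGY_EDGES = [
    ("geometry", "triangle"),
    ("triangle", "right triangle"),
    ("right triangle", "Pythagorean theorem"),
    ("geometry", "circle"),
    ("algebra", "equation"),
    ("equation", "linear equation"),
]


def build_graph(edges):
    """Build an adjacency map from a list of undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def ontology_distance(graph, source, target):
    """Shortest path length in edges (assumed distance); None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def cluster_terms(graph, terms, threshold):
    """Group terms whose pairwise distance is within the threshold.

    Grouping is transitive (union-find), which is an assumption; the
    thesis may use a different rule.
    """
    unique = sorted(set(terms))
    parent = {t: t for t in unique}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path compression
            t = parent[t]
        return t

    for i, a in enumerate(unique):
        for b in unique[i + 1:]:
            d = ontology_distance(graph, a, b)
            if d is not None and d <= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for t in unique:
        clusters.setdefault(find(t), []).append(t)
    # Largest clusters first; ties broken alphabetically.
    return sorted(clusters.values(), key=lambda c: (-len(c), c))
```

With a threshold of 1, terms that are directly linked in this toy ontology end up together, while "circle" (two edges away from "triangle") stays in its own cluster. A larger threshold would merge more loosely related terms.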
After the professional vocabulary has been clustered, the proportion of the whole vocabulary document accounted for by the words in the top k clusters is calculated; this proportion is the semantic degree of polymerization. A document with a high degree of polymerization is treated as a junior high school mathematics topic page, and a document with a low degree is discarded.

Third, an ontology-based precursor accumulated statistical algorithm is proposed to extract the deep semantic information of the vocabulary document. The extracted semantic information is added to the structured document as semantic annotation. Once a document has passed the filter, precursor cumulative statistics are computed over the professional vocabulary in its vocabulary document to obtain the frequency of every professional term in the document. The few words then selected by a dedicated algorithm are added, as semantic annotation information, to the original structured document in the form of nodes, which completes the automatic semantic annotation of the document.

Finally, a system for the automatic semantic annotation of Chinese web pages on middle school mathematics has been implemented. Built on the algorithms described above, it comprises the core automatic semantic annotation module and other functional modules, and the thesis compares the results before and after annotation to show the effects and advantages that annotation brings.
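The filtering score and the precursor statistics described above might look roughly like the following sketch. The abstract gives no formulas, so this is one possible reading and not the thesis's algorithm: the degree of polymerization is assumed to be the share of all tokens that fall in the k largest clusters, and "precursor accumulation" is assumed to mean that each occurrence of a term is also counted toward its broader (precursor) concepts in the ontology. The `PRECURSOR` map and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical precursor links: narrower concept -> broader concept.
PRECURSOR = {
    "right triangle": "triangle",
    "Pythagorean theorem": "right triangle",
    "triangle": "geometry",
    "circle": "geometry",
}


def polymerization_degree(clusters, tokens, k):
    """Assumed definition: tokens in the k largest clusters / all tokens."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    sizes = sorted(
        (sum(counts[t] for t in cluster) for cluster in clusters),
        reverse=True,
    )
    return sum(sizes[:k]) / len(tokens)


def precursor_cumulative_counts(tokens, precursor):
    """Count each term and add its count to every precursor above it."""
    vocabulary = set(precursor) | set(precursor.values())
    own = Counter(t for t in tokens if t in vocabulary)
    total = Counter()
    for term, n in own.items():
        node, visited = term, set()
        # Walk up the precursor chain; the visited set guards against cycles.
        while node is not None and node not in visited:
            visited.add(node)
            total[node] += n
            node = precursor.get(node)
    return total


def select_annotations(cumulative, top_n):
    """Pick the top_n terms by cumulative count (ties broken alphabetically)."""
    ranked = sorted(cumulative.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]
```

Under this reading, a page that mentions several specific triangle concepts accumulates weight on "triangle" and "geometry" even if those broader words rarely appear in the text, which is one way the selected annotation terms could reflect deeper semantic information than raw word frequency.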
Keywords/Search Tags: Semantic Web, Ontology-based distance classification, Semantic degree of polymerization, Ontology precursor statistics, Semantic annotation