
Automatic Extraction Of Uyghur Ontology Concept Classification Relationship Based On Seed Bootstrap

Posted on: 2015-11-03  Degree: Master  Type: Thesis
Country: China  Candidate: Q Q Yang
GTID: 2298330431991877  Subject: Computer application technology
Abstract/Summary:
Uyghur ontology learning lays the foundation for Uyghur semantic information processing, and extracting classification (taxonomic) relationships between Uyghur ontology concepts is one of its basic tasks. Because Uyghur is an agglutinative language, how to obtain the classification relationships between Uyghur ontology concepts is an important research problem.

Among methods for extracting ontology classification relationships, pattern matching is simple, practical, and highly accurate. However, current pattern acquisition models still have several deficiencies. First, existing pattern self-learning methods require the seed concept pairs to be selected manually; seed quality directly affects pattern learning, and both the automation of seed selection and the portability of seeds need further improvement. Second, some learning methods depend on an external knowledge base whose representation is usually fixed, so they adapt poorly to a constantly updated Semantic Web. Third, pattern matching can only extract concept pairs that co-occur in the same sentence; it cannot extract indirect classification relationships between concepts that appear in different sentences.

A search of the related literature found no reports on extracting classification relationships between Uyghur ontology concepts. Because of the agglutinative character of Uyghur, existing methods developed for other languages are difficult to apply directly and must be adapted to the characteristics of Uyghur. To address these issues, this thesis presents a seed-driven method for automatically extracting Uyghur ontology classification relationships.

First, a generalized suffix tree is constructed over the extracted Uyghur domain concepts, classification relationships among compound-word concepts are extracted from it, and these relationships are added to the initial set of classification relationships. Next, concept pairs whose confidence and support exceed predetermined thresholds are selected as seeds, initial classification-relation patterns are learned from them, and these patterns are added to the initial pattern set. The existing patterns are then used to learn more classification relationships, from which new seed concept pairs are selected in the iteration phase to acquire new patterns, so that relationships and patterns are learned iteratively. During iteration, the pattern set and the relationship set keep expanding until the termination condition is satisfied.

Using the generalized suffix tree to extract classification relationships among compound words provides the basis for selecting seed concept pairs, and the relationships extracted this way make up for the limitation of extracting only concept pairs that co-occur in a sentence. Experiments showed that the degree of automation, accuracy, and recall improved significantly.
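The iterative procedure described in the abstract can be illustrated with a small Python sketch. It is not the thesis's implementation; it rests on several simplifying assumptions. A plain word-level suffix check stands in for the generalized suffix tree. The head of a compound is assumed to be its final word. Seed scoring by confidence and support is reduced to a simple count of how often a pattern occurs. Stemming of Uyghur word forms is skipped. English toy sentences stand in for a Uyghur corpus. All function names (`compound_hypernyms`, `extract_patterns`, `apply_patterns`, `bootstrap`) and the `min_support` and `max_iters` parameters are hypothetical.

```python
from collections import defaultdict


def compound_hypernyms(concepts):
    """Pair each multi-word concept with a known concept that is its word-level suffix.

    Assumption: the final word(s) of a compound name its parent concept.
    """
    known = set(concepts)
    pairs = set()
    for concept in concepts:
        words = concept.split()
        for i in range(1, len(words)):
            suffix = " ".join(words[i:])
            if suffix in known:
                pairs.add((concept, suffix))  # (hyponym, hypernym)
                break  # keep the longest matching suffix only
    return pairs


def extract_patterns(sentences, pairs):
    """Count the text found between a hyponym and a later hypernym in each sentence."""
    counts = defaultdict(int)
    for s in sentences:
        for hypo, hyper in pairs:
            i = s.find(hypo)
            if i < 0:
                continue
            j = s.find(hyper, i + len(hypo))
            if j >= 0:
                counts[s[i + len(hypo):j]] += 1
    return counts


def apply_patterns(sentences, concepts, patterns):
    """Find new (hyponym, hypernym) pairs that match a learned pattern."""
    found = set()
    for s in sentences:
        padded = f" {s} "  # padding gives a crude whole-word match
        for p in patterns:
            for a in concepts:
                for b in concepts:
                    if a != b and f" {a}{p}{b} " in padded:
                        found.add((a, b))
    return found


def bootstrap(sentences, concepts, min_support=1, max_iters=10):
    """Alternate pattern learning and relation extraction until nothing new appears."""
    relations = compound_hypernyms(concepts)  # initial relation set
    patterns = set()
    for _ in range(max_iters):
        counts = extract_patterns(sentences, relations)
        new_patterns = {p for p, n in counts.items()
                        if n >= min_support and p.strip()}
        new_relations = apply_patterns(sentences, concepts, new_patterns)
        if new_patterns <= patterns and new_relations <= relations:
            break  # termination: no new pattern and no new relation
        patterns |= new_patterns
        relations |= new_relations
    return relations, patterns


if __name__ == "__main__":
    toy_concepts = ["tree", "suffix tree", "trie", "graph"]
    toy_sentences = [
        "a suffix tree is a kind of tree",
        "a trie is a kind of tree",
        "a tree is a kind of graph",
    ]
    rels, pats = bootstrap(toy_sentences, toy_concepts)
    print(sorted(rels))
    print(pats)
```

In this toy run the only initial relation comes from the compound "suffix tree"; the pattern learned from it then recovers the other two pairs, which mirrors how the compound-word relations serve as automatically obtained seeds.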
Keywords/Search Tags: generalized suffix tree, seed bootstrapping, Uyghur ontology, stem extraction, taxonomic relationship