Automatic Extraction Of Uyghur Ontology Concept Based On Dynamic Weighted Multi-strategy

Posted on: 2014-02-03
Degree: Master
Type: Thesis
Country: China
Candidate: H N Zhang
GTID: 2248330398967932
Subject: Computer application technology
Abstract/Summary:
A Uyghur ontology knowledge base is the foundation of Uyghur semantic information processing, and Uyghur ontology concept extraction is one of the most important basic tasks in building such a knowledge base.

Existing single-method approaches and static weighted multi-strategy fusion approaches have improved extraction accuracy, but when faced with the varied corpus types that occur in practice, their results do not meet the needs of practical applications. The main problems are as follows. First, the rule templates of current methods do not provide full coverage, and as a direct result the extraction rate for multi-word concepts is low. Second, static weighting considers only the characteristics of each method and not the corpus type, so it cannot exploit each method's strengths across different corpora and cannot accurately reflect the domain membership of a concept. In addition, our survey found no prior research on Uyghur ontology concept extraction, so applying current methods to Uyghur requires corresponding adaptation.

To solve these problems, this thesis proposes a method for the automatic extraction of Uyghur ontology concepts based on dynamic weighted multi-strategy integration. The method learns Uyghur rules automatically and matches them against the corpus after stemming and tagging, which filters the candidate concepts. Then, taking into account the characteristics and capabilities of improved DR&DC, TF-IDF, and NC-Value, the three strategies are integrated to rank the candidate concepts by degree of domain membership, and concepts whose weight exceeds a threshold are placed in the final concept set. In experiments on a computer-domain corpus of 300 documents (100 theses and 200 free texts), the extraction accuracy reached 89.7%.
Keywords/Search Tags:dynamic weight, multi-strategy, Uyghur ontology, concepts extraction
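
The abstract does not give the fusion formula, so the following is only a minimal sketch of what a corpus-dependent weighted combination of three strategy scores followed by a threshold filter could look like. Everything specific here is an assumption for illustration: the weight table `WEIGHTS_BY_CORPUS`, its values, the min-max normalization, the function names, and the sample scores are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Tuple

# Hypothetical weights per corpus type (illustrative values, not from the thesis).
# "Dynamic" is modeled here simply as choosing a weight set by corpus type.
WEIGHTS_BY_CORPUS: Dict[str, Dict[str, float]] = {
    "thesis": {"tfidf": 0.2, "dr_dc": 0.5, "nc_value": 0.3},
    "free_text": {"tfidf": 0.5, "dr_dc": 0.2, "nc_value": 0.3},
}


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one strategy's scores to [0, 1] so strategies are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {term: 0.0 for term in scores}
    return {term: (s - lo) / (hi - lo) for term, s in scores.items()}


def fuse_and_filter(
    strategy_scores: Dict[str, Dict[str, float]],
    corpus_type: str,
    threshold: float,
) -> List[Tuple[str, float]]:
    """Combine per-strategy scores with corpus-specific weights, then keep
    candidates whose fused weight exceeds the threshold, ranked high to low."""
    weights = WEIGHTS_BY_CORPUS[corpus_type]
    total = sum(weights.values())
    normalized = {name: min_max_normalize(s) for name, s in strategy_scores.items()}

    # Any candidate scored by at least one strategy is considered;
    # a missing score counts as 0.
    candidates = set()
    for scores in strategy_scores.values():
        candidates.update(scores)

    fused = {
        term: sum(
            (weights[name] / total) * normalized[name].get(term, 0.0)
            for name in weights
        )
        for term in candidates
    }
    kept = [(term, w) for term, w in fused.items() if w > threshold]
    # Sort by fused weight descending, then alphabetically for stable output.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Made-up candidate scores, for illustration only.
SAMPLE_SCORES = {
    "tfidf": {"algorithm": 0.9, "network": 0.5, "weather": 0.1},
    "dr_dc": {"algorithm": 0.8, "network": 0.6, "weather": 0.2},
    "nc_value": {"algorithm": 4.0, "network": 3.0, "weather": 1.0},
}

if __name__ == "__main__":
    for term, weight in fuse_and_filter(SAMPLE_SCORES, "thesis", 0.5):
        print(f"{term}\t{weight:.3f}")
```

With the sample data, the low-scoring candidate falls below the threshold and is dropped, while the other two are ranked by fused weight. A real system would need to derive the weights from properties of the corpus rather than from a fixed table.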