Construction Techniques Of Terminology Semantic Knowledge Base Based On HowNet

Posted on: 2017-06-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Wang
GTID: 2348330482481590
Subject: Systems Engineering
Abstract/Summary:
Semantic knowledge bases are essential to natural language processing. Traditionally, knowledge bases have been constructed mostly by hand, which yields high accuracy but low efficiency. The resulting bases are also small and cannot meet the demands of work in the Big Data era. The rapid development of artificial intelligence also places higher requirements on natural language processing, and knowledge bases have been extended from general domains into many professional domains. How to build a domain knowledge base efficiently and accurately has therefore become a necessary task. This paper combines rules with statistical results to address the problem, which can greatly improve efficiency and also helps in handling professional information. The work in this paper can be summarized as follows:

We observed the features of aerospace terminology, analyzed more than 2300 previously constructed terminology descriptions, and analyzed in detail the rules that had been formulated. This work lays the groundwork for the subsequent assisted construction.

Based on the above work, this paper proposes a method for constructing a semantic knowledge base for aerospace terminology and summarizes 212 semantic frameworks based on core words. The other words covered by the rules are filled into the frame according to their characteristics. To handle words that cannot be filled in by rules, a statistical method is proposed that uses statistics over the 2300 terminology descriptions to predict the semantic relation between two sememes. By combining the rules with the statistical results, most of the sememes in a term can be filled into the frame, and a complete definition description is finally formed.

Finally, using the proposed method, a knowledge base containing 2000 items was constructed.
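The abstract does not specify how the statistical step predicts the relation between two sememes. One simple reading, given here only as a minimal sketch and not as the thesis's actual method, is to count how often each relation was assigned to a sememe pair in the existing descriptions and pick the most frequent one. The function names, the triple format, and the sample sememes and relation labels below are all hypothetical.

```python
from collections import Counter, defaultdict


def train_relation_counts(annotated_triples):
    """Count relations per sememe pair.

    annotated_triples: iterable of (head_sememe, dependent_sememe, relation)
    tuples, assumed to be extracted from existing term descriptions.
    """
    counts = defaultdict(Counter)
    for head, dependent, relation in annotated_triples:
        counts[(head, dependent)][relation] += 1
    return counts


def predict_relation(counts, head, dependent, default=None):
    """Return the most frequent relation seen for the pair, else `default`."""
    seen = counts.get((head, dependent))
    if not seen:
        return default  # pair never observed; leave the slot for manual review
    return seen.most_common(1)[0][0]


# Hypothetical sample data, not taken from the thesis.
triples = [
    ("fly", "aircraft", "agent"),
    ("fly", "aircraft", "agent"),
    ("fly", "aircraft", "patient"),
    ("launch", "rocket", "patient"),
]
counts = train_relation_counts(triples)
print(predict_relation(counts, "fly", "aircraft"))
print(predict_relation(counts, "orbit", "satellite"))
```

Returning a default for unseen pairs keeps the sketch consistent with the abstract's claim that most, not all, sememes can be filled in automatically.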
We chose 100 pairs of terms from the terminology descriptions, calculated the similarity of each pair from their definition descriptions, and then asked 10 graduates to rate the similarity of the same pairs. The correlation coefficient between the computed similarity and the manual annotation is 0.85. This result indicates that the construction method proposed in this paper not only improves efficiency but also maintains accuracy, and that the terminology descriptions in the constructed knowledge base are of high quality.
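The evaluation above can be sketched as follows. The abstract does not say which correlation coefficient was used or how the 10 annotators' scores were combined; this sketch assumes the Pearson coefficient and a per-pair mean of the ratings. The function names and the sample numbers are hypothetical.

```python
import math


def mean_ratings(ratings_per_pair):
    """Average the annotators' scores for each term pair."""
    return [sum(ratings) / len(ratings) for ratings in ratings_per_pair]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: computed similarity for 4 pairs, 3 annotators each.
computed = [0.90, 0.40, 0.75, 0.10]
human = mean_ratings([[0.8, 0.9, 1.0], [0.3, 0.5, 0.4], [0.7, 0.6, 0.8], [0.2, 0.1, 0.0]])
print(round(pearson(computed, human), 3))
```

In the thesis setting, `computed` would hold the 100 similarity scores derived from the definition descriptions and `human` the averaged annotations for the same pairs.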
Keywords/Search Tags: Terminology semantic knowledge base, HowNet, Assistant construction methods, Similarity calculation