The Research On Construction Of Conceptual Network From Dictionary

Posted on: 2011-11-29
Degree: Master
Type: Thesis
Country: China
Candidate: R F Huang
GTID: 2178360308952408
Subject: Computer software and theory
Abstract/Summary:
Semantic information plays an extraordinarily important role in information processing. Natural language cannot be analyzed or understood without the support of semantic information. As one representation of semantic information, the semantic repository has become a basic resource in the field of Natural Language Processing. However, most semantic repositories are built manually, so their size is limited by time and cost. If the quality of a semantic repository can be assured, building it automatically offers a considerable advantage in time and cost.

This paper studies how to extract semantic relationships. However, an excellent syntactic parser is usually unavailable, and string patterns are too coarse to express complex structural information adequately. This paper therefore studies how to build identification methods automatically with a statistical technique based on specific features, and how to use them to identify semantic relations. The major contributions are as follows.

Firstly, this paper proposes how to construct various kinds of features, including word information, syntactic information, semantic information, location information and some of their combinations. Because the feature types are so varied, a uniform representation is applied. To reduce noise, a t-test is used to identify valid features and then to find useful word matches.

Secondly, to select features more effectively, this paper incorporates a priori knowledge into the statistical model by introducing priority. It then uses information gain and odds ratio to select features and construct rule sets, ensuring that each rule in a rule set provides high precision and that the rule set, with all of its rules working as a whole, provides good recall.

Thirdly, because of some factors that cannot be eliminated, it is impossible to judge whether a semantic relation exists only by checking whether a word has certain kinds of features. Therefore, anti-features are introduced. For each kind of semantic relation, a rule set and an anti-feature set are constructed as an identification method.

Fourthly, after semantic relations are identified by these identification methods, they are connected into a conceptual network. Many word pairs that were originally disconnected become indirectly connected in this network, which makes more value available.

Finally, to check the effectiveness of the proposed approach, a random sample of each kind of semantic relation is hand-checked. However, because personal hand-checking is arbitrary and ambiguous, this paper also evaluates the result indirectly by determining word pair similarity using path patterns. The similar and dissimilar word pairs are generated from a thesaurus.

This research is a step towards the automatic construction of a conceptual network. If a complete and precise conceptual network can be obtained from a dictionary, it will lay a solid foundation for further natural language processing.
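The abstract names information gain and odds ratio as the feature-selection measures but gives no formulas. The sketch below shows one common way to compute both for a single binary feature from a 2x2 contingency table. The function names, the argument layout (`tp`, `fp`, `fn`, `tn`) and the 0.5 smoothing constant are assumptions made for illustration and are not taken from the thesis.

```python
import math


def entropy(pos, neg):
    """Binary entropy (in bits) of a pos/neg count split."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(tp, fp, fn, tn):
    """Information gain of a binary feature with respect to a relation label.

    tp: feature present, relation holds
    fp: feature present, relation does not hold
    fn: feature absent, relation holds
    tn: feature absent, relation does not hold
    """
    n = tp + fp + fn + tn
    before = entropy(tp + fn, fp + tn)
    after = ((tp + fp) / n) * entropy(tp, fp) + ((fn + tn) / n) * entropy(fn, tn)
    return before - after


def odds_ratio(tp, fp, fn, tn, smoothing=0.5):
    """Odds ratio of the same table.

    Assumption: 0.5 is added to every cell so that empty cells
    do not cause a division by zero.
    """
    a, b, c, d = (x + smoothing for x in (tp, fp, fn, tn))
    return (a * d) / (b * c)
```

A feature that separates the two classes perfectly reaches the maximum gain of one bit on a balanced sample, while a feature that is independent of the relation scores a gain of zero and an odds ratio of one.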
Keywords/Search Tags: semantic relation, machine-readable dictionary, word similarity