Semantic Structure Identification Based On PCFG-HDSM Model

Posted on: 2009-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: B Xu
Full Text: PDF
GTID: 2178360272977381
Subject: Carrier Engineering
Abstract/Summary:
Semantic understanding plays an important role in Natural Language Processing, and semantic structure cannot be ignored in Chinese semantic expression. To process aircraft maintenance information automatically and extend that processing to the semantic layer, this paper emphasizes the basic role of semantic structure in semantic understanding. To understand the semantic information of a text, the system first identifies the semantic structure and then performs semantic processing. This paper therefore focuses on identifying the semantic structure automatically.

To identify the semantic structure automatically, this paper first parses the text syntactically and then converts the syntactic model into the semantic model. For the syntactic analysis, we first construct a syntactic tree library in three steps: we use ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System) to perform lexical analysis on the raw text corpus, then refine the lexical analysis output to reduce ambiguity, and finally annotate the syntactic information manually. The system learned 385 rules from this annotated corpus, and these rules are used in syntactic analysis.

To improve ambiguity handling and parser precision, this paper proposes PCFG-HDSM, a syntactic parsing model based on the GLR algorithm. The model combines the strengths of PCFG (Probabilistic Context-Free Grammar) with those of HDSM (Head-Driven Statistical Models), and we implement a new syntactic parser for Chinese based on it. In the open test, label precision and label recall are 80.8% and 74.3% respectively, a slight improvement over the result of the Prop program from the Chinese Academy of Sciences. This indicates that the PCFG-HDSM model can improve the parser's ability to resolve ambiguity.

For semantic structure parsing, we process the output of the syntactic analysis to obtain the semantic model. We test on a large amount of text and compile statistics on the distribution of each semantic model.
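The rule-learning step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how rule probabilities are estimated, so the code below assumes the standard relative-frequency (maximum-likelihood) estimate for a PCFG: the probability of a rule is its count divided by the count of all rules with the same left-hand side. The bracketed tree format, the English toy sentences, and the names `parse_bracketed`, `count_rules`, `estimate_pcfg`, and `TOY_TREEBANK` are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def parse_bracketed(text):
    """Parse a bracketed tree such as '(S (NP (N engine)) (VP (V fails)))'
    into nested (label, children) tuples; leaf children are plain strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # word (leaf)
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read()


def count_rules(tree, counts):
    """Count every rule LHS -> RHS used in the tree, including lexical rules."""
    label, children = tree
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    counts[(label, rhs)] += 1
    for child in children:
        if isinstance(child, tuple):
            count_rules(child, counts)


def estimate_pcfg(treebank):
    """Relative-frequency estimate: P(LHS -> RHS) = count(rule) / count(LHS).
    This estimator is an assumption, not the thesis's documented method."""
    counts = Counter()
    for line in treebank:
        count_rules(parse_bracketed(line), counts)
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}


# Hypothetical toy treebank, used only for illustration.
TOY_TREEBANK = [
    "(S (NP (N engine)) (VP (V fails)))",
    "(S (NP (N pump)) (VP (V leaks)))",
    "(S (NP (A left) (N engine)) (VP (V fails)))",
]

if __name__ == "__main__":
    for (lhs, rhs), p in sorted(estimate_pcfg(TOY_TREEBANK).items()):
        print(f"{lhs} -> {' '.join(rhs)}  {p:.3f}")
```

In this toy data, NP expands to N in two of three cases and to A N in one, so the two NP rules receive probabilities 2/3 and 1/3. A head-driven model such as HDSM would additionally condition on head words, which this sketch does not attempt.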
Keywords/Search Tags: GLR Algorithm, Probabilistic Context-Free Grammar (PCFG), Head-Driven Statistical Models (HDSM), Probabilistic Syntactic Analysis, Semantic Structure Parsing