Implementation Of Grammatical Analysis Subsystem In Natural Language Information Extraction System

Posted on: 2018-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: D L Liu
Full Text: PDF
GTID: 2518305966950399
Subject: Software engineering
Abstract/Summary:
The amount of natural language content on the Internet is growing rapidly, including e-commerce messages, forum messages, blog posts and so on. The volume is far too large to be handled manually, so processing it with natural language processing (NLP) is a natural choice. There are two main families of NLP methods. The most popular is the statistics-based approach; the other is the rule-based approach, implemented here with Finite State Transducers (FST). This paper focuses on the second. It proposes a rule-based, FST-based NLP method for extracting information from massive amounts of text. The method differs from the popular n-gram based approach, which relies on probability statistics and machine learning. In our method, the rules are the grammar of a language, summarized by people, and FSTs are the tool used to implement those rules. The system processes natural language and generates a syntax tree for each sentence. To support the application of the rules, we tokenize the text, generate the stem of each word, and look up word features that are recorded in a dictionary. After generating a syntax tree, we extract useful information of several kinds, such as subject-verb-object (SVO) matches and opinion matches. We use a very large baseline of sentences to develop our grammar rules. When the baseline is very large, running all of the baseline sentences takes a long time, and debugging the grammar rules becomes difficult. We therefore propose a debug system to help rule developers debug their rules. The debug system accepts several commands and runs as a cluster. It saves time by filtering the baseline sentences and by running the rule files step by step. We evaluate our system on the accuracy of the generated syntax trees and show that the result is satisfactory. We also report the performance of the debug system.
Keywords/Search Tags:NLP, rule, grammar, debug, big data baseline
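
The abstract mentions extracting subject-verb-object (SVO) matches from a syntax tree but does not describe how. The following is only a minimal sketch of that idea, not the thesis's implementation: the `Node` class, the labels `S`, `NP`, `VP`, `V`, and the "last leaf is the head" heuristic are all assumptions made for illustration, and the tree is built by hand rather than produced by FST rules.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical syntax-tree node: a phrase label, or a leaf carrying a word."""
    label: str
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def head_word(node: Node) -> Optional[str]:
    """Return the right-most leaf word under node (a crude head heuristic)."""
    if node.word is not None:
        return node.word
    for child in reversed(node.children):
        found = head_word(child)
        if found is not None:
            return found
    return None


def first_child(node: Node, label: str) -> Optional[Node]:
    """Return the first direct child with the given label, if any."""
    return next((c for c in node.children if c.label == label), None)


def extract_svo(tree: Node) -> List[Tuple[str, str, str]]:
    """Collect (subject, verb, object) triples from every S node in the tree."""
    triples = []
    if tree.label == "S":
        subj = first_child(tree, "NP")
        vp = first_child(tree, "VP")
        if subj is not None and vp is not None:
            verb = first_child(vp, "V")
            obj = first_child(vp, "NP")
            if verb is not None and obj is not None:
                triples.append((head_word(subj), verb.word, head_word(obj)))
    for child in tree.children:
        triples.extend(extract_svo(child))
    return triples


# Hand-built tree for "the customer likes the phone" (illustrative only).
example = Node("S", children=[
    Node("NP", children=[Node("DT", "the"), Node("N", "customer")]),
    Node("VP", children=[
        Node("V", "likes"),
        Node("NP", children=[Node("DT", "the"), Node("N", "phone")]),
    ]),
])

print(extract_svo(example))
```

In a real system the tree would come from the grammar rules, and the head of each phrase would be chosen from the dictionary features rather than by position.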