
Research And Implementation Of Natural Language Based SPARQL Construction Method

Posted on: 2016-04-23; Degree: Master; Type: Thesis
Country: China; Candidate: X Q Gao; Full Text: PDF
GTID: 2348330488473938; Subject: Engineering
Abstract/Summary:
As web data grows, so does the demand for automatically retrieved answers. Because RDF offers wide coverage and a highly structured form, it can serve as the data source for a question answering system, and SPARQL can be used to query an RDF dataset. How to apply RDF datasets and the SPARQL language to a question answering service remains a major challenge. A question answering system answers questions according to users' input. It comprises knowledge base construction, graph storage and indexing, and result search. This thesis studies the third part: we transform natural language into SPARQL and retrieve the results.

Against this background, we design and develop a highly available service that transforms natural language into SPARQL. Drawing on natural language processing techniques, including relation pattern extraction, named entity recognition, and search engine technology, the service applies spelling correction, lemmatization, question parsing, rule extraction, resource mapping, and automatic SPARQL generation. The research covers the following aspects.

Firstly, combining the characteristics of the syntax parse tree with SPARQL syntax, the thesis extracts 10 rules for mapping a parse tree to triples. We build parse trees for 120 standard questions, extract the mapping rules from them, and construct the rule mapping algorithm. Building on relation pattern extraction techniques, the thesis also proposes an optimized mathematical model for relation pattern extraction, which improves precision and helps extend the triples.

Secondly, the thesis proposes a SPARQL generation method. A noisy channel model and an edit distance algorithm are used to correct spelling errors. We then lemmatize the question and generate a SPARQL template. We generate the abstract syntax tree through the Stanford Parser API and, following the mapping rules between the abstract syntax tree and triples, transform the tree into triples.
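
The spelling correction step named above can be illustrated with a minimal sketch. It is not the thesis's implementation: the word counts, the `error_penalty` channel approximation, and the function names are all hypothetical, chosen only to show how an edit distance and a noisy channel score might combine to rank candidate corrections.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical unigram counts standing in for the language model P(w).
WORD_COUNTS = Counter({"capital": 50, "capitol": 5, "country": 40, "river": 30})


def correct(word: str, counts: Counter = WORD_COUNTS,
            max_distance: int = 2, error_penalty: float = 0.01) -> str:
    """Return the candidate maximizing P(candidate) * P(word | candidate).

    Assumption: the channel model P(word | candidate) is approximated
    as error_penalty ** edit_distance, a common simplification.
    """
    if word in counts:
        return word  # known words are left unchanged
    total = sum(counts.values())
    best, best_score = word, 0.0
    for candidate, n in counts.items():
        d = edit_distance(word, candidate)
        if d > max_distance:
            continue
        score = (n / total) * (error_penalty ** d)
        if score > best_score:
            best, best_score = candidate, score
    return best
```

With these toy counts, a misspelling one edit away from both "capital" and "capitol" resolves to the more frequent word. A word with no candidate within `max_distance` is returned unchanged.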
Then, we extend the relation patterns of the triples and build an index over the RDF dataset, consisting of a class index, an entity index, and a property index. By searching the index, we map each triple to the corresponding resources. Lastly, we fill the triples into the previously generated template to produce the SPARQL query. After the query is run against the DBpedia dataset, the service automatically checks the query result.

Thirdly, we run experiments using DBpedia as the corpus and the question sets from QALD. We classify the questions into three levels of difficulty and calculate precision, recall, and F-measure to verify the correctness and validity of the system. Experiments on all modules show that the transformation service meets the requirements.
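
The evaluation measures mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the thesis's evaluation script: it assumes answers are compared as sets per question and that the F value is the balanced F1. The function name and the sample resource labels are hypothetical.

```python
def precision_recall_f1(gold: set, predicted: set) -> tuple:
    """Compare a system's answer set with the gold answer set for one question."""
    correct = len(gold & predicted)  # answers present in both sets
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: two of three returned resources are correct.
gold_answers = {"dbr:Berlin", "dbr:Hamburg", "dbr:Munich"}
system_answers = {"dbr:Berlin", "dbr:Hamburg", "dbr:Cologne"}
p, r, f = precision_recall_f1(gold_answers, system_answers)
```

Averaging these per-question values within each difficulty level would give one score per level. Whether the thesis averages this way is not stated in the abstract.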
Keywords/Search Tags: question analysis, rule mapping, relation pattern extraction, resource index, SPARQL generation