Information Extraction Algorithm Based On The Template Matching In Traffic Standards

Posted on: 2018-11-21  Degree: Master  Type: Thesis
Country: China  Candidate: D Y Li
GTID: 2348330536484898  Subject: Traffic Information Engineering & Control
Abstract/Summary:
The standards conformance testing system for traffic information is a tool for finding the differences between two standards. The system compares standards paragraph by paragraph, so its operating granularity is too coarse. This thesis presents an information extraction algorithm based on template matching, which refines the unit of extraction and thereby improves detection efficiency. The research proceeds in two steps. First, we studied the construction of a standard template library. Based on an in-depth analysis of how standards are worded, general expressions were extracted from a large number of traffic standards and used as templates. To cover common usage, the templates were expanded with their synonyms. All templates were then named individually and classified by position and word meaning, giving 142 templates in 12 categories. Second, the algorithm extracts information from standards sentence by sentence through template matching. The templates are first loaded into a user dictionary, and each sentence to be matched is segmented into words and tagged with its parts of speech (POS). The ordered POS tags are then assembled into a POS sequence, which is matched against regular expressions. The matched sentences are selected, and the slots of the templates are filled with entity words and disciplines. These methods were designed, implemented, and tested in this thesis. With this algorithm, the original paragraph-by-paragraph similarity computation of the standards conformance testing system can be replaced by comparisons between entity words, or between disciplines, which could increase the accuracy and efficiency of similarity detection.
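The second step described in the abstract (segment, tag, build an ordered POS sequence, match it with a regular expression, fill slots) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The template phrases, the tag names (T_MAX, T_MIN, n, m, q), the tiny word lexicon, and the function names are all hypothetical, and a real system would use a proper word segmenter and POS tagger with the templates loaded as a user dictionary.

```python
import re

# Hypothetical template library: phrase -> template tag.
# Stands in for the templates loaded into the user dictionary.
TEMPLATE_TAGS = {
    "shall not exceed": "T_MAX",
    "shall be no less than": "T_MIN",
}

# Hypothetical toy POS lexicon (n = noun, q = unit); unknown words get "x".
WORD_POS = {"speed": "n", "limit": "n", "lane": "n", "width": "n",
            "km/h": "q", "m": "q"}

# One illustrative pattern over the POS sequence:
# noun(s) + template phrase + number + unit.
SLOT_PATTERN = re.compile(
    r"^(?P<entity>(?:n )+)(?P<template>T_\w+) (?P<discipline>m q)$"
)


def segment_and_tag(sentence):
    """Return (word, tag) pairs, keeping each template phrase as one token."""
    text = sentence.lower().rstrip(".")
    # Longest phrases first so a shorter template cannot split a longer one.
    for phrase in sorted(TEMPLATE_TAGS, key=len, reverse=True):
        text = text.replace(phrase, phrase.replace(" ", "_"))
    tagged = []
    for token in text.split():
        phrase = token.replace("_", " ")
        if phrase in TEMPLATE_TAGS:
            tagged.append((phrase, TEMPLATE_TAGS[phrase]))
        elif token.replace(".", "", 1).isdigit():
            tagged.append((token, "m"))  # m = numeral
        else:
            tagged.append((token, WORD_POS.get(token, "x")))
    return tagged


def extract(sentence):
    """Match the ordered POS sequence and fill the slots, or return None."""
    tagged = segment_and_tag(sentence)
    pos_sequence = " ".join(tag for _, tag in tagged)
    match = SLOT_PATTERN.match(pos_sequence)
    if match is None:
        return None  # sentence does not fit the template, so it is skipped
    words = [word for word, _ in tagged]
    slots = {}
    for name in ("entity", "template", "discipline"):
        # Map the matched span of tags back to token positions.
        first = pos_sequence[:match.start(name)].count(" ")
        count = len(match.group(name).split())
        slots[name] = " ".join(words[first:first + count])
    return slots


print(extract("Lane width shall be no less than 3.5 m."))
print(extract("The road is wet."))
```

Matching on the tag sequence rather than on the raw text lets one pattern cover every sentence with the same grammatical shape. The slot names entity and discipline follow the abstract's wording.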
Keywords/Search Tags: Traffic Standards, Information Extraction, Standard Analysis, Template Extraction, Template Matching