Research On Standards Conformance Testing Of Traffic Information And Its System Development

Posted on: 2018-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: J B Cao
GTID: 2348330536984927
Subject: Computer application technology
Abstract/Summary:
A document compiling auxiliary system is important for improving the efficiency of document compilation. Current commercial systems focus mainly on either article structure or article material, while systems based on semantic understanding remain small in scale and limited in intelligence. This dissertation therefore studies a document compiling auxiliary system based on semantic understanding. The main research results are as follows.

1) An algorithm for Chinese paragraph similarity calculation based on weighted bipartite graph matching is proposed. Because the algorithm ignores the position and order of paragraph elements, it can handle phenomena common in Chinese paragraphs, such as synonym replacement, inverted structure, and flashback when describing a theme. Compared with the vector space model (VSM) algorithm, it greatly improves the accuracy of Chinese paragraph similarity calculation. On the basis of this algorithm, a semantic search engine suited to the system is designed and implemented. This raises material recommendation from string matching, character-pattern matching, and semantic understanding to sentence- and paragraph-level comprehension. The semantic search engine comprises four parts: a retrieval preprocessing module, a semantic similarity calculation module, an intelligent index construction module, and a sorting module. In the retrieval preprocessing module, a vertical unitary word segmentation system is established on the basis of an HMM word segmentation system with autonomous training.

2) A keyword-based crawler is designed and implemented. It keeps the system's database up to date by automatically crawling text material, semantically cleaning it, and importing it in batches into a MongoDB database.

3) A test platform for automatic text material classification algorithms is set up, together with a performance detection platform for a set of commonly used classification algorithms. To select the most suitable text classification algorithm for the system, commonly used industrial classification algorithms (lr, bayes, tree, extratrees, bagging, adboost, svmnusvc, svmlinear, svmcrbf, Forest50, forest100) are tested on these platforms with feature selection and parameter tuning.

4) The functional modules are integrated to meet the design requirements, completing the software development of the document compiling auxiliary system based on semantic understanding. Tests show that the system can, to some extent, assist in compiling traffic documents. Its material recommendation accuracy is also higher than that of general document compiling auxiliary systems, and it shows a deeper understanding of users' intentions.
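The paragraph similarity idea in result 1) can be illustrated with a minimal sketch. The abstract does not say how paragraph elements are extracted or how edge weights are computed, so everything below is an assumption made for illustration: elements are taken to be sentences, the edge weight is a plain Jaccard overlap of whitespace tokens (a stand-in for a real semantic measure; Chinese text would first need a word segmenter), and the score is normalized by the larger element count. The function names `element_similarity`, `max_weight_matching`, and `paragraph_similarity` are hypothetical. The matching itself uses the standard Hungarian algorithm on negated weights.

```python
def element_similarity(a, b):
    """Assumed edge weight: Jaccard overlap of lowercase whitespace tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def max_weight_matching(weights):
    """Maximum-weight bipartite matching via the Hungarian algorithm.

    weights[i][j] is the weight of the edge between left element i and
    right element j. Returns (total_weight, pairs), pairs as (i, j) tuples.
    """
    if not weights or not weights[0]:
        return 0.0, []
    transposed = len(weights) > len(weights[0])
    if transposed:  # the loop below needs rows <= columns
        weights = [list(col) for col in zip(*weights)]
    n, m = len(weights), len(weights[0])
    inf = float("inf")
    u = [0.0] * (n + 1)    # row potentials
    v = [0.0] * (m + 1)    # column potentials
    match = [0] * (m + 1)  # match[j] = 1-based row assigned to column j
    way = [0] * (m + 1)    # previous column on the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], inf, 0
            for j in range(1, m + 1):
                if used[j]:
                    continue
                # Negated weight: minimizing cost maximizes total weight.
                cur = -weights[i0 - 1][j - 1] - u[i0] - v[j]
                if cur < minv[j]:
                    minv[j], way[j] = cur, j0
                if minv[j] < delta:
                    delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        while j0:  # flip assignments along the augmenting path
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    pairs = [(match[j] - 1, j - 1) for j in range(1, m + 1) if match[j]]
    total = sum(weights[i][j] for i, j in pairs)
    if transposed:
        pairs = [(j, i) for i, j in pairs]
    return total, sorted(pairs)


def paragraph_similarity(para_a, para_b):
    """Score two paragraphs (lists of sentences), ignoring element order."""
    if not para_a or not para_b:
        return 0.0
    weights = [[element_similarity(x, y) for y in para_b] for x in para_a]
    total, _ = max_weight_matching(weights)
    # Assumed normalization: unmatched elements lower the score.
    return total / max(len(para_a), len(para_b))


original = ["the road is closed", "traffic is heavy today"]
reordered = list(reversed(original))
score = paragraph_similarity(original, reordered)
```

Because the matching pairs each element with its best counterpart regardless of position, the reordered paragraph above receives the same score as an identical one. A bag-of-words VSM comparison would also ignore order at the word level, so the practical gain claimed in the abstract presumably comes from the semantic edge weights, which this sketch does not reproduce.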
Keywords/Search Tags: document compiling auxiliary system, natural language comprehension, weighted bipartite graph, semantic search engine, text classification, paragraph similarity