Short Text Similarity Research Based On Abstract Syntax Tree

Posted on: 2019-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: R Z Wang
GTID: 2428330575497365
Subject: Engineering
Abstract/Summary:
With the continuous development of network information technology, text has become the main carrier of information, and Chinese information processing has become an important research field within Natural Language Processing. Text similarity computation is one research direction in Chinese information processing. It has a wide range of practical applications in automated marking, text matching, machine translation, information retrieval and other fields. Computing text similarity involves word segmentation, semantic analysis, syntactic analysis, and the construction of a similarity computation model. After reviewing a large body of literature on text similarity computation, this thesis proposes a new approach that uses an abstract syntax tree to compute text similarity. The study consists of three parts. First, a statistical Chinese word segmentation method based on minimum information content is adopted. When segmenting Chinese text, it does not require prior statistical information for each word in the lexicon; given only the lexicon, it segments efficiently and with relatively high accuracy. Second, the Chinese text is structured: building on the segmentation results, lexical analysis and syntactic analysis are applied, and an abstract syntax tree of the Chinese text is constructed. Finally, text similarity calculation methods from recent years are summarized and analyzed, and a short text similarity calculation method based on the syntax tree is proposed. This method relies mainly on the abstract syntax tree and combines it with the vector space model, with the aim of making the similarity result more accurate.
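As a rough illustration of how a syntax tree might be combined with a vector space model, the sketch below turns each tree into a sparse feature vector and compares two vectors by cosine similarity. It is not the method of the thesis, whose details are not given in this abstract. The tree encoding (nested `(label, children)` tuples), the choice of features (words plus parent-child label pairs), and the names `tree_features`, `cosine`, and `tree_similarity` are all assumptions made for the example, and the trees are written by hand rather than produced by a segmenter and parser.

```python
from collections import Counter
from math import sqrt

# Assumed toy encoding: an inner node is (label, [children]),
# a leaf is (part_of_speech, word). Not the thesis's actual tree format.

def tree_features(node):
    """Count word features and parent>child label features in a tree."""
    label, body = node
    feats = Counter()
    if isinstance(body, str):          # leaf: record the word itself
        feats["word:" + body] += 1
        return feats
    for child in body:                 # inner node: record structure
        feats["edge:" + label + ">" + child[0]] += 1
        feats.update(tree_features(child))  # Counter.update adds counts
    return feats

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def tree_similarity(t1, t2):
    """Similarity of two sentences given their (hand-built) syntax trees."""
    return cosine(tree_features(t1), tree_features(t2))

# Hand-written example trees (hypothetical parses, for illustration only).
t1 = ("S", [("NP", [("N", "我")]),
            ("VP", [("V", "喜欢"), ("NP", [("N", "音乐")])])])
t2 = ("S", [("NP", [("N", "我")]),
            ("VP", [("V", "喜欢"), ("NP", [("N", "电影")])])])
t3 = ("S", [("VP", [("V", "下雨")])])

if __name__ == "__main__":
    print(tree_similarity(t1, t2))  # same structure, one word differs
    print(tree_similarity(t1, t3))  # different structure and words
```

In this sketch, two sentences with the same structure and one different word score higher than two sentences that differ in both structure and vocabulary, which is the kind of effect a structure-aware similarity measure is meant to capture. A real system would likely weight the features (for example with TF-IDF) rather than use raw counts.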
Keywords/Search Tags: Chinese processing, abstract syntax tree, sentence structure, text similarity