
Study And Application Of Chinese Sentence Structure Clustering

Posted on: 2019-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: J H Li
Full Text: PDF
GTID: 2428330545954098
Subject: Computer Science and Technology
Abstract/Summary:
Chinese corpus data are widely used in fields such as language research, language education, and artificial intelligence. With the rapid development of natural language processing and big data technology, demand in these fields for data analysis of Chinese, especially grammatical, semantic, and pragmatic analysis, grows daily. In recent years, as computing hardware has been updated and iterated, algorithms for Chinese language research have made significant breakthroughs, and the accuracy of Chinese sentence pattern analysis has improved markedly. Sentence structure analysis is of great significance in Chinese information processing. Using big data and data mining technology to analyze and process Chinese sentence structure offers a new perspective and a new point of entry for the field, and a Chinese sentence library also brings new opportunities for research in it. Therefore, in order to bring large volumes of Chinese text together on a customizable processing platform that is easy to operate, manage, extend, and store, we have done the following work:

1. A large amount of text data is syntactically parsed, and sentence structure information is extracted from the text.
2. Based on the dependency tree structure, features are extracted from the sentence structure information to obtain the main information and list the main structure of each sentence. To reduce the space complexity of the main structure, two protocols are proposed for processing the data.
3. Using the above methods, a sentence structure corpus, called the sentence library for short, is constructed from the training text.
4. Based on the sentence library, a series of applied studies is carried out, such as computing sentence similarity between texts with a similarity formula, analyzing the sentence structure characteristics of various kinds of sentences, and analyzing the common sentence structure features of classic novels.
5. An extensible, reusable, and visual sentence structure analysis and processing platform is built.

The innovations of this research are as follows. First, cloud technology is used to shift the research focus from text processing to text clustering, improving efficiency and accuracy. Second, based on the concept of the dependency tree and the features of Chinese sentence structure, the main structure of each Chinese sentence is extracted, and an improved method is proposed on the basis of the original sentence tree. Third, research on applications of Chinese sentence structure databases is relatively rare at home and abroad; this thesis explores the characteristics of sentence structure from the perspective of data mining.
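As an illustration of steps 2 and 4 of the abstract, the following is a minimal Python sketch, not the thesis's actual method. It assumes each sentence has already been dependency-parsed into tokens carrying LTP-style relation labels (HED, SBV, VOB, ATT, ADV, RAD). The `Token` type, the `CORE_RELS` set, the `main_structure` and `text_similarity` functions, and the choice of Jaccard similarity are all hypothetical: the abstract does not specify which relations form the main structure or which similarity formula is used.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    word: str
    head: int  # 1-based index of the head token; 0 marks the root
    rel: str   # dependency relation label (LTP-style, e.g. SBV, VOB)


# Assumption: only the root, its subject, and its object count as "main structure".
CORE_RELS = {"SBV", "VOB"}


def main_structure(tokens: List[Token]) -> Tuple[str, ...]:
    """Return the relation labels of the root and its core dependents, in word order."""
    root = next(i for i, t in enumerate(tokens, start=1) if t.head == 0)
    return tuple(
        t.rel
        for i, t in enumerate(tokens, start=1)
        if i == root or (t.head == root and t.rel in CORE_RELS)
    )


def text_similarity(text_a: List[List[Token]], text_b: List[List[Token]]) -> float:
    """Jaccard similarity between the sets of main structures found in two texts."""
    a = {main_structure(s) for s in text_a}
    b = {main_structure(s) for s in text_b}
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hand-written parses standing in for real parser output.
s1 = [Token("我", 2, "SBV"), Token("喜欢", 0, "HED"), Token("红色", 5, "ATT"),
      Token("的", 3, "RAD"), Token("苹果", 2, "VOB")]
s2 = [Token("他", 3, "SBV"), Token("昨天", 3, "ADV"), Token("买", 0, "HED"),
      Token("了", 3, "RAD"), Token("书", 3, "VOB")]
s3 = [Token("下雨", 0, "HED"), Token("了", 1, "RAD")]

print(main_structure(s1))
print(text_similarity([s1, s3], [s2]))
```

Here `s1` and `s2` differ in wording and modifiers but reduce to the same subject-predicate-object skeleton, which is the kind of grouping a sentence library keyed on main structure would capture.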
Keywords/Search Tags:Dependency tree, dependency parsing, LTP, sentence database, sentence retrieval platform