Research On The Construction And Storage Method Of Chinese-Thai Bilingual News Event Chain

Posted on: 2019-11-24
Degree: Master
Type: Thesis
Country: China
Candidate: H K Gao
GTID: 2438330563957693
Subject: Software engineering
Abstract/Summary:
With the development of China's "Belt and Road" strategy, China is vigorously developing cooperation in the economy, politics, culture and other areas with the countries along the route. Thailand is one of these countries, so exchanges and cooperation between the two countries have been strengthened. News resources on the Internet are one of the channels through which the two peoples obtain information. However, online news is vast in quantity and seemingly limitless. How to quickly and effectively obtain information about news events from a large number of scattered news reports has become an urgent problem. It is therefore important to study the construction and storage of bilingual event chains. This thesis carries out research on the construction and storage of Chinese-Thai bilingual news event chains in three parts:

(1) A method for extracting Chinese and Thai news event elements based on dependency trees and rules. First, the characteristics of the Thai language are analyzed. The main distinguishing features of Thai are found to concern attributives, adverbials and complements, while its other grammatical structures are similar to those of Chinese. The study finds that the Chinese and Thai dependency structures are the same, so a Chinese dependency tree is constructed and the dependency structure is obtained by mapping from the Chinese syntactic structure. Second, rules are defined according to the characteristics of Thai, and subjects, objects and adverbials are extracted using the constructed dependency tree together with these rules. Finally, experiments show that combining the dependency tree with the defined rules extracts Thai news event elements better.

(2) Construction of event chains based on Chinese-Thai lexical chains. News events contain a large number of event triggers, which means many event elements are involved, so it is rather difficult to construct event chains directly. Lexical chains are found to be closely related to event chains, and lexical chains can be used to derive event chains. However, polysemy exists in both Chinese and Thai, so before constructing a lexical chain it is necessary to judge whether the event sentence contains ambiguous words: if it does, those words are disambiguated; if it does not, no disambiguation is needed. After that, the original lexical chain is built according to the candidate word algorithm proposed in this thesis. Next, according to the characteristics of news feature extraction, the strongest lexical chain is obtained by optimizing the original lexical chain. Finally, according to the algorithm proposed in this thesis (based on the relationship between the lexical chain and the trigger word), the chain is completed in the given language. The feasibility of the method is verified by constructing the component chain.

(3) A storage method for event chains based on the Neo4j graph model. In view of the semantic relevance, continuity and heterogeneity among the constructed event chains, this thesis proposes an RDF data storage method for news-domain event chains based on the Neo4j graph database. The method analyzes the connection between the RDF directed graph structure and the Neo4j graph data storage model in the news domain, and then gives the relationship between the event chain and the data storage model. The mapping between the RDF graph and the Neo4j graph model is then described. Finally, the news event chain data represented in RDF is stored in the Neo4j graph database.
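The abstract does not spell out the mapping between the RDF graph and the Neo4j graph model, so the following is only a minimal sketch of one common convention, not the thesis's actual method: a triple whose object is a literal becomes a property on the subject node, and a triple whose object is a resource becomes a relationship. The names `PropertyGraph`, `rdf_to_property_graph`, `to_cypher`, the `Resource` label, the `uri` property and the sample identifiers (`ev:e1`, `trigger`, `next`) are all hypothetical, as is the convention that literals are written in double quotes.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyGraph:
    # node id -> {property name: value}
    nodes: dict = field(default_factory=dict)
    # list of (source id, relationship type, target id)
    edges: list = field(default_factory=list)


def is_literal(term):
    # Assumption for this sketch: literals are wrapped in double quotes,
    # everything else is treated as a resource identifier.
    return term.startswith('"') and term.endswith('"')


def rdf_to_property_graph(triples):
    """Map (subject, predicate, object) triples onto nodes, properties and edges."""
    graph = PropertyGraph()
    for s, p, o in triples:
        graph.nodes.setdefault(s, {})
        if is_literal(o):
            # Literal object: store as a property on the subject node.
            graph.nodes[s][p] = o[1:-1]
        else:
            # Resource object: create the target node and a directed edge.
            graph.nodes.setdefault(o, {})
            graph.edges.append((s, p, o))
    return graph


def quote(value):
    # Escape backslashes and single quotes for a Cypher string literal.
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def to_cypher(graph):
    """Render the graph as Cypher statements (illustrative only)."""
    statements = []
    for uri, props in graph.nodes.items():
        stmt = f"MERGE (n:Resource {{uri: {quote(uri)}}})"
        if props:
            assignments = ", ".join(
                f"n.`{key}` = {quote(val)}" for key, val in props.items()
            )
            stmt += f" SET {assignments}"
        statements.append(stmt)
    for s, p, o in graph.edges:
        statements.append(
            f"MATCH (a:Resource {{uri: {quote(s)}}}), "
            f"(b:Resource {{uri: {quote(o)}}}) "
            f"MERGE (a)-[:`{p}`]->(b)"
        )
    return statements


if __name__ == "__main__":
    # Two hypothetical events linked into a chain by a "next" relation.
    sample = [
        ("ev:e1", "trigger", '"visit"'),
        ("ev:e1", "next", "ev:e2"),
        ("ev:e2", "trigger", '"sign"'),
    ]
    for statement in to_cypher(rdf_to_property_graph(sample)):
        print(statement)
```

In this sketch the chain order lives in the `next` relationships, so following an event chain amounts to traversing edges from one event node to the next. When writing to a real database, parameterized queries through a Neo4j driver would be preferable to building Cypher strings by hand.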
Keywords/Search Tags: Dependency tree, Thai language, Event element, Lexical chain, Event chain, Neo4j graph database