
Study On The Extraction Of War Events In Zuo Zhuan Based On Mixed Approaches

Posted on: 2020-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z K Li
GTID: 2506306314495894
Subject: Books intelligence
Abstract/Summary:
With the rapid development of science and technology, computers are now widely used to process natural language. Research on ancient texts has also grown steadily, mainly covering the digitization of ancient books, automatic word segmentation, part-of-speech tagging, and entity recognition, all of which greatly help information extraction from ancient texts. Information extraction refers to extracting the events that a particular user is interested in from unstructured information and presenting them to the user in a structured form. It has been applied successfully in many fields of modern text, and its application to ancient texts is steadily advancing. Extracting event information of different categories from ancient texts and storing it in a database in structured form is of great significance for subsequent data mining and visualization research.

Zuo Zhuan is an important work of the Spring and Autumn Period whose depictions of war are exemplary, and it is of great value in both literary and historical terms. This thesis takes Zuo Zhuan as its corpus and studies information extraction for its war events. The research mainly covers the construction of a basic framework for war events, the extraction of sentences about war, the recognition of named entities, and visual display.

The whole study is guided by framework theory. First, a basic framework system for the war events of Zuo Zhuan is constructed; it then serves as a template in which the entity values identified in the text are matched to their corresponding attributes. Extracting war events first requires extracting the sentences that describe war from the entire corpus. This is done mainly by pattern matching: a trigger vocabulary is constructed and used to filter a set of candidate war sentences, and a series of rules is then applied to the candidate set to obtain the final collection of war sentences. Named entity recognition is then performed on the extracted war sentences. Following the previously defined framework, the time, the two parties, the location, the cause, the result, and the reinforcements of each war are extracted with a Conditional Random Field model that combines contextual features, part-of-speech features, markup features, and indicator features, over multiple automatic entity recognition experiments. Finally, once the entity values have been filled into the corresponding attributes, the events are stored in a database in structured form. The E-Charts tool is then used to retrieve the parties of each war from the database and present a dynamic display of each war on a map of the Spring and Autumn Period.
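
The trigger-and-rule filtering and the framework filling described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the trigger characters, the single exclusion rule, the slot names (including the split of "the two parties" into attacker and defender), and the sample strings. The Conditional Random Field step is not shown; its output is stood in for by hand-written (label, value) pairs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical trigger vocabulary: characters that often signal warfare
# in Classical Chinese (attack, invade, battle, besiege, defeat).
TRIGGERS = ("伐", "侵", "戰", "圍", "敗")

# Hypothetical exclusion rule: a trigger inside one of these phrases is
# not a war event (伐木 "fell trees", 伐鼓 "beat the drum").
EXCLUDE = ("伐木", "伐鼓")


def candidate_sentences(sentences: List[str]) -> List[str]:
    """Step 1: keep every sentence containing at least one trigger."""
    return [s for s in sentences if any(t in s for t in TRIGGERS)]


def is_war_sentence(sentence: str) -> bool:
    """Step 2: drop excluded phrases, then require a remaining trigger."""
    stripped = sentence
    for phrase in EXCLUDE:
        stripped = stripped.replace(phrase, "")
    return any(t in stripped for t in TRIGGERS)


@dataclass
class WarEvent:
    """Assumed frame template; slot names are illustrative only."""
    time: Optional[str] = None
    attacker: Optional[str] = None
    defender: Optional[str] = None
    location: Optional[str] = None
    cause: Optional[str] = None
    result: Optional[str] = None
    reinforcements: List[str] = field(default_factory=list)


def fill_frame(entities: List[Tuple[str, str]]) -> WarEvent:
    """Match recognized (label, value) pairs to the frame's slots."""
    event = WarEvent()
    for label, value in entities:
        if label == "reinforcements":
            event.reinforcements.append(value)
        elif hasattr(event, label):
            setattr(event, label, value)
    return event


# Illustrative input strings, not a sample of the actual corpus.
sentences = ["齊師伐我", "伐木丁丁", "公如齊"]
candidates = candidate_sentences(sentences)
war_sentences = [s for s in candidates if is_war_sentence(s)]

# Hand-written stand-in for the entity recognizer's output.
event = fill_frame([("attacker", "齊"), ("defender", "魯"), ("reinforcements", "宋")])
```

Here the second string passes the trigger filter but is rejected by the exclusion rule, which is the reason for a two-stage design: the trigger list favors recall, and the rules restore precision. A frame filled this way maps directly onto a database row, with one column per slot.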
Keywords/Search Tags: Zuo Zhuan, information extraction, event extraction, entity identification, conditional random field, E-Charts