| Chinese classics record five thousand years of Chinese culture and carry the cultural lineage of China. The digitization of Chinese classics is the information processing and knowledge mining of ancient texts by means of natural language processing technology. Because the grammatical structure of ancient Chinese differs markedly from that of modern Chinese, and because annotated corpora are limited, extracting the structured knowledge contained in the classics with modern information technology has become the primary task of digitizing them. At present, research on the digitization of Chinese classics still focuses on text digitization, automatic word segmentation, automatic sentence segmentation, and part-of-speech tagging. There is relatively little research on knowledge mining and knowledge services for classical texts, and research on knowledge extraction is especially lacking. Event extraction offers a way to address this gap. However, research on event extraction methods for Chinese classics is still in its infancy and many problems remain. This thesis therefore targets the "Historical Records" and the "Zuo Zhuan" and improves on existing mainstream event extraction methods. The main work is as follows. The knowledge graphs that currently attract wide attention lack a portrayal of historical events and the relationships among them. Discovering these historical events and their internal connections quickly and accurately is of great significance for revealing the essence of history and discovering historical laws. To this end, this thesis proposes a historical event extraction method for the "Historical Records" based on the BERT model and the LSTM-CRF model, and builds an affair map of the "Historical Records" on top of it. Experiments comparing against current mainstream methods show that the F1 score of the method 
proposed in this thesis reaches 0.823. The affair map makes it possible to discover little-known knowledge contained in the "Historical Records", providing necessary data preparation for research by experts in philology, history, and sociology. In addition, because classical Chinese texts differ markedly from modern Chinese in grammatical structure, event extraction on them is constrained by the limited availability of annotated corpora. Using the "Zuo Zhuan" and the "Historical Records" as the experimental corpus, this thesis proposes an event extraction framework, based on the BERT and BiLSTM models, that integrates external knowledge. The experimental results show that the F1 score of this event extraction method reaches 0.874, 0.869, and 0.871, respectively, indicating that the method performs the event extraction task well. |