| Classical Chinese poetry has been passed down through thousands of years of Chinese civilization and is a valuable part of world culture. In ancient times, poetry not only let poets express their emotions but also reflected the social atmosphere and cultural outlook of the period. In-depth research on classical Chinese poetry can help us understand the thinking of ancient Chinese literati and promote traditional Chinese culture. With the development of digital humanities, data mining and analysis of classical Chinese poetry using artificial intelligence has gradually become a research hotspot. However, information about ancient poems is abundant but scattered across the Internet, which makes it difficult for people to obtain information and knowledge effectively. Knowledge graphs can connect fragmented information in different forms into a structured knowledge base. In addition, existing research on classical Chinese poetry often relies on large annotated corpora for supervised training, but building an annotated dataset usually requires many experts to annotate manually, which consumes considerable manpower and time. A pre-trained language model can learn good language representations from large unlabeled corpora, which benefits downstream tasks. Therefore, this paper studies the construction and application of a pre-trained model based on a knowledge graph of classical Chinese poetry, covering the following aspects: (1) This paper proposes a method for constructing a classical Chinese poetry knowledge graph based on sememe prediction. By predicting the sememes of words in classical Chinese poetry, the method analyzes the semantic information of those words and establishes semantic relations between words in ancient poetry and words in modern Chinese. Using this method, this paper constructs a semantic knowledge graph of classical
Chinese poetry containing 91,152 nodes and 203,395 edges. (2) This paper proposes a method for integrating the classical Chinese poetry knowledge graph into a pre-trained model, and trains a pre-trained model for the classical Chinese poetry domain on a classical Chinese poetry corpus together with that knowledge graph. The model learns not only the lexical and syntactic features of the poems but also the semantic features of the words in the knowledge graph, which provides better initialization vectors for subsequent tasks and speeds up convergence when training on downstream tasks. Experiments show that the pre-trained model outperforms the baseline model on poetry theme classification and poetry translation. (3) This paper builds a knowledge-graph-based visualization platform for big data analysis of ancient poetry. Its main functions include displaying the original corpus of classical Chinese poetry and prose, word-based analysis of classical Chinese poetry, displaying the classical Chinese poetry knowledge graph, analyzing the emotional themes of poems, and translating poems. The analysis results are presented to users as visual charts. |
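
To make the idea in aspect (1) concrete, the following is a minimal sketch of how predicted sememes could link words from ancient poetry to modern Chinese words in a graph. It is an illustration under stated assumptions, not the construction method of this paper: the function name `build_sememe_graph`, the node and edge labels, and the toy word-to-sememe mappings are all hypothetical, and the sememe labels are not taken from any real sememe resource or model output.

```python
from collections import defaultdict


def build_sememe_graph(classical, modern):
    """Build a toy word-sememe graph (hypothetical schema).

    classical, modern: dicts mapping a word to its set of sememes.
    For classical words these would come from a sememe predictor;
    here they are supplied by hand.
    Returns (nodes, edges) as sets of tuples.
    """
    nodes, edges = set(), set()
    sememe_to_modern = defaultdict(set)

    # Register modern words and index them by sememe.
    for word, sememes in modern.items():
        nodes.add(("modern", word))
        for s in sememes:
            nodes.add(("sememe", s))
            edges.add((("modern", word), "has_sememe", ("sememe", s)))
            sememe_to_modern[s].add(word)

    # Register classical words and relate them to modern words
    # that share at least one sememe.
    for word, sememes in classical.items():
        nodes.add(("classical", word))
        for s in sememes:
            nodes.add(("sememe", s))
            edges.add((("classical", word), "has_sememe", ("sememe", s)))
            for m in sememe_to_modern.get(s, ()):
                edges.add((("classical", word), "related_to", ("modern", m)))

    return nodes, edges


# Toy data, assumed for illustration only.
classical_words = {"婵娟": {"moon"}, "扁舟": {"ship", "small"}}
modern_words = {"月亮": {"moon"}, "小船": {"ship", "small"}, "轮船": {"ship"}}

nodes, edges = build_sememe_graph(classical_words, modern_words)
print(len(nodes), "nodes,", len(edges), "edges")
```

In this sketch a classical word is related to a modern word whenever they share any sememe. A real system would likely weight or filter these links, for example by the predictor's confidence, which the abstract does not specify.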