Analysis Of Literary Works Based On Machine Learning

Posted on: 2023-05-19
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Jiang
Full Text: PDF
GTID: 2555306914460234
Subject: Computer technology
Abstract/Summary:
In recent years, machine learning technologies, including deep learning, have been increasingly used in natural language processing, and benchmark results for classification, annotation, generation, and other natural language tasks are continually being surpassed. The novel is an important form of literature and is currently among the literary forms most in demand with the public. However, because novels are long and contain many characters and complex scenes, existing techniques have difficulty modeling such long texts. As a result, there is little in-depth semantic analysis of a novel's characters, scenes, and other semantic information, and little exploration of how that information can be used to improve novel continuation.

This thesis therefore takes the novel as its object of analysis and uses machine learning, mainly deep learning, to perform structured semantic analysis of novel text. The analysis comprises three parts:

1) To address the current difficulty of linking dialogue in a novel to its subject, this thesis proposes a subject extraction and prediction method that integrates rules with a generative model. The method first splits the novel text into sentences, then takes the context of each piece of dialogue into account to accurately extract the subject of that dialogue.

2) To address the lack of modeling of characters and character-relationship attributes, this thesis proposes a character and character-relationship embedding method based on a twin (Siamese) network. The method embeds characters and character relationships semantically from the characters' dialogue content, yielding vectorized representations of character information.

3) Because novel text is long and its scenes are complex and difficult to model, this thesis proposes a language-model-based method for recognizing scene-segmentation sentences. The method divides a novel into scenes by analyzing the content before and after each scene-segmentation sentence, so that short scenes can be extracted for modeling.

As an application of this structured semantic analysis, this thesis also proposes a continuation scheme based on the structured novel text. The scheme enriches the model input with additional semantic information to achieve better contextual consistency.

A representative novel evaluation data set was selected and constructed, and the above schemes were tested on it. The experimental results show that the proposed novel-structuring framework achieves good results on the tasks associated with each of its modules, and that deeper semantic information about novel text can be obtained with it. Novel continuation based on the structured text also scores well on multiple evaluation metrics and performs well in style consistency and in the consistency of characters and character relationships.

In addition, this thesis designs and implements a novel-text structuring system composed of four modules: sentence segmentation, subject extraction and prediction, character-related embedding, and scene segmentation. The system takes a novel as input and outputs structured text organized by scene and annotated with characters and their social attributes, together with the corresponding table of semantic embedding vectors. With this system, users can easily obtain fine-grained semantic information about novels, which can serve as a training corpus for novel continuation and dialogue generation.
Keywords/Search Tags:natural language understanding, neural networks, pretrained language models, natural language generation, novel continuation
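
The abstract describes subject extraction as a combination of rules and a generative model but does not give the rules themselves. The following is only a minimal sketch of what the rule-based half might look like, under several assumptions that are not taken from the thesis: the "subject" of a dialogue is read as its speaker, the text uses English-style straight quotation marks, and the names `SPEECH_VERBS`, `Utterance`, and `attribute_speaker` are hypothetical. Utterances for which no rule fires are returned with `speaker=None`, which is where a generative model could be asked to predict the speaker from context.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical rule set: the thesis does not list its actual rules.
SPEECH_VERBS = r"(?:said|asked|replied|shouted|whispered)"
NAME = r"([A-Z][a-z]+)"  # crude stand-in for a character-name matcher

QUOTE = re.compile(r'"([^"]+)"')
# Rule A: "quote," said Name
AFTER_QUOTE = re.compile(r'"[^"]+"\s*,?\s*' + SPEECH_VERBS + r"\s+" + NAME)
# Rule B: Name said, "quote"
BEFORE_QUOTE = re.compile(NAME + r"\s+" + SPEECH_VERBS + r'\s*,?\s*"[^"]+"')


@dataclass
class Utterance:
    text: str                # the quoted dialogue content
    speaker: Optional[str]   # None when no rule fires


def attribute_speaker(sentence: str) -> Optional[Utterance]:
    """Return the quoted dialogue in a sentence with a rule-based speaker.

    Returns None for sentences that contain no quoted dialogue.
    """
    quote = QUOTE.search(sentence)
    if quote is None:
        return None
    for rule in (AFTER_QUOTE, BEFORE_QUOTE):
        match = rule.search(sentence)
        if match:
            return Utterance(quote.group(1), match.group(1))
    # No rule matched: leave the speaker open for a downstream predictor.
    return Utterance(quote.group(1), None)


if __name__ == "__main__":
    sentences = [
        '"Where are you going?" asked Tom.',
        'Mary replied, "To the market."',
        '"It is late."',
        "The rain fell.",
    ]
    for s in sentences:
        print(attribute_speaker(s))
```

Keeping unresolved utterances explicit rather than guessing makes it easy to measure how much of a text the rules cover and to hand the remainder to a separate prediction step.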