
Speaker Identification And Its Application To Social Network Construction

Posted on: 2022-05-02  Degree: Master  Type: Thesis
Country: China  Candidate: H Y Dou  Full Text: PDF
GTID: 2480306326951489  Subject: Computer Science and Technology
Abstract/Summary:
Characters are one of the three elements of a novel, and dialogue is an important way to portray them. Characters' personalities, emotions and interpersonal relationships are all reflected in their dialogue. Using dialogue to analyze characters first requires attributing each quote to a specific character entity. This thesis therefore proposes a speaker identification model suited to Chinese novels. Starting from Jin Yong's novels, it first detects dialogue text, then applies machine learning classification to identify the speaker of each quote, and finally uses a dialogue chain method to construct the social network of the novel's characters, on which a multi-aspect character analysis is based. The research covers two areas, speaker identification and social network construction. The specific content is as follows:

(1) A speaker identification corpus suited to Chinese novels is constructed and analyzed. After a series of text preprocessing steps, a corpus of 31,733 quotations is annotated on Jin Yong's novels; to our knowledge it is the largest Chinese speaker identification corpus. Based on the statistical characteristics of the quotations, the thesis analyzes the language styles of different characters.

(2) A speaker identification model based on multi-feature classification is proposed. First, the dialogue patterns of Chinese novels are analyzed and well-performing feature templates are designed, including boolean, distance and statistical features. Then MLP, SVM and Perceptron models are used for speaker identification at the entity level. The experimental results show that the MLP model reaches an F1 value of 90.42%, which is 8.63% higher than that of the baseline model. A feature ablation experiment shows that the boolean features contribute the most to speaker identification. In a cross-novel speaker identification experiment on other works of Jin Yong, the F1 value reaches 84.04%. Finally, BERT and CRF models are used for speaker identification at the mention level, and the results show that the sequence labeling method is less effective than the multi-feature classification method.

(3) A method for constructing the social network of novel characters based on dialogue chains is proposed. First, the dialogue chain is defined and a segmentation method for it is given. The social network of characters is then constructed by extracting dialogue chains from the novel text after automatic speaker identification. Compared with the mainstream construction method based on character co-occurrence, the network built from dialogue chains is more concise and accurate. Finally, based on the constructed social network, the thesis analyzes the centrality and communities of the characters in the novel. In addition, a cross-novel character clustering analysis is conducted on the content of the quotations, which provides a new perspective for the quantitative analysis of fictional characters.
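The dialogue chain idea in (3) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual chain definition or segmentation rule, so everything here is an assumption made for illustration: a chain is taken to be a run of quotes whose paragraph indices are at most `max_gap` apart, an edge is added between the speakers of adjacent quotes in a chain, and centrality is plain degree centrality. The function names, the `max_gap` parameter and the speakers "A", "B" and "C" are hypothetical.

```python
from collections import Counter


def split_into_chains(quotes, max_gap=1):
    """Group quotes into chains (assumed rule, not the thesis's definition).

    quotes: list of (paragraph_index, speaker) tuples in text order.
    A new chain starts when the paragraph gap to the previous quote
    exceeds max_gap.
    """
    chains, current = [], []
    for para, speaker in quotes:
        if current and para - current[-1][0] > max_gap:
            chains.append(current)
            current = []
        current.append((para, speaker))
    if current:
        chains.append(current)
    return chains


def build_network(chains):
    """Count an undirected edge for each pair of adjacent, distinct speakers."""
    edges = Counter()
    for chain in chains:
        speakers = [speaker for _, speaker in chain]
        for a, b in zip(speakers, speakers[1:]):
            if a != b:
                edges[tuple(sorted((a, b)))] += 1
    return edges


def degree_centrality(edges):
    """Fraction of the other characters each character is connected to."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {name: 0.0 for name in neighbours}
    return {name: len(nb) / (n - 1) for name, nb in neighbours.items()}


# Hypothetical input: quotes already attributed to speakers A, B and C.
quotes = [(0, "A"), (1, "B"), (2, "A"), (10, "B"), (11, "C"), (12, "B")]
chains = split_into_chains(quotes, max_gap=1)
edges = build_network(chains)
print(dict(edges))               # {('A', 'B'): 2, ('B', 'C'): 2}
print(degree_centrality(edges))  # {'A': 0.5, 'B': 1.0, 'C': 0.5}
```

In this toy input A and C never speak in the same chain, so no edge links them. A co-occurrence method with a wide enough window could link them anyway, which is one way to read the claim that the dialogue chain network is more concise.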
Keywords/Search Tags: Fictional Character, Dialogue, Language Style, Speaker Identification, Feature Templates, Social Network