The Research Of The Author Problem Over 'A Dream In Red Mansions' Using Machine Learning Methods

Posted on: 2019-12-15
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhou
GTID: 2415330548973319
Subject: Probability theory and mathematical statistics
Abstract/Summary:
A Dream in Red Mansions is a gem of traditional Chinese literature and arguably a representative work of Chinese culture. However, its authorship has not been conclusively determined since its publication 200 years ago, and the question remains an active research topic. Many researchers have tried to settle it and have proposed various conclusions, the most popular of which is that the first 80 chapters and the last 40 chapters were written by two different authors. Most of these studies rely on traditional statistical methods such as hypothesis testing, and their conclusions about the author(s) of A Dream in Red Mansions are not conclusive. With the advent of the big data era, this thesis reanalyzes A Dream in Red Mansions using text mining methods in R. First, the 100 highest-frequency words in the book are extracted. These word-frequency data are then analyzed with three ensemble methods, Bagging, AdaBoost, and Rotation Forest, to investigate the authorship of the book. The results of the three methods agree with each other: all show that the first 80 chapters and the last 40 chapters differ in writing style. This result supports the conclusion that A Dream in Red Mansions was written by two authors.
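
The pipeline described in the abstract (high-frequency word features per chapter, then an ensemble classifier separating the first 80 chapters from the last 40) might be sketched roughly as follows. This is an illustrative Python sketch, not the thesis's R code. It assumes each chapter has already been segmented into a list of word tokens, and it uses decision stumps as the base learner for Bagging, which is an assumption. All function names are hypothetical, and AdaBoost and Rotation Forest are not shown.

```python
import random
from collections import Counter


def top_words(chapters, n=100):
    """Return the n most frequent words across all tokenized chapters."""
    counts = Counter(word for chapter in chapters for word in chapter)
    return [word for word, _ in counts.most_common(n)]


def frequency_features(chapter, vocabulary):
    """Relative frequency of each vocabulary word within one chapter."""
    counts = Counter(chapter)
    total = len(chapter) or 1  # avoid division by zero for an empty chapter
    return [counts[word] / total for word in vocabulary]


def fit_stump(X, y):
    """Find the one-feature threshold rule with the fewest training errors.

    Assumed base learner (not stated in the abstract). Labels are 0 or 1.
    """
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            for above in (0, 1):  # label predicted when feature > threshold
                errors = sum(
                    (above if row[j] > threshold else 1 - above) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, threshold, above)
    return best[1:]


def predict_stump(stump, row):
    """Apply a fitted stump (feature index, threshold, label above) to a row."""
    j, threshold, above = stump
    return above if row[j] > threshold else 1 - above


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Bagging: fit one stump per bootstrap resample of the training set."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in X]  # sample with replacement
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Majority vote over the stumps; returns 0 or 1."""
    votes = sum(predict_stump(stump, row) for stump in stumps)
    return 1 if 2 * votes > len(stumps) else 0
```

In such a setup, each chapter would be turned into a feature vector with `frequency_features(chapter, top_words(chapters))`, with label 0 for chapters 1 to 80 and label 1 for chapters 81 to 120. A held-out accuracy well above chance would then indicate a difference in writing style between the two parts.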
Keywords/Search Tags:Bagging, Adaboost, Rotation Forest, Text Analysis, Author of "Dream of Red Mansions"