Research On Structure Function Recognition Of Multi-level And Multi-field Academic Text Based On Deep Learning Model

Posted on: 2022-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Wang
Full Text: PDF
GTID: 2518306761991109
Subject: Automation Technology
Abstract/Summary:
Academic papers are both the main vehicle for presenting research results and an important source of information for researchers' scholarly communication and study. A paper consists mainly of front matter (title, author, abstract, keywords and other elements) and a main body (text, references and appendices). The structure function of an academic text describes and summarizes a paper's chapter structure and the function of each chapter from the perspective of the text content. Following the IMRAD model, it is usually divided into "Introduction", "Methods", "Results" and "Discussion". This division exposes the logical structure of a paper at a finer granularity, which helps researchers carry out deeper analysis. In fields such as library and information science, recognizing the structure function of academic texts has therefore become an important part of knowledge mining from academic papers. At present, most research on academic texts is still at an early stage, and many problems remain. This paper therefore selects academic texts of different levels and fields, builds a deep learning experimental environment, and improves on the existing mainstream methods for recognizing the structure function of academic texts. The main work is as follows:

The abstract of an academic paper is composed of several structures with specific functions, such as purpose, method and result. Few methods for recognizing abstract structure function have been studied, and those proposed perform poorly. In view of this, BiRNN, BiLSTM, BiLSTM-CRF and BERT are applied to the abstracts of 1,232 journal articles from the CNKI database. 5-fold cross-validation is used to reduce the influence of chance, and results are evaluated by F1-value and reported as "average ± standard deviation", which captures both average performance and stability. The comparison shows that BERT performs best among the four models, with the highest average and the lowest standard deviation, which indicates that it is well suited to recognizing abstract structure function.

Recognition of the structure function of academic literature is a research hotspot in knowledge mining and analysis of academic big data. Effective knowledge mining helps readers understand academic literature at a deeper, finer-grained level and promotes the development of semantic understanding of academic literature. This paper investigates paragraph-based structure function recognition by comparing the performance of CNN, LSTM and BERT, with the traditional machine learning algorithm SVM as a further point of comparison. Experimental results on a CNKI corpus show that BERT outperforms the SVM, LSTM and CNN models, with an F1-value of 0.66 for overall recognition and 0.79 for recognition of specific structure functions. In addition, a confusion matrix is introduced to analyze the misrecognitions. The error analysis shows that the BERT model can accomplish the task of structure function recognition for academic literature.

The structure function of academic literature is important for improving the performance of information retrieval, keyword extraction, citation analysis and other related applications, so its automatic recognition has become a valuable and practical research topic. In view of the rich semantic information contained in chapter content and the particularities of Chinese academic literature, this paper proposes a structure function recognition method based on chapter content that integrates character, word and radical features. The method uses a BiLSTM model with an attention mechanism to perform deep semantic extraction over the character, word and radical features of the chapter content. The chapter content of 750 academic papers in library and information science, selected from CNKI, serves as the experimental corpus, and the proposed method is compared with current mainstream methods. The experimental results show that the precision, recall and F1 score of the proposed method reach 0.75, 0.74 and 0.74 respectively, surpassing all compared methods and demonstrating the effectiveness of the proposed method. In addition, a confusion matrix is introduced to analyze the recognition errors of the proposed method and their causes.
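The evaluation protocol described in the abstract (5-fold cross-validation, F1 reported as average ± standard deviation, and a confusion matrix for error analysis) can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the thesis's code: the abstract does not say whether F1 is macro- or micro-averaged or which standard deviation is used, so macro-F1 and the sample standard deviation are assumed here. The names `confusion_matrix`, `macro_f1`, `k_fold_indices`, `majority_baseline` and `cross_validate` are hypothetical, and the majority-class baseline merely stands in for the BiLSTM or BERT models.

```python
from collections import Counter
from statistics import mean, stdev


def confusion_matrix(y_true, y_pred, labels):
    """Count matrix: rows are gold labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for gold, pred in zip(y_true, y_pred):
        matrix[index[gold]][index[pred]] += 1
    return matrix


def macro_f1(matrix):
    """Unweighted mean of per-class F1 (assumption: macro averaging)."""
    scores = []
    for i in range(len(matrix)):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                  # gold i, predicted as another class
        fp = sum(row[i] for row in matrix) - tp   # another class, predicted as i
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return mean(scores)


def k_fold_indices(n_items, k=5):
    """Yield (train, test) index lists; fold f tests every k-th item from f."""
    for fold in range(k):
        test = [i for i in range(n_items) if i % k == fold]
        train = [i for i in range(n_items) if i % k != fold]
        yield train, test


def majority_baseline(train_labels):
    """Hypothetical stand-in model: always predicts the most frequent label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _text: most_common


def cross_validate(texts, labels, label_set, k=5, fit=majority_baseline):
    """Return (average, sample standard deviation) of macro-F1 over k folds."""
    scores = []
    for train, test in k_fold_indices(len(texts), k):
        model = fit([labels[i] for i in train])
        gold = [labels[i] for i in test]
        preds = [model(texts[i]) for i in test]
        scores.append(macro_f1(confusion_matrix(gold, preds, label_set)))
    return mean(scores), stdev(scores)
```

Calling `cross_validate(texts, labels, ["purpose", "method", "result"])` on a labelled corpus returns the pair that would be reported as "average ± standard deviation". The per-fold confusion matrix also shows which structure functions are mistaken for which others: row i, column j counts items of gold class i predicted as class j.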
Keywords/Search Tags: Deep Learning, Recognition of Structure Function, Academic Literature, Research Level, Research Field