A Comparative Study Of Cross-disciplinary Multi-label Text Classification Methods

Posted on: 2022-03-05
Degree: Master
Type: Thesis
Country: China
Candidate: M Zhang
GTID: 2558307133488124
Subject: Library and Information Science
Background: The rapid development of science and technology has produced a large volume of research results, and information sharing has become more timely and efficient. The result is an explosive growth of information. On the one hand, with the rapid development of Internet technology and the advent of the big data era, the total amount of digitized text has grown rapidly. On the other hand, research fields increasingly intersect, and academic researchers need to understand how the disciplines of academic papers intersect in order to improve the efficiency of their research. Automated tools are therefore needed to help classify documents with multiple labels. At the same time, the National Natural Science Foundation of China has increasingly encouraged scholars to conduct interdisciplinary research. Research on cross-disciplinary multi-label classification is therefore necessary.

Purpose: This thesis applies machine learning to the abstracts of academic papers to address the problem of interdisciplinary multi-label text classification. Taking library and information science as an example, it aims to obtain an automated multi-label text classification method suited to the intersections of library and information science with three fields: management, mathematics, and computer science. Such a method would help researchers in library and information science judge the interdisciplinary fields of the literature in their discipline better and more quickly, give an overview of how the discipline intersects with management, mathematics, and computer science, and provide a reference for future interdisciplinary research and innovation directions in library and information science.

Methods: The thesis reviews the basic process of text classification, machine learning classification algorithms, and deep learning models. It then applies multi-label text classification methods based on machine learning and deep learning to the abstracts of library and information science papers, in order to determine which of the three disciplines (management, mathematics, and computer science) each paper intersects with. Three multi-label text classification models, LDA-SVM, TextCNN, and Bi-LSTM, were selected for comparison experiments, and their accuracy, precision, recall, and F1 score were compared to determine the best interdisciplinary multi-label classification model.

Results: Comparing the three models on both a subset of the data and the full data set shows that the machine learning method combining the LDA topic model with a support vector machine (SVM) classifier remains stable while maintaining classification quality, and can better solve the problem of interdisciplinary multi-label classification of papers based on abstract data. After this model was identified as the best, the abstracts of papers published in 2020 in core journals of library and information science were collected and classified with the LDA-SVM multi-label text classification model. The study found that, among management, mathematics, and computer science, library and information science intersects most closely with management, followed by computer science, while its intersection with mathematics is the sparsest. For future innovative research, scholars may consider cross-disciplinary topics that combine mathematics and computer science.
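
The abstract compares the three models on accuracy, precision, recall, and F1, but does not say how these are computed for multi-label output. The sketch below shows one common convention, assuming micro-averaged precision, recall, and F1 together with subset (exact-match) accuracy. The function name `multilabel_metrics` and the toy label sets are hypothetical illustrations, not data or code from the thesis.

```python
from typing import Dict, List, Set


def multilabel_metrics(y_true: List[Set[str]], y_pred: List[Set[str]]) -> Dict[str, float]:
    """Micro-averaged precision/recall/F1 and subset accuracy.

    Assumption: the thesis does not state its averaging scheme;
    micro-averaging is used here purely for illustration.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = exact = 0
    for true, pred in zip(y_true, y_pred):
        tp += len(true & pred)      # labels correctly predicted
        fp += len(pred - true)      # labels predicted but not true
        fn += len(true - pred)      # true labels that were missed
        exact += int(true == pred)  # whole label set matches exactly

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = exact / len(y_true) if y_true else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical gold and predicted discipline labels for four abstracts.
gold = [
    {"management", "computer science"},
    {"mathematics"},
    {"management"},
    {"computer science", "mathematics"},
]
predicted = [
    {"management"},
    {"mathematics"},
    {"management", "computer science"},
    {"computer science", "mathematics"},
]

scores = multilabel_metrics(gold, predicted)
print(scores)
```

On this toy input, five labels are predicted correctly, one is spurious, and one is missed, so micro-averaged precision, recall, and F1 are all 5/6, while subset accuracy is 0.5 because only two of the four label sets match exactly. Subset accuracy is the stricter measure, since a single wrong label makes the whole document count as an error.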
Keywords: Interdisciplinary, Multi-label classification, Text classification, Library and Information Science