Research On Automatic Classification Of Chinese Books Based On Pre-trained Models

Posted on: 2024-06-11 | Degree: Master | Type: Thesis
Country: China | Candidate: Y T Ou | Full Text: PDF
GTID: 2568307121983399 | Subject: Electronic information
Abstract/Summary:
The library is the center of learning resources at a university, and the construction and service capabilities of its resources strongly influence the development of the university's disciplines. In the wave of "Double First-Class" discipline construction, university libraries also shoulder important missions. In recent years, the development of new-generation artificial intelligence technology has made highly automated management of library resources possible. How to use artificial intelligence to achieve scientific cataloging and automatic classification of Chinese books has become an important topic in smart library research. However, because high-quality domain datasets are lacking and technical methods have been insufficiently applied, research on AI-based Chinese book classification is still at an exploratory stage. This thesis therefore focuses on automatic subject classification of Chinese books in university libraries. It constructs a dataset for subject classification of Chinese books and applies advanced natural language processing techniques to classify Chinese books automatically, providing technical support for automated management and for intelligent, accurate knowledge services in future smart libraries. The research work mainly includes:

(1) Construction of a dataset for subject classification of Chinese books. To address the inadequacies of the traditional Chinese Library Classification (CLC) for discipline-oriented knowledge services in universities, a mapping method from CLC to subject classification was established based on subject catalogs. The collection, circulation, and ordering data of university libraries were cleaned and completed using Python, and a dataset for subject classification of Chinese books with five labels was constructed, including 109 complete sub-discipline entries and 52,773 records.

(2) Automatic classification of Chinese books based on pre-trained models. A Chinese book classification model was built using state-of-the-art pre-trained language models from the field of natural language processing, and its performance was compared with that of traditional deep learning neural network models. Multiple sets of comparison experiments on public datasets and the self-built dataset verified the advantages of pre-trained language models based on multi-head self-attention mechanisms in text representation and feature extraction. They also demonstrated the effectiveness of the Chinese book classification model based on pre-trained models.

(3) Fine-grained Chinese book classification based on pre-trained models and feature fusion. The parameters of pre-trained models such as BERT were optimized for the Chinese book subject classification task. On this basis, PLM-LCN, a feature enhancement method for pre-trained models based on feature fusion, was proposed; it exploits the characteristics of different types of networks to strengthen the feature representation ability of pre-trained models. Ablation experiments and comparison experiments against multiple benchmark models verified that PLM-LCN improves classification performance and generalizes well.

(4) Design and development of an automated classification system for Chinese books in universities. Based on the actual demands of university libraries, the proposed Chinese book classification model was used to design and develop an automated classification system for Chinese books in universities. The system classifies books automatically based on their content and also supports subject-based book management, book retrieval, and book recommendation. It thereby provides technical support for the precise disciplinary knowledge services offered by university libraries.
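As a rough illustration of the CLC-to-subject mapping described in contribution (1), the following Python sketch resolves a CLC call number to a subject label by longest-prefix lookup. The abstract does not specify the mapping rules or the five labels, so the table `CLC_TO_SUBJECT`, the subject names, and the function `map_clc_to_subject` are all hypothetical and are not the thesis's actual method.

```python
from typing import Dict, Optional

# Hypothetical CLC-prefix -> subject table (an assumption for illustration only;
# the thesis derives its real mapping from subject catalogs).
CLC_TO_SUBJECT: Dict[str, str] = {
    "T": "Engineering",
    "TP": "Computer Science",
    "O": "Science",
    "O1": "Mathematics",
    "F": "Economics",
    "I": "Literature",
}


def map_clc_to_subject(
    clc_code: str, table: Dict[str, str] = CLC_TO_SUBJECT
) -> Optional[str]:
    """Return the subject for the longest CLC prefix found in `table`.

    Returns None when no prefix of the code is present in the table.
    """
    code = clc_code.strip().upper()
    # Try the full code first, then progressively shorter prefixes,
    # so that a specific entry such as "TP" wins over a general one such as "T".
    for end in range(len(code), 0, -1):
        subject = table.get(code[:end])
        if subject is not None:
            return subject
    return None
```

With this table, a code such as `TP391.1` would resolve through the `TP` entry rather than the broader `T` entry, and a code with no matching prefix returns `None` so that unmapped records can be flagged for manual review during data cleaning.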
Keywords/Search Tags: text classification, book classification, pre-trained model, deep learning, smart library