
Application Of Transfer Learning In Health Information Text Processing

Posted on: 2021-05-01 | Degree: Master | Type: Thesis
Country: China | Candidate: Z X Yu
GTID: 2428330602480213 | Subject: Social Medicine and Health Management
Abstract/Summary:
With the rapid development of Internet technology, the volume of online information has grown exponentially. Because strong supervision and restraint are lacking, rumors of all kinds are generated and spread rapidly. The unchecked spread of false health information in particular harms people's health, social stability, and national development. Text classification is a sub-task of natural language processing, but research on recognizing false health information in text remains very scarce. Previous research has focused on identifying rumors on Weibo, fake product reviews, spam, and fake news. Unlike ordinary rumors, false health information on the Internet contains a lot of medical terminology, and most of the texts mix true and false statements. These characteristics make large, effective labeled datasets very difficult to obtain, and labeling samples is very time-consuming.

In recent years, deep learning methods have been applied to related problems with good results, but they require a large amount of effective labeled data, which limits how well deep learning models can solve this problem. With the emergence of transfer learning, pre-trained models combined with fine-tuning have reduced the need for labeled data and are better suited to such problems. The main purpose of this work is to mine the characteristics of online health information and to propose a new method for detecting false health information. Medical and health information texts from the rumor encyclopedia of Guoke.com are used as data. The work models these texts, uses deep learning models and transfer learning models to make classification predictions, and determines whether each text is false health information. The main tasks are as follows:

1) Two text vectorization methods, the Word2vec model and the BERT model, are used to support feature extraction in the subsequent models. The BERT model is also used to train the universal language model. The knowledge in the universal language model is applied to medical information classification tasks through transfer learning.

2) This paper proposes a neural network model in which the Word2vec model and the BERT model are used to train word vectors that replace the embedding layer of the neural network. In addition, a combination of multiple models is used to improve the efficiency of the classification model.

3) This paper proposes a transfer learning model based on a universal language model. Relevant unlabeled data collected from the Internet is used as the dataset for training the universal language model, which is trained with the BERT model. The universal language model then serves as a knowledge base, and the BERT model and a simple neural network model are used to transfer the knowledge learned from it to medical information classification tasks, improving the accuracy and efficiency of medical information classification.
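The general idea of reusing pre-trained word vectors as a fixed embedding layer and training only a small classifier on top can be illustrated with a minimal sketch. It is not the thesis's model: the two-dimensional vectors in `PRETRAINED`, the four training texts, and their labels are all invented for illustration, and a plain logistic-regression head stands in for the neural network. In practice the vectors would come from a Word2vec or BERT model.

```python
import math

# Hypothetical toy "pretrained" word vectors (values invented for illustration).
# In practice these would be produced by a Word2vec or BERT model.
PRETRAINED = {
    "miracle": [1.0, 0.0], "cures": [0.9, 0.1], "instantly": [0.8, 0.0],
    "exercise": [0.0, 1.0], "lowers": [0.1, 0.9], "risk": [0.0, 0.8],
    "garlic": [0.5, 0.5], "cancer": [0.5, 0.5],
}
DIM = 2


def embed(text):
    """Average the frozen pretrained vectors of the known tokens in `text`."""
    vectors = [PRETRAINED[t] for t in text.lower().split() if t in PRETRAINED]
    if not vectors:
        return [0.0] * DIM
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(x, w, b):
    """Linear score of a feature vector under weights `w` and bias `b`."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_head(texts, labels, lr=0.5, epochs=300):
    """Train only the classifier head; the word vectors stay frozen."""
    features = [embed(t) for t in texts]
    w, b = [0.0] * DIM, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * DIM, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(score(x, w, b)) - y  # gradient of log loss w.r.t. score
            for i in range(DIM):
                grad_w[i] += err * x[i] / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(text, w, b):
    """Return 1 for 'false health information', 0 otherwise."""
    return 1 if sigmoid(score(embed(text), w, b)) >= 0.5 else 0


# Invented toy training set: label 1 marks a false health claim.
TEXTS = [
    "miracle garlic cures cancer instantly",
    "exercise lowers cancer risk",
    "garlic cures instantly",
    "exercise lowers risk",
]
LABELS = [1, 0, 1, 0]

W, B = train_head(TEXTS, LABELS)
```

Calling `predict("miracle cures", W, B)` classifies a new text using only knowledge carried by the frozen vectors plus the small trained head, which is why this kind of setup needs far fewer labeled examples than training every parameter from scratch.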
Keywords/Search Tags: network health information, text vectorization, transfer learning model, neural network model, pre-training