Research On The Improvement Of Multi-label Text Classification Algorithm For Offensive Language In Social Media

Posted on: 2024-03-28

Degree: Master

Type: Thesis

Country: China

Candidate: B L Guo

GTID: 2568307067463574

Subject: Engineering

Abstract/Summary:

Social media is widely used and user conduct varies greatly, so offensive speech such as cyberbullying, gender antagonism, abuse, and hate speech is prominent. It seriously affects users' physical and mental health, damages the online environment, and hinders the construction of a harmonious society. Precise classification of offensive texts in social media applications has therefore become an urgent task. Offensive texts on social media are diverse, with richer semantics and finer granularity, and traditional single-label text classification struggles to classify them accurately. To classify offensive texts accurately and at a fine granularity using their rich semantic information, we improve multi-label classification of offensive speech in three respects: order dependence, polysemy, and error propagation.

(1) A multi-label classification method for offensive texts in social media based on a joint embedding mechanism is proposed. The joint embedding mechanism guides the model to attend to the dependency between semantic information and label features in text sequences, the dependency between label features and keyword features, and the correlation between text semantic information and keyword information, thereby reducing the dependence of text sequences and label sequences on order. In addition, multi-task learning guides the model to learn in a targeted way for each task, improving its generalization ability. The experimental results show that the keyword information in the joint embedding mechanism effectively improves the model's F1 score and accuracy on the multi-label classification task.

(2) A multi-label classification method for offensive texts in social media based on a mutual information fusion mechanism is proposed. The mechanism learns the higher-order semantic association between semantic information and label features in text sequences, mitigates the polysemy of the same word in different contexts, helps the model learn deeper contextual semantic representations, and enhances classification performance. The experimental results show that the model exceeds the benchmark model in F1 score, and the ablation experiment verifies that integrating the higher-order semantic association between semantic information and label information in the text into training helps the model make multi-label classification decisions.

(3) A multi-label classification method for offensive texts in social media based on an attention enhancement mechanism is proposed. The method uses an expectation gate mechanism to guide the model to learn discriminative feature information, reduce the negative influence of error propagation, and improve classification accuracy. The experimental results show that the model outperforms the benchmark model on several performance indicators, and the ablation experiments verify that introducing the expectation gate mechanism during training lets the model learn features that are more sensitive to the classification decision.
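
As an illustration of why single-label classification falls short for offensive texts, the following minimal sketch contrasts a single-label decision (keep only the top-scoring label) with a multi-label decision (keep every label above a threshold) and scores both with micro-averaged F1. It is not the thesis's method: the label names are taken from the categories listed in the abstract, while the per-label scores, the 0.5 threshold, and all function names are hypothetical values chosen for the example.

```python
from typing import Dict, List, Set


def single_label(scores: Dict[str, float]) -> Set[str]:
    """Single-label decision: keep only the highest-scoring label."""
    return {max(scores, key=scores.get)}


def multi_label(scores: Dict[str, float], threshold: float = 0.5) -> Set[str]:
    """Multi-label decision: keep every label whose score reaches the threshold."""
    return {label for label, score in scores.items() if score >= threshold}


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives over all posts."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical per-label scores for two posts (not data from the thesis).
scores = [
    {"cyberbullying": 0.9, "gender_antagonism": 0.1, "abuse": 0.8, "hate_speech": 0.2},
    {"cyberbullying": 0.2, "gender_antagonism": 0.7, "abuse": 0.1, "hate_speech": 0.6},
]
# Hypothetical gold annotations: each post carries two labels at once.
gold = [{"cyberbullying", "abuse"}, {"gender_antagonism", "hate_speech"}]

single_preds = [single_label(s) for s in scores]
multi_preds = [multi_label(s) for s in scores]

print("single-label micro-F1:", round(micro_f1(gold, single_preds), 3))
print("multi-label micro-F1:", round(micro_f1(gold, multi_preds), 3))
```

On this toy input the single-label decision can recover at most one of the two gold labels per post, so its recall, and with it the micro-F1, is capped regardless of how good the scores are.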

Keywords/Search Tags:

multi-label text classification, multi-task learning, conditional label co-occurrence prediction, semantic association, dependencies

Related items

1	Research On Multi-label Text Classification By Integrating Label Informatio
2	Web-page Classification Method Based On Multi-instance Multi-label
3	Multi-label Text Classification Model Based On Correlation-guided Representation
4	Research On Multi-Label Text Categorization Based On Label Embedding Information
5	Research On Multi-label Classification Algorithms Based On Samples And Property Analysis
6	Research On Multi-label Classification Algorithm Based On Label Relationship
7	Research On Multi-Label Text Classification With Label-Dependency Information
8	Research On Text Classification Based On Interaction Between Text And Label Encoding
9	On Multi-label Text Classification Algorithms Based On Deep Learning
10	Research On Multi-Label Classification Algorithm Based On Co-Occurrence Relation