Research And Implementations Of Text Classification Based On Semantic Enhancing

Posted on:2022-01-31

Degree:Master

Type:Thesis

Country:China

Candidate:X Z Tang

Full Text:PDF

GTID:2518306572487474

Subject:Computer system architecture

Abstract/Summary:

The rapid development of natural language processing (NLP) methods has accelerated research on text classification, which includes manual, supervised, semi-supervised, and unsupervised classifiers. We focus on three issues in text classification. First, input texts contain many polysemous words, which makes it hard to capture accurate semantics with traditional contextual methods. Second, Chinese is a special information carrier in which several aspects of a character imply key semantics, including pinyin for pronunciation, wubi for structure, and radicals for components. Third, entities play an important role in the semantics of a text.

For the polysemy issue, we present a novel way of injecting factual knowledge into the BERT model. We employ an open-source knowledge base to query the adjacent neighbors of entities, treat them as potential meanings, and select the one with the highest cosine score against the average vector of the text. We evaluate our model on QQP, SQuAD, and other datasets. The experimental results show that our model outperforms previous models on SQuAD and NER.

For the multiple expressions of text semantics in Chinese, we propose a semantic fusion framework based on multiple granularities. First, we use open-source analysis tools to generate the radical, pinyin, and wubi sequences. Second, we propose a novel classifier that combines Chinese character, pinyin, wubi, and radical representations. We evaluate our model on four widely used Chinese datasets and compare it in detail with other state-of-the-art (SOTA) methods, including LSTM, BERT, and its successors. The experimental results indicate that the multi-granularity fusion architecture outperforms other standard classifiers in Chinese text classification.

For the importance of entities to the semantics of a text, we propose an entity-aware framework. Intuitively, entities play an important role in the semantics of a sentence, and the relationships among them can be organized as a non-Euclidean graph structure, so we propose an entity-aware GCN to encode entity information into the prediction model and improve the effectiveness of the text classifier. The experimental results show that the entity-aware method proposed here outperforms ordinary text classification methods.
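
The candidate-selection step described for polysemous words (pick the knowledge-base neighbor whose vector has the highest cosine score against the average vector of the text) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the two-dimensional toy vectors, and the candidate labels are all hypothetical, and in practice the vectors would presumably come from BERT or a knowledge-base embedding.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Average a non-empty list of equal-length vectors component-wise."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; defined as 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_meaning(token_vectors: List[Vector],
                   candidates: Dict[str, Vector]) -> str:
    """Return the candidate name closest to the text's average vector."""
    text_vec = mean_vector(token_vectors)
    return max(candidates, key=lambda name: cosine(candidates[name], text_vec))


# Hypothetical toy data: token vectors of the input text, and two
# candidate meanings retrieved as neighbors of an ambiguous entity.
token_vectors = [[0.9, 0.1], [0.8, 0.3]]
candidates = {
    "apple_fruit": [0.1, 1.0],
    "apple_company": [1.0, 0.2],
}
best = select_meaning(token_vectors, candidates)
print(best)
```

With these toy vectors the text average points mostly along the first axis, so the "apple_company" candidate is selected. How the chosen meaning is then injected into BERT is not specified in the abstract and is not shown here.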

Keywords/Search Tags:

Semantic Correction, Text Classification, Entity-Aware Encoding

Related items

1	Research On Short Text Classification Based Upon Convolution Feature Encoding And Attention Mechanism
2	Construction Of Web Community Text Entity Relations Map Based On Semantic Elements
3	Research And Application On Semantic Relateness Based On AF Model
4	Algorithm Research On Text Classification And Named Entity Recognition Based On Deep Text Feature Representation
5	Text Correction For ASR Result On The Platform Of Intelligent Mobile Phone
6	Research On Ontology-Based Semantic Text Categorization
7	Research On Label-aware Text Classification Methods
8	Research On Chinese Text Classification Based On Semantic Analysis
9	Research On Named Entity Recognition And Disambiguation Based On Network Semantic Resource
10	Image Correction And Text Recognition For Deformation Label Of Storage Package