Research on Suggestion Sentence Recognition and Suggestion Information Extraction

Posted on: 2021-10-19
Degree: Master
Type: Thesis
Country: China
Candidate: H J Wang
Full Text: PDF
GTID: 2518306107493574
Subject: Engineering
Abstract/Summary:
Suggestions are a complex linguistic phenomenon that appears widely in natural language text and often carries rich, useful information. Mining suggestions from text, that is, automatically identifying them and extracting their key information, greatly increases the value of the information obtained. In industrial settings, suggestion mining can surface feedback that helps a company improve its products or services, and it can give users a reference when choosing a product or service. It has become a new research hot spot in natural language processing and matters to both industry and academia. Because corpora are limited, however, suggestion mining remains relatively unexplored, and related work usually defines it only as the task of suggestion sentence recognition.

Since a suggestion corpus contains finer-grained information worth mining, this thesis proposes, for the first time, to divide suggestion mining into two stages. The first stage is the original task of recognizing suggestion sentences in text (sentence classification). The second stage is an extension task, suggestion information extraction (sequence labeling). The main research content covers the following two aspects:

(1) A hybrid model is proposed to recognize suggestion sentences in an English corpus. It combines BERT, a bidirectional long short-term memory network (BiLSTM), and a capsule network with an attention mechanism. The model addresses the inability of convolutional neural networks to extract deeper information such as phrase semantics and position, as well as the weakness of BiLSTMs on long-distance dependencies. It uses BERT pre-trained on the target corpus to obtain stronger word embeddings. Experiments on the corpus of the 13th International Semantic Evaluation (SemEval) show that the model substantially improves suggestion sentence recognition and generalizes to some degree across text domains.

(2) Because no corpus existed for the second-stage task, a new corpus was annotated for this thesis. A BiLSTM with double embedding from BERT and a CNN, combined with a hybrid semi-Markov conditional random field, is proposed to extract suggestion information from the annotated English corpus. The model avoids the heavy reliance of conditional random fields on manual feature extraction, integrates contextual and character-level features through the double embedding, and makes effective use of past and future input features as well as sentence-level tagging information. Experiments on the annotated English corpus show that the model achieves good results on suggestion information extraction.
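The two-stage split described above (sentence classification, then sequence labeling) can be illustrated with a minimal Python sketch. This is not the thesis's model: the keyword rule in `is_suggestion` is only a placeholder for the neural classifier, and the BIO tagging scheme, the `SUG` label, the cue-word list, and the example sentence are all assumptions made for illustration, since the abstract does not state the annotation format.

```python
from typing import List, Tuple

# Hypothetical cue words; a stand-in for the trained stage-1 classifier.
CUE_WORDS = {"should", "recommend", "suggest", "please", "wish"}


def is_suggestion(tokens: List[str]) -> bool:
    """Stage 1 (sentence classification): placeholder keyword rule."""
    return any(tok.lower() in CUE_WORDS for tok in tokens)


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Stage 2 (sequence labeling): convert BIO tags to (label, text) spans.

    An I- tag that does not continue an open span of the same label is
    treated as outside any span.
    """
    spans: List[Tuple[str, str]] = []
    label = None
    words: List[str] = []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [tok]
        elif tag.startswith("I-") and label == tag[2:]:
            words.append(tok)
        else:
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


# Assumed example: tags a stage-2 tagger might emit for one sentence.
tokens = ["The", "app", "should", "add", "a", "dark", "mode"]
tags = ["O", "O", "O", "B-SUG", "I-SUG", "I-SUG", "I-SUG"]

if is_suggestion(tokens):
    print(bio_to_spans(tokens, tags))
```

Only sentences accepted in stage 1 are passed to stage 2, which is why the second stage can be trained on a smaller, separately annotated corpus.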
Keywords/Search Tags: Suggestion Mining, BERT, BiLSTM-Capsule, BiLSTM-HSCRF, Attention Mechanism