Research On Name Recognition Technology Of Bidding Project Based On Deep Learning

Posted on:2021-02-22

Degree:Master

Type:Thesis

Country:China

Candidate:K Zhang

GTID:2428330614961609

Subject:Computer software and theory

Abstract/Summary:

The Internet provides a vast amount of data, most of it in text form, and making full use of that text poses many challenges. Tender announcements are one such kind of data; they are widely available on government procurement websites at all levels in China. A tender announcement usually consists of a title and a body. Although the title states the name of the project being tendered, it also contains auxiliary content such as the project unit and the project location. With tens of thousands of new records arriving every day, identifying and extracting a more concise project name from each title helps improve data query and data analysis.

Deep learning is an effective approach to processing text data. To handle the diversity of tender announcement titles, we choose a Transformer-based model for feature extraction and propose a Transformer-att-label model with joint labeling. During feature extraction, a traditional attention mechanism is used to combine the multiple attention heads, which lets the model give more attention to important information and improves its performance. The Transformer computes the probability that a word belongs to each label. Combined with label embeddings, the N highest label probabilities are used as weights on the corresponding label vectors, and the weighted sum serves as the semantic vector of the word's predicted label. The distance between this semantic vector and each label vector is then computed, and the closest label is output.

Further, to address polysemy and the lack of training data in tender announcement titles, this thesis proposes the BERT-BiLSTM-label-CRF model. BERT is built on a bidirectional Transformer and uses masking to better determine the meaning of a word from its context. It is also pre-trained on a very large corpus, so it can achieve better results when little training data is available. We use the BERT model to train word vectors, add a BiLSTM model to extract features suited to the NER task, and finally apply the joint labeling method to perform sequence labeling under CRF constraints, improving project name recognition. We evaluate the proposed models on a dataset of announcement titles from a Chinese bidding website and compare their recognition performance with that of other mainstream models to verify the effectiveness of the method.
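
The joint labeling step described in the abstract (weight the label vectors by the highest label probabilities, then output the nearest label) can be sketched roughly as follows. This is a minimal illustration and not the thesis implementation: the function names `softmax` and `joint_label`, the use of a softmax over raw scores, the Euclidean distance, and the plain-list representation of label embeddings are all assumptions, since the abstract does not specify them.

```python
import math


def softmax(scores):
    """Turn raw per-label scores into probabilities (assumed; the abstract only says 'probability')."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_label(scores, label_vectors, top_n):
    """Return the index of the label whose embedding is closest to the
    probability-weighted sum of the top_n most probable label embeddings.

    scores        -- one raw score per label for a single word (hypothetical input)
    label_vectors -- one embedding (list of floats) per label, all the same length
    top_n         -- how many of the most probable labels contribute to the sum
    """
    probs = softmax(scores)

    # Indices of the top_n most probable labels.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_n]

    # Semantic vector: each selected label vector weighted by its probability.
    dim = len(label_vectors[0])
    semantic = [sum(probs[i] * label_vectors[i][d] for i in top) for d in range(dim)]

    # Euclidean distance is an assumption; the abstract does not name the metric.
    distances = [math.dist(semantic, vec) for vec in label_vectors]

    # Output the label whose embedding is closest to the semantic vector.
    return min(range(len(distances)), key=distances.__getitem__)
```

For example, with three one-hot label vectors `[[1, 0, 0], [0, 1, 0], [0, 0, 1]]`, scores `[2.0, 1.0, 0.0]`, and `top_n=2`, the function returns index 0. With learned (non-orthogonal) label embeddings the selected label can differ from the most probable one, which is where the label embedding contributes information beyond the raw probabilities.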

Keywords/Search Tags:

Transformer, label, BERT, BiLSTM, CRF

Related items

1	Research On Sentiment Analysis Based On BERT-BiLSTM Adversarial Training
2	Research On Suggested Sentence Recognition And Suggested Information Extraction
3	Research On Chinese Text Sentiment Analysis Based On Transformer And BERT Model
4	Research On Automatic Question Answering Technology Based On Transformer
5	Application Of Bert In Chinese Company Name Recognition
6	Algorithm And Application Of Text Classification Based On Transformer
7	Research On E-business Review Sentiment Analysis Algorithm Based On Deep Learning
8	Research And Improvement Of Text Similarity Calculation Method
9	Algorithm Research For Chinese Text Multi-label Classification
10	Research On The Construction Of Patent Knowledge Graph Based On Natural Language Processing