Research On The Application Of Entity Recognition In The Question Answering System In The Field Of “Four Insurances And One Fund”

Posted on:2023-08-31

Degree:Master

Type:Thesis

Country:China

Candidate:Y P Sha

Full Text:PDF

GTID:2558306905486934

Subject:Computer Science and Technology

Abstract/Summary:

Named entity recognition aims to identify entities with specific meanings in text, including names of people, places, and organizations, as well as other proper nouns. It plays an important role in many downstream natural language processing tasks. Since words are the basic unit of English, the input to English entity recognition is a sequence of words. Chinese entity recognition methods, by contrast, mostly accept characters as input, so the methods for the two languages differ considerably. However, named entity recognition methods that take a character sequence as input cannot use lexicon information, which can help the model identify entities more accurately. How to make full use of lexicon information in character-based entity recognition has therefore become a research hotspot in Chinese entity recognition in recent years.

“Four Insurances and One Fund” is a very important social insurance system in China, but there has been little research on natural language processing applications in this field. To meet people's need to acquire knowledge about “Four Insurances and One Fund”, it is important to construct a field knowledge base of policies and regulations and, based on this knowledge base, a question answering system for the field.

The main contributions of this paper are as follows:

(1) By studying the characteristics of the texts of policies and regulations in the field of “Four Insurances and One Fund”, a part-of-speech combination rule is proposed to quickly filter out terminology in the field. The filtered terminology is used as a custom dictionary for the word segmentation tool to segment the text of policies and regulations, and the result is combined with manual labeling to construct a machine learning entity recognition dataset for the field of “Four Insurances and One Fund”.

(2) To address the problem that character-level entity recognition cannot use lexicon information, and given that terminology in texts in the field of “Four Insurances and One Fund” tends to be long, an entity recognition method that combines characters and words is proposed. This method uses word frequency, improved mutual information, and backward-forward word frequency as lexicon information, which makes it possible to better assign weights to all words matched by each character. By introducing an attention mechanism, it dynamically adjusts the weight of each part of the lexicon information according to the semantics. Experimental results on three datasets show that, compared with some existing methods, the entity recognition method proposed in this paper achieves a higher F1 value.

(3) Using the entity recognition method proposed in this paper, field terminology is identified from the text of policies and regulations in the field of “Four Insurances and One Fund”, and a field dictionary of “Four Insurances and One Fund” is constructed with the help of a web crawler. Based on the field dictionary and the field entity recognition model, a field question answering system for “Four Insurances and One Fund” that can answer conceptual questions is then constructed.
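As a rough illustration of what "assigning weights to all words matched by each character" in contribution (2) can mean, the sketch below matches lexicon words against a character sequence and normalizes their frequencies into weights. It is a minimal sketch under stated assumptions, not the method of the thesis: it uses plain word frequency only (no improved mutual information, no backward-forward frequency, no attention), and the function names `match_words` and `frequency_weights`, the `max_len` parameter, and the sample counts are all hypothetical.

```python
from collections import defaultdict


def match_words(sentence, lexicon, max_len=6):
    """Map each character index to the lexicon words that cover it.

    `lexicon` is any container supporting `in` (a set, or a dict of counts).
    Only words of two or more characters are considered.
    """
    matches = defaultdict(list)
    n = len(sentence)
    for start in range(n):
        for end in range(start + 2, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word in lexicon:
                # Every character inside the span is covered by this word.
                for i in range(start, end):
                    matches[i].append(word)
    return dict(matches)


def frequency_weights(words, freq):
    """Normalize word frequencies into weights that sum to 1.

    Assumption: raw frequency is the only signal; the thesis combines
    several statistics and adjusts them with attention.
    """
    unique = list(dict.fromkeys(words))  # drop duplicates, keep order
    total = sum(freq[w] for w in unique)
    if total == 0:
        return {}
    return {w: freq[w] / total for w in unique}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    freq = {"住房": 5, "公积金": 3, "住房公积金": 2}
    sentence = "住房公积金缴存"
    matched = match_words(sentence, freq)
    for i, ch in enumerate(sentence):
        print(ch, frequency_weights(matched.get(i, []), freq))
```

In a character-based model, weights of this kind would typically be used to pool the embeddings of the matched words into one lexicon feature per character, which is then combined with the character representation.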

Keywords/Search Tags:

Named entity recognition, Lexicon information, “Four insurances and One Fund”, Field dictionary, Field question answering system

Related items

1	Research On Relation Extraction And Its Application In Question Answering System In The Field Of "Four Insurances And One Housing Fund"
2	Research On Machine Reading Comprehension And Its Application In Question Answering System In The Field Of "Four Insurances And One Housing Fund"
3	Text Classification Method And Its Application In The Field Of Four Insurances And One Housing Fund
4	Research On "Four Insurances And One Housing Fund" Question Answering System And Question Intent Classification Technology
5	Research On Entity Recognition And Intent Analysis Method In Medical Field Based On A Lite Bert
6	Research On Internet Based Chinese Question-Answering System
7	Research And System Construction Of Named Entity Recognition Algorithm Based On Deep Learning
8	Question Answering System Based On Web Search
9	Research On The Key Technology Of Named Entity Recognition And Relation Extraction In Military Field
10	Named Entity Recognition Based On Conditional Random Fields Chinese Research