Research Of Entity Recognition And Entity Linking Based On Deep Learning

Posted on: 2021-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Ding
Full Text: PDF
GTID: 2428330632462918
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, online data is growing explosively. The growth of text data in particular has created a serious "information overload" problem: massive amounts of redundant, fake, and noisy information on the Internet make it harder to find and browse useful information. People urgently need automated techniques that process massive data and extract less noisy, more informative content from it. Related technologies at the current stage include information extraction and abstract summarization. As basic technologies in the field of information extraction, entity recognition and entity linking identify key entities in cluttered text, structure the text into an entity-centric semantic representation, and provide an efficient and convenient means of analyzing large volumes of unstructured text.

Entity recognition is the task of identifying certain types of entities in text. Most existing entity recognition work identifies people, places, organizations, and similar types; few studies address other entity types, and entities related to percentage numbers in particular. Percentage-related entity recognition also poses its own problems and challenges. For example, the entity itself can be incomplete, which traditional entity recognition methods cannot handle. The task matters because it helps machines understand what a percentage number means, and the resulting technology can be used in text analysis to generate visual charts of percentages or to assist intelligent question answering.

Entity Linking (EL) aims to automatically resolve mentions of entities in a document to the real-world entities they represent. State-of-the-art EL methods typically use local contextual information to obtain mention embeddings, compare them with candidate entity embeddings, and then apply a Conditional Random Field (CRF) for collective EL that considers global coherence. An inherent drawback of these methods is that the global semantic relationships among the candidate entities in the same document are not encoded in the embedding process. As such, the resulting embeddings may not be sufficient to capture the global coherence effect.

Based on the above, this thesis carries out the following two pieces of research.

Rewriting-based Quantitative Entity Extraction: We study the problems and challenges of quantitative entity extraction for text with percentages, for example entities composed of multiple discontinuous text spans. To address them, we propose a rewriting-based method that uses existing labeled data to train a sentence rewriting model, which rewrites complex sentences containing multiple percentages into multiple relatively simple clauses and thereby alleviates these problems. We also use reinforcement learning to jointly optimize the sentence rewriting model and the entity extraction model. Experimental results show that our approach outperforms competitive baselines.

Graph Convolutional Network based Entity Linking: To address the shortcomings of existing methods, we propose a novel end-to-end graph neural entity linking model. A heterogeneous entity-word graph is first constructed for a document, encoding the relationships between entities in the document. A graph convolutional network (GCN) is then applied to the entity-word graph, dynamically generating a new set of entity embeddings enhanced with semantic information from related entities and words. These dynamic entity embeddings give the model stronger global semantic coherence. On top of the GCN, a conditional random field (CRF) is adopted to combine local and global information for collective entity linking. Extensive experiments demonstrate the efficiency and effectiveness of our method compared with several state-of-the-art EL methods.
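
To make the GCN step more concrete, the following is a minimal sketch of a single graph-convolution layer applied to a toy entity-word graph. It is not the thesis's implementation: the symmetric normalization with self-loops, the ReLU activation, the node names, the edges, and the two-dimensional feature vectors are all assumptions chosen for illustration, and the abstract does not specify any of them.

```python
import math

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix.

    Adding self-loops and normalizing symmetrically is one common GCN
    choice; it is assumed here, not taken from the thesis.
    """
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

def gcn_layer(adj, features, weight):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    a_norm = normalize_adjacency(adj)
    hidden = matmul(matmul(a_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]

# Hypothetical entity-word graph: nodes 0-1 are candidate entities,
# nodes 2-3 are context words. Edges are illustrative only.
nodes = ["entity:Paris_(city)", "entity:France", "word:capital", "word:river"]
adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, so only the graph mixing is visible

updated = gcn_layer(adj, features, weight)
entity_embeddings = updated[:2]  # entity rows now mix in neighbor information
```

After one layer, each entity row is a weighted average of its own features and those of its neighboring entities and words, which is the sense in which the embeddings become "dynamic". In a collective linking model of the kind described, embeddings like these would then feed a scoring layer such as a CRF; that part is not sketched here.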
Keywords/Search Tags: Entity Recognition, Sentence Rewriting, Entity Linking, Graph Convolutional Network