A Siamese Recurrent Neural Network For Entity Alignment

Posted on:2019-07-19

Degree:Master

Type:Thesis

Country:China

Candidate:Y Lv

Full Text:PDF

GTID:2348330545477895

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of modern information technology, the volume of human-generated data has grown explosively. Because unified data specifications are lacking, and because system and human errors occur during data collection, storage, and use, this mass of data contains many inconsistencies and much redundancy. As a result, conclusions drawn from the data can be wrong or even contradictory, and improving data quality has become an active topic in information science research.

Entity alignment, the task of determining whether two records refer to the same entity, has wide application in both data cleaning and data integration, and it can be used to resolve inconsistencies in data sets. Traditional approaches either use string metrics to compute a matching score for two records, or apply a conventional machine learning technique to features manually extracted from pairs of records. The effectiveness of these methods, however, depends largely on designing good domain-specific string metrics or on manually extracting discriminative features. Traditional learning-based methods also often ignore contextual semantic information in text data when constructing features.

In this thesis, we study the application of recurrent neural networks to entity alignment and propose two basic entity alignment methods based on siamese recurrent neural networks: Word-based MaLSTM and Character-based biLSTM. Both methods are end-to-end deep network models that use a recurrent neural network to capture contextual semantic features from the data automatically, and neither requires any string metric. To address the information gaps between attribute fields in data (for example, citation data), we further propose a new entity alignment method based on a joint multi-field siamese recurrent neural network, JMFS RNN. According to each attribute field's text characteristics, JMFS RNN uses different recurrent neural network cells to capture each field's features and then combines all of the captured features. This field-specific processing both mines the features of each attribute field effectively and avoids the influence of the information gaps.

We compare the three proposed methods with several traditional entity alignment methods on two public data sets, Cora and Citeseer. The experimental results show that our methods learn discriminative features effectively and outperform the traditional methods.
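
The siamese idea behind the two basic methods can be illustrated with a minimal sketch. The abstract does not give the network details, so everything below is an assumption made for illustration: the similarity exp(-L1 distance) is the one commonly associated with Manhattan LSTM models, the encoder is a toy tanh recurrent cell standing in for an LSTM or biLSTM, and the names `ToyCharRNN`, `manhattan_similarity`, and `siamese_score` are hypothetical. The weights are random and untrained, so the scores only show the mechanics: both records pass through the same encoder, and their final hidden states are compared.

```python
import math
import random


def manhattan_similarity(h_a, h_b):
    """Return exp(-L1 distance): 1.0 for identical vectors, smaller as they diverge."""
    l1 = sum(abs(a - b) for a, b in zip(h_a, h_b))
    return math.exp(-l1)


class ToyCharRNN:
    """Toy character-level recurrent encoder (assumed stand-in for an LSTM).

    Weights are random and never trained; a real model would learn them
    from labelled pairs of matching and non-matching records.
    """

    def __init__(self, hidden_size=8, n_buckets=64, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.n_buckets = n_buckets
        # One input vector per character bucket (a crude character embedding).
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
                     for _ in range(n_buckets)]
        # Recurrent weights applied to the previous hidden state.
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
                      for _ in range(hidden_size)]

    def encode(self, text):
        """Read the text one character at a time and return the final hidden state."""
        h = [0.0] * self.hidden_size
        for ch in text.lower():
            x = self.w_in[ord(ch) % self.n_buckets]
            h = [math.tanh(x[j] + sum(w * h_k for w, h_k in zip(self.w_rec[j], h)))
                 for j in range(self.hidden_size)]
        return h


def siamese_score(encoder, record_a, record_b):
    """Encode both records with the SAME encoder (shared weights) and compare them."""
    return manhattan_similarity(encoder.encode(record_a), encoder.encode(record_b))


if __name__ == "__main__":
    encoder = ToyCharRNN()
    # Identical after lowercasing, so the hidden states match and the score is 1.0.
    print(siamese_score(encoder, "Deep Learning", "deep learning"))
    # Different records give a score strictly between 0 and 1.
    print(siamese_score(encoder, "Deep Learning", "Data Quality"))
```

Sharing one encoder between the two inputs is what makes the network siamese: the score is symmetric in the two records, and no hand-designed string metric is involved. A multi-field variant in the spirit of JMFS RNN would presumably keep a separate encoder per attribute field and combine the per-field features, but the abstract does not say how the combination is done.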

Keywords/Search Tags:

Data Inconsistency, Data Quality, Entity Alignment, Siamese Recurrent Neural Network

Related items

1	Robust Machine Learning Algorithms For Data Quality Management
2	Research On Cross-network Entity Alignment Based On Multi-source Interaction Fusion
3	Research On Object Tracking Algorithms Steered By Recurrent And Siamese Neural Network
4	Research On Named Entity Recognition Of Chinese Image Reports Based On Recurrent Neural Networks
5	Key Problem Research Of Data Quality In Big Data
6	Research On Algorithms Of Big Data's Consistency Quality Analysis
7	Similar Text Discrimination Based On Siamese Network
8	Video Quality Evaluation Based On Recurrent Neural Network
9	Domain Named Entity Recognition Method Based On Recurrent Neural Network
10	Research On Entity Alignment Method For Linked Open Data