
Unsupervised Relation Extraction Based On Matrix Factorization

Posted on: 2019-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: J M Huang
Full Text: PDF
GTID: 2428330545486958
Subject: Computer software and theory
Abstract/Summary:
Relation extraction is one of the fundamental tasks of information extraction. Traditional supervised and semi-supervised methods require labelled training data or an existing knowledge base, which limits their use in new domains without any prior information. Unsupervised methods, by contrast, treat the task as a clustering problem and can extract new relation instances from a corpus based solely on context. Yet previous unsupervised models are structurally complex and perform poorly, because the co-occurrence matrix of entity pairs and relation mentions is high-dimensional and sparse. In addition, they represent the semantics of relation mentions with discrete feature vectors built from hand-crafted feature sets, which are likewise high-dimensional and sparse; this increases both the complexity of the model and the sparsity of the co-occurrence matrix. We therefore propose a new unsupervised relation extraction model from the perspective of matrix factorization. It aims to reduce the complexity of the process and to incorporate new semantic information, yielding better training efficiency, flexibility and performance. The method consists of three parts. First, we propose a co-occurrence matrix factorization method with negative sampling: entity pairs are embedded in a relation space, and negative sampling both lowers the computational cost and makes full use of the limited observed co-occurrences. Second, we propose a multi-layer matrix factorization method that introduces deeper semantic information. The multi-layer decomposition avoids the extra complexity that the added representations would otherwise bring, and the relation-mention embeddings, built from word embeddings, are low-dimensional dense vectors free of noise from external NLP tools. Finally, we propose a neural relation extraction model, NURE-DSE, which combines the two methods and is trained with back-propagation. It inherits the benefits of both, computes parameter gradients automatically, and is simple, efficient and flexible enough to fit web-scale corpora. Experimental results on the NYT10 dataset demonstrate the effectiveness of our method: it outperforms existing methods in F1 score and yields expressive embeddings of entity pairs in the relation space.
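To make the first component concrete, the following is a minimal sketch (not the thesis's actual implementation) of factorizing an entity-pair/relation-mention co-occurrence matrix with negative sampling; the matrix sizes, learning rate, and number of negatives are illustrative assumptions.

```python
import numpy as np

# Toy, self-contained sketch: indices, dimensions and hyper-parameters are
# illustrative assumptions, not values from the thesis.
rng = np.random.default_rng(0)
n_pairs, n_mentions, dim = 100, 80, 16
# observed (entity-pair, relation-mention) co-occurrences from a corpus
observed = [(int(rng.integers(n_pairs)), int(rng.integers(n_mentions)))
            for _ in range(500)]

P = 0.1 * rng.standard_normal((n_pairs, dim))     # entity-pair embeddings
R = 0.1 * rng.standard_normal((n_mentions, dim))  # relation-mention embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr, k_neg = 0.05, 5  # learning rate, negatives drawn per observed cell
for epoch in range(10):
    for i, j in observed:
        # positive step: raise the score of an observed co-occurrence
        g = sigmoid(P[i] @ R[j]) - 1.0
        P[i], R[j] = P[i] - lr * g * R[j], R[j] - lr * g * P[i]
        # negative sampling: lower the score of randomly drawn mentions,
        # avoiding a full pass over the sparse co-occurrence matrix
        for j_neg in rng.integers(n_mentions, size=k_neg):
            g = sigmoid(P[i] @ R[j_neg])
            P[i], R[j_neg] = P[i] - lr * g * R[j_neg], R[j_neg] - lr * g * P[i]
```

After such training, entity pairs that co-occur with similar relation mentions lie close together in the learned relation space, so a standard clustering step over the rows of P yields the unsupervised relation clusters.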
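Likewise, the combined idea behind the neural model can be sketched as follows: entity-pair embeddings on one side, a multi-layer projection of relation-mention word embeddings on the other, trained jointly by back-propagation against sampled negatives. This PyTorch sketch is an assumption-laden illustration; the module names, layer sizes, and two-layer mention network are placeholders, not the thesis's exact NURE-DSE design.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: names and layer sizes are assumptions,
# not the thesis's exact NURE-DSE architecture.
class NeuralFactorization(nn.Module):
    def __init__(self, n_pairs, word_dim, dim):
        super().__init__()
        self.pair_emb = nn.Embedding(n_pairs, dim)     # entity-pair factor
        self.mention_net = nn.Sequential(              # multi-layer mention factor
            nn.Linear(word_dim, dim), nn.Tanh(), nn.Linear(dim, dim))

    def forward(self, pair_idx, mention_vec):
        # score = inner product of the two factor representations
        return (self.pair_emb(pair_idx) * self.mention_net(mention_vec)).sum(-1)

n_pairs, word_dim, dim = 100, 50, 16
model = NeuralFactorization(n_pairs, word_dim, dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

# Toy batch: averaged word embeddings of relation mentions (random stand-ins)
pair_idx = torch.randint(n_pairs, (32,))   # entity pairs seen with the mentions
neg_idx = torch.randint(n_pairs, (32,))    # negatively sampled entity pairs
mention_vec = torch.randn(32, word_dim)

for step in range(200):
    opt.zero_grad()
    loss = (loss_fn(model(pair_idx, mention_vec), torch.ones(32)) +
            loss_fn(model(neg_idx, mention_vec), torch.zeros(32)))
    loss.backward()                        # gradients via back-propagation
    opt.step()
```

Framing both factors as differentiable modules is what lets the whole model be optimized end-to-end with automatic gradients, which is the flexibility and efficiency benefit the abstract attributes to the combined approach.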
Keywords/Search Tags:unsupervised relation extraction, representation learning, negative sampling, word embedding, matrix factorization