| Natural language processing (NLP) has advanced greatly with artificial intelligence techniques in recent years. Among the many NLP tasks, relation extraction has attracted considerable attention for its wide range of applications, such as knowledge base completion and personalized recommendation. Relation extraction aims to identify relation facts for pairs of entities in raw sentences and to construct relation triples such as [Arthur Lee, place_born, Memphis], which underpin many high-level NLP tasks. Relation extraction has been well studied with lexical, grammatical, and semantic analysis. However, current work still suffers from low accuracy of feature clustering in complex scenarios and from the high cost of human-labeled training data in large-scale relation extraction. The former problem is critical for the accuracy of relation extraction, and the latter is a major challenge for large-scale relation extraction. Extracting relation features precisely from a sentence is vital to relation extraction, and neural models are effective at it. Since neural models have proved to be powerful feature extractors in many applications, such as image recognition and machine translation, a number of neural relation extractors have been proposed and have achieved impressive performance without hand-designed relation features. However, current neural models for relation extraction still have several drawbacks. For example, previous neural relation extractors cannot handle multi-labeled relation extraction well because overlapped relation features in a single sentence are difficult to distinguish. In addition, neural relation extractors are too costly to apply to large-scale relation extraction: their complexity in both time and space is extremely high. To address these drawbacks, this thesis proposes novel neural relation extractors. A new challenge for relation extraction in recent years is extracting
large-scale relations without hand-labeled training data. Manual labeling is too expensive for constructing large-scale datasets. Therefore, recent work on large-scale relation extraction tends to use distant supervision to construct training datasets automatically from knowledge bases. However, this automatic construction is imprecise, and the resulting datasets are full of noise, such as wrongly labeled sentences. Training relation extractors on large-scale automatically constructed datasets is challenging because of these many kinds of noise. In this thesis, we analyze the different kinds of noise in automatically constructed datasets and propose suitable solutions. In summary, this thesis focuses on the above issues and proposes practical solutions. First, we systematically study the task of neural relation extraction in four main aspects: model accuracy, model efficiency, model robustness, and model frontier. We then optimize neural relation extraction models from the perspectives of accuracy and efficiency. Finally, to enhance the noise immunity of relation extractors, we propose a four-level noise architecture for automatically constructed datasets in distantly supervised relation extraction. The four levels are word-level noise, sentence-level noise, prior-knowledge noise, and data-imbalance noise. To alleviate the influence of each kind of noise, we propose four corresponding solutions. In addition, we devise a more robust neural relation extraction method that resists multi-granularity noise by integrating multiple noise-immunity solutions. Specifically, this thesis makes the following contributions to relation extraction models: 1. Model Accuracy. We first focus on extracting overlapped relation features in a single sentence, a task called multi-labeled relation extraction. First, a capsule-network-based relation extractor is introduced for multi-labeled relation extraction because of its strong ability to extract overlapped
features. Then, we integrate the capsule network with a novel attention-based routing algorithm, which enhances relation extraction. The resulting attentive capsule network improves model accuracy. 2. Model Efficiency. The efficiency of neural models is essential for applying relation extraction, especially at large scale. Current neural models, such as convolutional neural networks and recurrent neural networks, are overused and inefficient in large-scale relation extraction. To tackle this problem, we devise an efficient and straightforward neural model, a question-answering-based relation extractor. The new neural network significantly improves model efficiency. 3. Model Robustness. We first model the noise in distantly supervised relation extraction with a four-level architecture. The four levels are word-level noise, sentence-level noise, prior-knowledge noise, and data-imbalance noise. Each kind of noise seriously misleads relation extractors, and together they weaken the robustness of relation extraction models. In this thesis, we devise four novel solutions, one for each noise level, and propose a multi-granularity noise reduction method. 4. Model Frontier. Finally, we explore advanced solutions for relation extraction, including a semi-distantly supervised method powered by generative adversarial nets (GAN) and an active-learning-based evaluation method that can be seen as an unbiased metric for relation extraction. |
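To make the notion of distant supervision and its sentence-level noise concrete, the following is a minimal sketch and not the method proposed in this thesis. It assumes the simplest labeling heuristic: a sentence that mentions both entities of a knowledge-base triple is labeled with that triple's relation. The `Triple` type, the `distant_label` function, and the three sentences are hypothetical and chosen only for illustration; the triple itself is the one quoted in the abstract, and plain substring matching stands in for real entity linking.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A relation fact: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str


def distant_label(sentences, knowledge_base):
    """Pair each sentence with every KB triple whose head and tail both occur in it.

    Substring matching is an illustrative simplification of entity linking.
    """
    labeled = []
    for sentence in sentences:
        for triple in knowledge_base:
            if triple.head in sentence and triple.tail in sentence:
                labeled.append((sentence, triple))
    return labeled


# The triple from the abstract; the sentences are invented examples.
kb = [Triple("Arthur Lee", "place_born", "Memphis")]
sentences = [
    "Arthur Lee was born in Memphis.",
    "Arthur Lee played a concert in Memphis.",  # mentions both entities, wrong relation
    "Memphis is a city in Tennessee.",          # only one entity, not labeled
]

labeled = distant_label(sentences, kb)
for sentence, triple in labeled:
    print(f"{triple.relation}: {sentence}")
```

Under this heuristic the second sentence also receives the label place_born although it does not express that relation. This is the kind of wrongly labeled sentence that automatically constructed datasets contain.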