
Research On ERE Algorithm Adapted To Few Samples

Posted on: 2023-03-21, Degree: Master, Type: Thesis
Country: China, Candidate: Y Peng, Full Text: PDF
GTID: 2558307097985459, Subject: Computer technology
Abstract/Summary:
With the rapid development of network information technology, information resources on the Internet are expanding day by day, from academic literature to social media and from specific domains to the open domain, and data from all of these sources is growing exponentially. Extracting the information we need from this huge volume of unstructured data is the problem that information extraction technology sets out to solve. As a key branch of information extraction, entity relationship extraction (ERE) is of significant research interest. Most solutions to the task use supervised training, but this approach requires large-scale, finely labeled data, which is very labor-intensive to produce. Distant supervision avoids the time and labor cost of manual labeling, but it introduces large amounts of noisy data and generally has difficulty with the long-tail distribution of samples. How to perform entity relationship extraction with only a very small number of samples is therefore an important topic in current research on the task. To address these challenges, this thesis approaches the problem from two directions: data and models.

(1) From the perspective of data augmentation, a few-shot entity relationship extraction model based on distribution calibration is proposed. The model assumes that each dimension of a sample's feature representation follows a Gaussian distribution, and that well-sampled categories tend to have more reliable distribution statistics, i.e., mean and variance. The statistics of well-sampled categories can therefore be borrowed to calibrate those of few-shot categories, so that the skewed data distribution moves closer to the true data distribution. A sufficient number of samples is then drawn from the calibrated distribution to train the classifier, which achieves the purpose of data augmentation.

(2) From the modeling perspective, an inductive network is proposed. The model generalizes at the category level: it reconstructs a hierarchical representation of the support set and uses a dynamic routing algorithm to generalize sample representations into a category representation, instead of simply summing or averaging the samples. The model also treats the metric between support-set samples and query samples as an important part of the few-shot entity relationship extraction task. Replacing a simple cosine or Euclidean distance function with a relation module built from a neural network measures the similarity between query samples and support-set samples more accurately and improves model performance.

This thesis studies the problems faced by few-shot entity relationship extraction and proposes improvements from these different perspectives. In the experiments, the proposed models are compared with baseline models on a dataset widely used for this task, and the results demonstrate the effectiveness of the proposed approach.
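The distribution calibration idea in (1) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function names, the choice of the `k` nearest well-sampled categories, and the extra spread term `alpha` are all assumptions made for illustration. The only parts taken from the abstract are the per-dimension Gaussian assumption, the borrowing of mean and variance from well-sampled categories, and the sampling of extra features from the calibrated distribution.

```python
import numpy as np

def calibrate_statistics(support_feat, base_means, base_vars, k=2, alpha=0.2):
    """Estimate a calibrated mean/variance for a few-shot category.

    support_feat: (d,) feature of one support sample
    base_means, base_vars: (n_base, d) per-dimension statistics of
        well-sampled categories
    k, alpha: hypothetical hyperparameters (neighbours used, extra spread)
    """
    # Distance from the support feature to every well-sampled category mean.
    dists = np.linalg.norm(base_means - support_feat, axis=1)
    nearest = np.argsort(dists)[:k]
    # Calibrated mean: average of the borrowed means and the sample itself.
    mean = (base_means[nearest].sum(axis=0) + support_feat) / (k + 1)
    # Calibrated variance: average borrowed variance plus a constant spread.
    var = base_vars[nearest].mean(axis=0) + alpha
    return mean, var

def sample_features(mean, var, n, rng):
    """Draw n synthetic features, each dimension an independent Gaussian."""
    return rng.normal(loc=mean, scale=np.sqrt(var), size=(n, mean.shape[0]))
```

The synthetic features returned by `sample_features` would be combined with the real support features to train an ordinary classifier, for example logistic regression. How the thesis selects which categories to borrow from is not stated in the abstract, so the nearest-mean rule here is only one plausible choice.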
Keywords/Search Tags:Entity relationship extraction, Few-shot learning, Class feature representation, Distribution calibration, Inductive network
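The inductive network described in the abstract combines dynamic routing over the support set with a learned relation module. The following NumPy sketch shows one way these two pieces could fit together. It is a simplified illustration under stated assumptions: the squash nonlinearity, the number of routing iterations, and the bilinear form of `relation_score` (with `W` standing in for trained parameters) are not given in the abstract and are chosen here only to make the example concrete.

```python
import numpy as np

def squash(v, eps=1e-9):
    """Shrink a vector's length into [0, 1) while keeping its direction."""
    sq_norm = np.sum(v * v)
    return (sq_norm / (1.0 + sq_norm)) * v / np.sqrt(sq_norm + eps)

def induce_class_vector(support, n_iters=3):
    """Aggregate K support vectors (K, d) into one class vector (d,).

    Dynamic routing replaces plain summation or averaging: samples that
    agree with the current class vector receive larger coupling weights.
    """
    logits = np.zeros(support.shape[0])
    for _ in range(n_iters):
        # Softmax over support samples gives the coupling weights.
        weights = np.exp(logits - logits.max())
        weights /= weights.sum()
        # The class vector is the squashed weighted sum of the samples.
        class_vec = squash(weights @ support)
        # Raise the logit of each sample that agrees with the class vector.
        logits = logits + support @ class_vec
    return class_vec

def relation_score(class_vec, query, W):
    """Bilinear relation score in (0, 1); W is a hypothetical (d, d) matrix
    that would be learned, used in place of cosine or Euclidean distance."""
    return 1.0 / (1.0 + np.exp(-(class_vec @ W @ query)))
```

In a full model, the support and query vectors would come from a sentence encoder and `W` would be trained end to end; both are outside the scope of this sketch.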