| In natural language processing, information extraction has long been a fundamental and critical task. As society becomes increasingly information-driven, extracting structured knowledge from large-scale text data, in a form that computers can readily understand and process, is of growing practical importance. Such knowledge can be used not only to construct and update knowledge graphs, but also to provide essential information support for numerous downstream tasks such as search engines, question answering systems, and recommender systems. Automated deep learning approaches have achieved remarkable results in scaling up information extraction and reducing labor costs. However, these models face two unavoidable problems: they rely on large numbers of training samples, and they cannot effectively handle data with long-tailed distributions. These problems limit the models' overall performance and further degrade their performance on categories with sparse samples. This paper therefore focuses on information extraction and adopts a few-shot learning approach to address the central challenge facing current extraction models: how to maintain good performance with only a limited number of training samples. Among mainstream few-shot models, the prototypical network is a representative method, valued for its simplicity and efficiency. This paper analyzes the problems the prototypical network faces on different information extraction tasks and proposes corresponding solutions to improve its performance on few-shot information extraction. The details of the study are as follows:

1. On the few-shot entity recognition task, the prototypical network has difficulty coping with the diversity of character expressions and with the semantic ambiguity of entities introduced by non-predefined categories (i.e., all entity categories other than the predefined ones). In response, this paper proposes a character-aware and sentence-aware few-shot entity recognition model that localizes entity categories more accurately. Character awareness exploits the interconnections between characters to select, from the many characters under each entity class, those with the greatest influence on entity classification as reference samples; sentence awareness exploits the interconnections between sentences and entity classes to eliminate the influence of non-predefined entity classes on entity prototype computation. In addition, character awareness and sentence awareness give the model two distinct bases for judgment in entity recognition, and their combination allows the model to adapt better to changes in the granularity of entity classes and improves its robustness.

2. On the few-shot relation extraction task, the prototypical network has difficulty coping with diverse text representations and suffers from feature sparsity and high variance caused by limited samples. This paper designs three methods that exploit the characteristics of the prototypical network, namely deep ensembling, feature-level attention, and fine-tuning, to help the model classify relations more accurately. In the ensemble scheme, this paper considers model structure, selecting four high-accuracy neural networks as encoders for different submodels and combining each of them with two metric approaches. Compared with the previous ensemble strategy of simply duplicating submodels, incorporating submodel diversity into the ensemble greatly improves the ensemble effect and strengthens both the semantic feature representations and the similarity metrics of the prototypical network. In the feature-level attention scheme, this paper uses the connections between samples to find the features with the greatest influence on relation classification, which enables the model to avoid interference from uninformative features and to locate relation categories more accurately. In addition, to use the limited reference samples more efficiently, the encoder is further fine-tuned based on the similarity metric so that the model adapts better to new relation categories. |
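The prototypical network that underlies both contributions can be sketched briefly. In this minimal NumPy example (the function names and toy data are illustrative, not from the thesis model), each class prototype is the mean of that class's support-set embeddings, and a query is assigned to the class of its nearest prototype under squared Euclidean distance:

```python
# Minimal prototypical-network sketch (illustrative only): prototypes are
# class-wise means of support embeddings; queries go to the nearest prototype.
import numpy as np

def prototypes(support, labels):
    """support: (n, d) embeddings; labels: (n,) class ids.
    Returns the class ids and a (k, d) array of class prototypes."""
    classes = np.unique(labels)
    protos = np.stack([support[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def classify(query, support, labels):
    """Assign each (m, d) query embedding to the class of its nearest prototype."""
    classes, protos = prototypes(support, labels)
    # Squared Euclidean distance between every query and every prototype,
    # computed via broadcasting: result has shape (m, k).
    dists = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[dists.argmin(axis=1)]

# Toy 2-way 2-shot episode with hand-crafted 2-d "embeddings".
support = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
labels = np.array([0, 0, 1, 1])
query = np.array([[0.1, 0.0], [1.0, 0.9]])
print(classify(query, support, labels))  # → [0 1]
```

The thesis's contributions modify this basic scheme: character and sentence awareness reweight which support samples contribute to each prototype, and feature-level attention reweights the dimensions entering the distance computation.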