With the development of the big data era, more and more entities, entity mentions, and increasingly fine-grained entity types appear on the Internet. However, the named entity recognition task can only assign coarse-grained type labels to entity mentions. How to assign fine-grained type labels to different entity mentions has therefore become a hot research issue, and fine-grained entity typing has emerged to meet this need. Given an entity mention and its context, the fine-grained entity typing task assigns one or more type labels to the mention, and these types usually form a hierarchical structure. Fine-grained type labels provide richer semantic information about entity mentions, which lays a good foundation for relation extraction, event extraction, and other tasks, and helps improve the accuracy, efficiency, and credibility of downstream natural language processing tasks such as question answering and knowledge base completion. At present, most fine-grained entity typing methods use distant supervision to extract the type labels related to an entity mention from knowledge bases containing rich information, and they assign all of those labels to the mention, which may introduce noisy labels. To reduce the negative impact of noisy labels on classification results and to improve the accuracy of fine-grained entity typing, this thesis studies the problem of noisy labels in the fine-grained entity typing task. The main research contents are as follows:

(1) From the perspective of model enhancement, a memory network model is used to jointly learn the entity mention context and the type labels, so as to establish the correlation between them and to fully extract the important information of both, providing indicative cues for assigning type labels consistent with the context semantics. To alleviate the negative influence of noisy labels, a variant hinge loss function and a variant hierarchical loss function are adopted: by adjusting their parameter weights, the negative impact of noisy labels on the overall performance of the typing model is effectively reduced. At the same time, a class-balanced loss function is introduced to dynamically adjust the weights of the loss values of different labels and further optimize the loss function. Finally, an L2 regularization term is used to keep the fine-grained entity typing model from overfitting the noisy labels, thereby improving its overall performance.
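The abstract names variant hinge, hierarchical, and class-balanced losses without giving their exact forms. The snippet below is a minimal, purely illustrative PyTorch sketch of how a hinge-style multi-label typing loss with class-balanced per-type weights might look; the hierarchical term is omitted, and the function name, the effective-number-of-samples weighting, and the margin value are assumptions rather than the thesis's actual formulation.

    import torch
    import torch.nn.functional as F

    def class_balanced_hinge_loss(scores, targets, label_counts, beta=0.999, margin=1.0):
        """Hinge-style multi-label typing loss with class-balanced per-type weights.

        scores:       (batch, num_types) raw per-type scores from the typing model
        targets:      (batch, num_types) multi-hot (0/1) distant-supervision labels
        label_counts: (num_types,) frequency of each type in the training data
        """
        # Class-balanced weights via the effective number of samples
        # (an illustrative choice): rarer types receive larger weights.
        counts = label_counts.float().clamp(min=1.0)
        effective_num = 1.0 - torch.pow(beta, counts)
        weights = (1.0 - beta) / effective_num
        weights = weights / weights.sum() * counts.numel()

        # Hinge loss: positive types should score above +margin,
        # negative types below -margin.
        signs = targets.float() * 2.0 - 1.0              # {0,1} -> {-1,+1}
        per_label = F.relu(margin - signs * scores)      # (batch, num_types)

        return (per_label * weights).mean()

Here the per-type weights play the role of the dynamic per-label adjustment described above, while the margin controls how strictly violations are penalized; both would be tuned differently in the actual model.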
(2) From the perspective of data enhancement, the dataset is divided into a clean set and a noisy set according to the number of type labels associated with each entity mention, and different loss functions are constructed for the two training sets to train the fine-grained entity typing model. Since there are relational features between entity mentions and type labels, as well as similarity and hierarchical features among the type labels themselves, a feature generator is used to extract effective mention-label and label-label features; these complement the association features between the entity mention context and the type labels, and together they form complete features for assigning more semantically informative type labels to entity mentions. Finally, adversarial training is introduced at the embedding layer of the context-processing module as a regularization method, which not only keeps the fine-grained entity typing model from overfitting the noisy labels but also improves its robustness when dealing with noisy labels.
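The abstract does not detail the adversarial training procedure. The sketch below shows one common realization, an FGM-style perturbation of the embedding layer, assuming a PyTorch model exposing an `embedding` attribute and a multi-label loss `loss_fn`; the attribute name, the single perturbation radius `epsilon`, and the training-loop structure are assumptions for illustration, not the thesis's actual implementation.

    import torch

    def fgm_adversarial_step(model, loss_fn, inputs, labels, epsilon=1.0):
        """One training step with an FGM-style adversarial perturbation applied
        to the embedding layer, used here purely as a regularizer."""
        embed = model.embedding.weight     # assumed name of the embedding parameter

        # 1. Clean forward/backward pass to obtain gradients w.r.t. the embeddings.
        clean_loss = loss_fn(model(inputs), labels)
        clean_loss.backward()

        # 2. Perturb the embeddings along the normalized gradient direction.
        grad = embed.grad.detach()
        norm = grad.norm()
        if norm > 0:
            delta = epsilon * grad / norm
            embed.data.add_(delta)

            # 3. Forward/backward on the perturbed embeddings; this adversarial
            #    gradient accumulates on top of the clean-pass gradient.
            adv_loss = loss_fn(model(inputs), labels)
            adv_loss.backward()

            # 4. Restore the original embeddings before the optimizer step.
            embed.data.sub_(delta)

        return clean_loss.item()

The surrounding training loop is assumed to call optimizer.step() and optimizer.zero_grad() after this function returns, so that both the clean and the adversarial gradients contribute to each update.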