
Research And Implementation Of Chinese Address Extraction Based On Distillation Neural Network

Posted on: 2024-05-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Wang
GTID: 2568306944459964
Subject: Computer Science and Technology
Abstract/Summary:
Fine-grained address entity recognition from spoken dialogue is an important but challenging task. Multiple types of address entities are distributed across the multi-turn dialogue context, and extracting addresses from dialogues is one of the most important basic tasks for location-based services such as data analysis and recommendation. Existing work typically formulates this problem as a fine-grained named entity recognition task, which in our scenario suffers from the high cost of annotating training data. Fortunately, large numbers of full standard addresses can easily be crawled from web pages such as Google Maps and annotated with fine-grained address tags with limited human effort. By processing the crawled data through an address segmentation task, the shortage of labeled samples can be effectively addressed and the model can be trained better. To further improve the performance of the model, this paper proposes a knowledge distillation method. Through transfer learning, the model can learn more extensive knowledge, including address information. On this basis, this paper further designs a dynamic visual map system for displaying the addresses produced by the address extraction task.

The main work of this paper is as follows:

1. To address the lack of a labeled dialogue dataset containing address entities, this paper proposes a data augmentation paradigm for multi-turn dialogues with labeled address entities. Using this paradigm, this paper constructs a labeled spoken dialogue dataset, which effectively solves the problem of scarce labeled samples and is used to train the proposed model.

2. This paper proposes a knowledge distillation approach, based on the idea of transfer learning, for constructing the neural network model, in order to further improve both the results on the fine-grained address entity recognition task and the generalization ability of the model.

3. Based on the above methods, this paper applies the proposed algorithm to build an address extraction visualization system. The system extracts address entities in dialogue scenarios and visually displays the full standard address on an interactive dynamic map. The system was tested, and the test results show that it can effectively extract the addresses in dialogues and realize the visualization function.
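The abstract does not spell out the distillation objective, so the following is only a minimal sketch of the generic, widely used soft-target formulation of knowledge distillation, applied to a single token in a tagging task. It is not the thesis's actual loss. The function names, the three-tag example (PROVINCE, CITY, ROAD), and the `temperature` and `alpha` values are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw scores."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, gold_tag,
                      temperature=2.0, alpha=0.5):
    """Generic per-token distillation loss (illustrative, not the thesis's).

    Combines:
      - a soft term: KL(teacher || student) on temperature-softened
        distributions, scaled by temperature**2;
      - a hard term: cross-entropy of the student against the gold tag.
    `alpha` weights the soft term; `1 - alpha` weights the hard term.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0.0)
    cross_entropy = -math.log(softmax(student_logits)[gold_tag])
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * cross_entropy


# Hypothetical tag set for one token: index 0 = PROVINCE, 1 = CITY, 2 = ROAD.
teacher = [2.0, 0.0, 0.0]          # teacher favours PROVINCE
agreeing_student = [2.0, 0.0, 0.0]
disagreeing_student = [0.0, 2.0, 0.0]

loss_agree = distillation_loss(agreeing_student, teacher, gold_tag=0)
loss_disagree = distillation_loss(disagreeing_student, teacher, gold_tag=0)
```

In this sketch a student that matches the teacher incurs no soft-term penalty, so `loss_agree` reduces to the weighted cross-entropy alone, while `loss_disagree` is larger because both terms penalize the wrong tag. In a sequence tagger the same per-token loss would typically be averaged over all tokens of an utterance.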
Keywords/Search Tags: address extraction, named entity recognition, knowledge distillation