Research On Information Extraction Technology In Music Domain

Posted on: 2020-05-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Hou
Full Text: PDF
GTID: 2428330590960697
Subject: Software engineering
Abstract/Summary:
In recent years, the amount of music available online has grown steadily, and the volume of associated music information has grown rapidly with it. People increasingly need to obtain music information quickly and accurately, which makes information extraction for the music domain a worthwhile research topic. This thesis studies named entity recognition and relation extraction from music information described in natural-language text. Information extraction in the music domain is the first step in automatically constructing a music knowledge graph, which can be widely used in music-domain applications such as information retrieval, recommendation systems, question answering systems and dialogue systems.

This thesis first defines the music entity types and relation categories to be extracted and constructs an annotated corpus for the music domain. On this basis, it studies techniques for music entity recognition and relation extraction in depth.

For the named entity recognition task in the music domain, this thesis first designs and implements a character-based BLSTM-CRF model as the baseline, and then uses three kinds of pre-trained character vectors to improve the model's embedding layer:

1. Static character vectors trained with Word2vec on an unlabeled music-domain corpus.
2. Dynamic character vectors from the official pre-trained BERT Chinese language model provided by Google.
3. Dynamic character vectors from a BLSTM language model trained on an unlabeled music-domain corpus, using the implementation provided by Flair.

Experiments show that the three improved models raise the macro-averaged F value by 5.1%, 9.57% and 9.89% respectively, which demonstrates the effectiveness of the proposed improvements.

For the relation extraction task in the music domain, this thesis uses a BLSTM-Attention model as the baseline. In the input layer, the baseline model considers only the information of each word in the sequence and the position information of the entities, ignoring the influence of the entity category on the relation category. This thesis therefore proposes an entity position indicator feature that carries the entity category to improve the input layer. In the attention layer, the baseline attention mechanism considers only the importance of each word in the sequence, ignoring the correlation between each word and the two entities. This thesis therefore proposes using a multi-head attention mechanism to compute the correlation between each word and the target entities in different feature spaces, so that the model can increase the weights of the words in the sentence that are strongly related to the entities. Experiments show that the proposed method achieves better classification performance than the baseline model.
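
The idea of an entity position indicator that also carries the entity category can be illustrated with a small sketch. The abstract does not specify how the indicators are encoded, so everything below is an assumption made for illustration: the marker format (`<e1:CATEGORY>` ... `</e1:CATEGORY>`), the function name `add_typed_indicators`, the category labels and the example sentence are all hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple

# (start index, end index exclusive, entity category); a hypothetical span format.
Span = Tuple[int, int, str]


def add_typed_indicators(tokens: List[str], head: Span, tail: Span) -> List[str]:
    """Wrap two non-overlapping entity spans with marker tokens.

    Each marker records both the entity's role (e1 or e2) and its category,
    so a downstream model sees the entity type next to the entity position.
    """
    opens = {}
    closes = {}
    for role, (start, end, category) in (("e1", head), ("e2", tail)):
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"invalid span for {role}: {(start, end)}")
        opens[start] = f"<{role}:{category}>"
        closes[end] = f"</{role}:{category}>"

    marked = []
    for i, token in enumerate(tokens):
        if i in closes:  # close a span that ended just before this token
            marked.append(closes[i])
        if i in opens:  # open a span that starts at this token
            marked.append(opens[i])
        marked.append(token)
    if len(tokens) in closes:  # a span that runs to the end of the sentence
        marked.append(closes[len(tokens)])
    return marked


# Hypothetical example sentence and category labels.
tokens = ["Jay", "Chou", "released", "the", "album", "Fantasy"]
marked = add_typed_indicators(tokens, head=(0, 2, "PERSON"), tail=(5, 6, "ALBUM"))
print(" ".join(marked))
```

With plain position indicators the markers would only say where each entity is. Adding the category lets the input layer distinguish, for example, a person-album pair from a person-song pair before any relation is predicted, which is the motivation the abstract gives for the feature.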
Keywords/Search Tags: Information Extraction, Entity Recognition, Relation Extraction, RNN, LSTM