Paraphrase processing is an important means of handling synonymy in natural language processing. Its performance depends on the construction of paraphrase knowledge, among which paraphrasing templates have been receiving increasing attention. Paraphrasing templates are also indispensable in tasks such as information retrieval and automatic question answering. The aim of paraphrasing template acquisition is to mine templates that stand in a paraphrasing relationship from large-scale raw text. Open-domain paraphrasing template acquisition faces two problems. One is that raw text offers no semantic equivalence boundaries of the kind that parallel paraphrase sentence pairs provide; the other is the lack of high-quality semantic representations of templates. To address these difficulties, we investigate a knowledge-base-based method for open-domain paraphrasing template acquisition and a deep-neural-network-based method for learning template semantic representations. Our work is summarized as follows. (1) We propose an open-domain paraphrasing template acquisition method that incorporates external features from a knowledge base. Existing methods require manual definition of the semantic relations among entities and also suffer from semantic bias. Considering that a knowledge base contains large-scale relational triples, we propose an acquisition method comprising relational instance acquisition, a template generalization algorithm, and an automatic clustering algorithm. The experimental results show that the method yields considerably more refined paraphrasing templates, is effective, and can be extended to any semantic relation in the knowledge base. (2) We propose a template representation method based on deep neural networks. We conduct an in-depth study of template semantic representation, which is the key point of work (1), covering word representation based on contrastive learning, variable slot representation using a pre-trained language model, and a representation fusion strategy for templates. The experimental results on a public semantic evaluation set show a large improvement in the Spearman correlation coefficient. Meanwhile, paraphrasing template acquisition also improves on the evaluation metrics BLEU, ROUGE-L, and BERTScore.
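To make the acquisition pipeline in (1) more concrete, the following is a minimal sketch of relational instance acquisition and template generalization. It is not the method evaluated in this work: the triples, the sentences, the slot markers `[X]` and `[Y]`, and the function names `generalize` and `acquire_templates` are all hypothetical, and the automatic clustering step is replaced here by simply grouping templates under the knowledge-base relation of the triple that produced them.

```python
from collections import defaultdict

# Hypothetical toy knowledge base: (subject, relation, object) triples.
TRIPLES = [
    ("Marie Curie", "birthplace", "Warsaw"),
    ("Alan Turing", "birthplace", "London"),
]

# Hypothetical raw sentences standing in for a large-scale corpus.
SENTENCES = [
    "Marie Curie was born in Warsaw .",
    "Alan Turing was born in London .",
    "Warsaw is the hometown of Marie Curie .",
    "Alan Turing visited Paris .",
]


def generalize(sentence, subj, obj):
    """Replace both entity mentions with variable slots.

    Returns None when the sentence does not mention both entities,
    i.e. when it is not a relational instance of the triple.
    """
    if subj not in sentence or obj not in sentence:
        return None
    return sentence.replace(subj, "[X]").replace(obj, "[Y]")


def acquire_templates(triples, sentences):
    """Collect generalized templates, grouped by knowledge-base relation.

    Grouping by relation is a simplification (an assumption of this sketch)
    standing in for the automatic clustering step.
    """
    templates = defaultdict(set)
    for subj, relation, obj in triples:
        for sentence in sentences:
            template = generalize(sentence, subj, obj)
            if template is not None:
                templates[relation].add(template)
    return dict(templates)


if __name__ == "__main__":
    for relation, group in acquire_templates(TRIPLES, SENTENCES).items():
        print(relation, sorted(group))
```

In this toy setting, templates that end up under the same relation are treated as paraphrase candidates. A fuller system would additionally need entity linking rather than exact string matching, and a semantic representation of templates, as studied in (2), to separate finer-grained senses within one relation.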