
Research on the Construction and Analysis of Common Sense Corpora for Natural Language Generation

Posted on: 2022-12-15  Degree: Master  Type: Thesis
Country: China  Candidate: J He  Full Text: PDF
GTID: 2558307154974729  Subject: Engineering
Abstract/Summary:
Natural language generation is an important component of artificial intelligence. Existing generation tasks include text summarization, machine translation, and dialogue generation. These tasks mainly study generation in different domains and at different levels, such as sentence-level and document-level generation, while fundamental common sense errors in machine-generated language are usually ignored. A lack of common sense, however, can greatly affect how well the text generated by a system is understood. Although common sense reasoning has attracted the attention of many researchers, most existing research focuses on natural language understanding, and little work addresses how to evaluate common sense reasoning in generation systems. The purpose of this thesis is to investigate the weaknesses of existing generation systems from the perspective of common sense reasoning by constructing targeted corpora, thereby contributing to the development of related work in this area. Specifically, this thesis investigates the following three aspects:

(1) A translation evaluation dataset on common sense problems is proposed for Chinese-English translation. Existing machine translation systems are usually trained and evaluated on large-scale corpora, but this approach does not clearly reveal which specific problems remain in machine translation. Moreover, since there is no previous research on common sense reasoning in neural machine translation, this study also helps promote research on common sense reasoning in machine translation.

(2) For Chinese, a large-scale dataset for probing common sense in text generation models is proposed. A common sense inquiry into sentence generation by Chinese text generation models enables us to understand the unique aspects of the Chinese language and to advance common sense research in Chinese. It also extends the range of languages covered by common sense reasoning tasks and promotes diversity in this direction.

(3) A new meta-learning sampling algorithm is proposed. It helps the model sample more of the tasks that are most helpful for the target task, making multi-task training more effective at assisting the target task. The meta-learning algorithm is also applied in cross-lingual settings, where it helps improve the generalization of the model on cross-lingual problems.
Keywords/Search Tags: commonsense, machine translation, pre-trained language models, text error correction, natural language processing