Research On Translation Quality Estimation Technology Combining Multiple Features

Posted on: 2022-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: T Y Li
Full Text: PDF
GTID: 2518306329983759
Subject: Automation Technology
Abstract/Summary:
In recent years, deep learning has made significant breakthroughs, and the performance of machine translation has improved rapidly. Evaluating translation quality is an important research topic in machine translation. Translation quality estimation refers to methods that evaluate machine translation output without a reference translation, which is of great value to computer-aided translation. Currently, the state-of-the-art approach in this domain is neural translation quality estimation, which builds on deep learning. Compared with traditional translation quality estimation methods, it learns bilingual features better. However, neural translation quality estimation models still have two main problems. First, the corpus used to train the model must be manually annotated, which is costly, so corpora are small; the model is therefore limited by the corpus and extracts insufficient bilingual information. Second, neural networks are good at learning features on their own, but they treat bilingual sentences as word sequences and cannot effectively capture the deep lexical and syntactic information within sentences from a linguistic perspective.

This thesis proposes a neural translation quality estimation method that combines multiple features. By extracting different features at multiple levels, we aim to alleviate the problems above. To address the first problem, we extract word prediction features and pre-trained language model features, which introduce prior knowledge into the translation quality estimation model and overcome over-fitting in downstream models. To address the second problem, we extract part-of-speech features and syntactic features from a linguistic perspective and add them to the neural network as external knowledge, enriching the lexical and syntactic information contained in the quality vector. We evaluate and filter the features at each level through experiments. The features are then merged through different network models, and the effects of different feature combinations are analyzed from multiple angles. Finally, an ensemble learning algorithm integrates multiple effective sub-models to obtain the model with the best generalization performance.
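
The combine-then-ensemble idea described above can be illustrated with a minimal sketch. The abstract does not name the fusion networks or the ensemble algorithm, so everything here is an assumption made for illustration: the feature groups are toy numbers, `linear_submodel` is a hypothetical stand-in for a trained sub-model, and weighted averaging is only one possible ensemble rule, not necessarily the one used in the thesis.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]
SubModel = Callable[[Sequence[float]], float]


def concat_features(*feature_groups: Sequence[float]) -> Vector:
    """Join several per-sentence feature groups into one quality vector."""
    merged: Vector = []
    for group in feature_groups:
        merged.extend(group)
    return merged


def linear_submodel(weights: Sequence[float], bias: float) -> SubModel:
    """Hypothetical stand-in for a trained sub-model: a linear scorer."""
    def predict(features: Sequence[float]) -> float:
        if len(features) != len(weights):
            raise ValueError("feature vector and weights differ in length")
        return bias + sum(w * x for w, x in zip(weights, features))
    return predict


def ensemble_predict(
    submodels: Sequence[SubModel],
    features: Sequence[float],
    model_weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted average of sub-model scores (uniform if no weights given)."""
    if not submodels:
        raise ValueError("at least one sub-model is required")
    scores = [model(features) for model in submodels]
    if model_weights is None:
        model_weights = [1.0] * len(scores)
    if len(model_weights) != len(scores):
        raise ValueError("one weight per sub-model is required")
    total = sum(model_weights)
    return sum(w * s for w, s in zip(model_weights, scores)) / total


# Toy feature groups (assumed values): word prediction, language model,
# part-of-speech, and syntactic features for a single sentence pair.
quality_vector = concat_features([0.2, 0.4], [1.0], [0.5])
model_a = linear_submodel([1.0, 0.0, 0.0, 0.0], bias=0.0)
model_b = linear_submodel([0.0, 0.0, 0.0, 1.0], bias=0.1)
score = ensemble_predict([model_a, model_b], quality_vector)
```

In practice the sub-models would be neural networks trained on different feature combinations, and the ensemble weights would be chosen on held-out data rather than fixed by hand.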
Keywords/Search Tags: Translation Quality Estimation, Multi-Feature Combination, Linguistic Knowledge, Part-of-Speech Feature, Syntactic Features