Deep Learning Based Chinese Pinyin Input Method

Posted on:2020-11-11

Degree:Master

Type:Thesis

Country:China

Candidate:Y F Huang

Full Text:PDF

GTID:2518306185999929

Subject:Computer Science and Technology

Abstract/Summary:

A Chinese pinyin input method engine (IME) converts pinyin into characters so that Chinese characters can be conveniently entered into a computer through a common keyboard. An IME relies on its core component, pinyin-to-character (P2C) conversion. In recent years, deep learning has been widely used in various natural language processing tasks. However, there has been almost no research on applying neural networks to the development of IMEs. This thesis analyzes the feasibility of applying deep learning to pinyin input methods and proposes four methods to improve the user experience of IMEs: neural P2C conversion, online vocabulary updating, a pre-training model, and an aided IME. The methods are introduced in detail as follows.

A neural P2C conversion model based on the sequence-to-sequence (Seq2Seq) framework is used for P2C, the core component of pinyin-based IMEs. If the pinyin sequence is regarded as a language, P2C can be naturally formulated as a machine translation task. Experiments show that this method improves the quality of P2C conversion compared with the traditional method.

Multi-granularity word embedding enhancement methods are used to augment the representation learning of P2C. We propose character-enhanced and subword-enhanced embeddings for the core task in IMEs. In addition, we propose a gated-attention mechanism. The proposed neural P2C model is learned by encoding the previous input utterance as extra context, which enables the IME to predict character sequences from incomplete pinyin input. The model is evaluated on different benchmark datasets and shows a great improvement in user experience compared with traditional models.

An adaptive dictionary update algorithm with a target vocabulary sampling mechanism, together with an online learning training method, is used to realize open vocabulary learning in the neural IME. Our experiments show that the proposed approach indeed helps the IME effectively follow user inputting behavior.

We present Moon IME, a pinyin IME that contains a high-quality P2C module and an extended information-retrieval-based module. The former is based on an attention-based neural machine translation (NMT) model, and the latter contains follow-up prediction and machine translation modules for typing assistance. With a powerful customizable design, the association cloud platform can be adapted to any specific domain, including domains with complex specialized terms. Usability analysis shows that the core engine achieves conversion quality comparable with state-of-the-art research models, and that the association function is stable and can be well adopted by a broad range of users. It is more convenient for predicting complete, extra, and even corrected character outputs, especially when user input is incomplete or incorrect. The released IME is implemented on Windows via the Text Services Framework.
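
To make the P2C task concrete, the following is a minimal sketch of pinyin-to-character conversion as a sequence decoding problem. It is not the neural Seq2Seq model described in the abstract. It is a toy Viterbi decoder over a hand-made lexicon with bigram scores, closer in spirit to a traditional baseline. The lexicon, the scores, and the names `LEXICON`, `UNIGRAM`, `BIGRAM`, `BACKOFF`, and `p2c` are all hypothetical and chosen only for illustration.

```python
# Toy P2C decoder: picks the best character sequence for a pinyin
# syllable sequence using Viterbi search with bigram scores.
# All data below is made up for illustration (assumption, not from the thesis).

# Hypothetical lexicon: pinyin syllable -> candidate characters.
LEXICON = {
    "bei": ["北", "被"],
    "jing": ["京", "经"],
}

# Hypothetical log-scores for single characters and character pairs.
UNIGRAM = {"北": -1.0, "被": -1.2, "京": -1.5, "经": -1.0}
BIGRAM = {("北", "京"): -0.2, ("被", "经"): -4.0}
BACKOFF = -5.0  # score used for any unseen character or pair


def p2c(syllables):
    """Return the highest-scoring character string for the syllables."""
    if not syllables:
        return ""

    def candidates(syllable):
        # Unknown syllables are passed through unchanged.
        return LEXICON.get(syllable, [syllable])

    # best[char] = (score of best path ending in char, that path)
    best = {
        c: (UNIGRAM.get(c, BACKOFF), [c]) for c in candidates(syllables[0])
    }
    for syllable in syllables[1:]:
        new_best = {}
        for c in candidates(syllable):
            # Extend every previous path by c and keep the best one.
            score, path = max(
                (
                    (s + BIGRAM.get((prev, c), BACKOFF), p)
                    for prev, (s, p) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[c] = (score, path + [c])
        best = new_best

    _, path = max(best.values(), key=lambda item: item[0])
    return "".join(path)


print(p2c(["bei", "jing"]))
```

A neural P2C model would replace the hand-written scores with probabilities produced by an encoder-decoder network, but the input and output of the task (a pinyin syllable sequence in, a character sequence out) stay the same.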

Keywords/Search Tags:

Pinyin input method, Deep learning, Aided input

Related items

1	Research And Design Of Pinyin Input Method For Chinese Teaching In Primary And Secondary Schools
2	Design And Implementation Of Intelligent Pinyin Input Method Based On Android Platform
3	Design And Development Of Pinyin Input Method Of Feature Phone
4	Research Of Pinyin Input Method For Non-Chinese Native Chinese Learners
5	Design And Implementation Of Speed Intelligent Pinyin Input Method
6	Research And Implementation Of Pinyin Input Method Based On Language Model
7	The Continuous Chinese Pinyin Input System Based On Slide Track
8	A Pinyin Input Method Editor With English-Chinese Aided Translation Function
9	Agent-based Negotiation In Intelligent Pinyin Input Method
10	Design And Implementation Of Pinyin Input Method Client Based On Text Services Framework