Chinese Word Embeddings Based On Neural Network Approaches

Posted on: 2018-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: S K Liu
GTID: 2348330536960874
Subject: Software engineering
Abstract/Summary:
Natural language processing is an important direction in computer science and artificial intelligence, and word semantic representation is one of its basic tasks. The traditional one-hot representation, which expresses each word as a long 0/1 vector, cannot capture any semantic information. With the development of deep learning and representation learning, distributed word representation based on neural networks has attracted more and more attention. Distributed word representations, also known as word embeddings, are high-dimensional, real-valued vectors that can capture semantics and cope with ambiguity. Previous research on learning Chinese word embeddings often directly applies methods designed for English and ignores the particularities of Chinese. In Chinese, a word is usually composed of several characters, and the meaning of a word is related both to its component characters and to its contexts. Previous studies have shown that modeling the characters benefits the learning of word embeddings; however, they ignore the external context characters. In this paper, we propose a novel Chinese word embedding model that considers both internal characters and external context characters. In this way, otherwise isolated characters become more closely related to one another, and the character embeddings contain more semantic information. As a result, the effectiveness of the Chinese word embeddings is improved. Experiments show that our model outperforms other word embedding methods on word relatedness computation, analogical reasoning, and text classification tasks, and that it is empirically robust to the proportion of character modeling and to corpus size.
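To make the contrast in the abstract concrete, the following minimal Python sketch compares one-hot vectors with dense vectors that blend a word vector with the mean of its internal character vectors. It is an illustration under stated assumptions, not the model proposed in the thesis: the function names, the `char_weight` mixing parameter, and all vector values are hypothetical and hand-picked, nothing is learned from a corpus, and the external context characters mentioned in the abstract are not modeled here.

```python
import math


def one_hot(index, vocab_size):
    """Return a 0/1 vector with a single 1 at the word's vocabulary index."""
    return [1.0 if i == index else 0.0 for i in range(vocab_size)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def compose(word_vec, char_vecs, char_weight=0.5):
    """Blend a word vector with the mean of its character vectors.

    char_weight is a hypothetical mixing proportion (an assumption made for
    this sketch), standing in for the "proportion of character modeling".
    """
    dim = len(word_vec)
    mean_chars = [sum(c[i] for c in char_vecs) / len(char_vecs) for i in range(dim)]
    return [(1 - char_weight) * w + char_weight * m
            for w, m in zip(word_vec, mean_chars)]


# Toy two-dimensional vectors, hand-picked for illustration only.
WORD_VECS = {"智能": [0.9, 0.1], "智慧": [0.8, 0.3], "苹果": [-0.2, 0.9]}
CHAR_VECS = {
    "智": [1.0, 0.0], "能": [0.7, 0.2], "慧": [0.8, 0.4],
    "苹": [-0.3, 1.0], "果": [0.0, 0.8],
}


def embed(word):
    """Character-enhanced vector for a word, using its internal characters."""
    return compose(WORD_VECS[word], [CHAR_VECS[ch] for ch in word])


# Distinct one-hot vectors are always orthogonal, so they carry no similarity signal.
one_hot_sim = cosine(one_hot(0, 3), one_hot(1, 3))

# Dense vectors can place words that share a character closer together.
related = cosine(embed("智能"), embed("智慧"))
unrelated = cosine(embed("智能"), embed("苹果"))

print(one_hot_sim, related, unrelated)
```

With these toy values, the two words that share the character 智 end up more similar to each other than to 苹果, while their one-hot vectors have similarity zero regardless of meaning.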
Keywords/Search Tags:Natural Language Processing, Word Embedding, Neural Network, Representation Learning