
A Study On Improving Multi-prototype Word Embedding

Posted on: 2017-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: B Tang
Full Text: PDF
GTID: 2348330518494679
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of neural network algorithms and parallel computing technology, the representation of text has regained attention. As a basic problem in Natural Language Processing (NLP), modeling abstract human language is an unavoidable difficulty, and the exponential growth of Internet data in recent years has made it even more prominent. Neural-network-based probabilistic language models aim to learn word representations that address this problem. Such models not only exploit the information in large corpora, but also, through optimization methods that reduce time complexity, allow word representations preserving semantic and syntactic regularities to be learned efficiently. The learned word representations lay a solid foundation for other NLP tasks: they are widely used in information retrieval, sentiment analysis, machine translation, and similar applications, and have achieved remarkable performance there. Nevertheless, word representation models still leave considerable room for improvement. On this basis, this thesis carries out the following work.

First, this thesis investigates representation learning approaches and optimization strategies for lowering time complexity and improving performance, and proposes a multi-feature combined word representation learning approach that incorporates POS-tag information, position weighting factors, and paragraph vectors. The model is validated through word analogy experiments and achieves better overall accuracy than the original model.

Second, we find that conventional models lack the ability to distinguish antonyms. We study the cause of this phenomenon and validate our method on a synonym/antonym test set that we constructed.

Third, this thesis proposes and implements an online-learning multi-prototype word embedding model based on Skip-gram, which represents the different senses of a word separately. It is further improved by the multi-feature combination strategy and achieves results comparable to the state of the art.
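To make the multi-prototype idea concrete, the following is a minimal sketch of how an online extension of Skip-gram might assign each occurrence of a word to one of several sense vectors via an EM-style hard assignment over context centroids. All names, thresholds, and hyperparameters here are illustrative assumptions, not the thesis's actual implementation.

```python
# Illustrative sketch only: online sense assignment for multi-prototype
# word embeddings, in the spirit of the EM-style Skip-gram extension
# described in the abstract. DIM, MAX_SENSES, and NEW_SENSE_SIM are
# assumed hyperparameters, not values taken from the thesis.
import numpy as np

DIM = 50             # embedding dimension (assumed)
MAX_SENSES = 3       # cap on prototypes per word (assumed)
NEW_SENSE_SIM = 0.2  # spawn a new sense below this cosine similarity (assumed)

rng = np.random.default_rng(0)

# Per-word list of [sense_vector, context_centroid, count] triples.
senses = {}

def context_vector(context_words, global_vecs):
    """Average the single-prototype vectors of the context words."""
    vecs = [global_vecs[w] for w in context_words if w in global_vecs]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

def assign_sense(word, ctx_vec):
    """E-step analogue: pick the sense whose context centroid is closest
    to the current context; spawn a new sense if none is close enough."""
    slots = senses.setdefault(word, [])
    best, best_sim = None, -1.0
    for i, (_, centroid, _) in enumerate(slots):
        denom = np.linalg.norm(ctx_vec) * np.linalg.norm(centroid) + 1e-9
        sim = float(ctx_vec @ centroid) / denom
        if sim > best_sim:
            best, best_sim = i, sim
    if best is None or (best_sim < NEW_SENSE_SIM and len(slots) < MAX_SENSES):
        slots.append([rng.normal(0.0, 0.1, DIM), ctx_vec.copy(), 1])
        return len(slots) - 1
    # M-step analogue: move the chosen centroid toward this context.
    vec, centroid, n = slots[best]
    slots[best] = [vec, (centroid * n + ctx_vec) / (n + 1), n + 1]
    return best
```

In a full model, the sense vector selected by assign_sense would then be updated with the ordinary Skip-gram negative-sampling objective against the surrounding context words, so that each prototype specializes to one sense of the word.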
Keywords/Search Tags:representation learning, multi-feature combination, expectation maximization, multi-prototype word embedding