
A Study On Improving Multi-prototype Word Embedding

Posted on: 2017-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: B Tang
Full Text: PDF
GTID: 2348330518494679
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of neural network algorithms and parallel computing technology, the representation of text has regained attention. As a basic problem in Natural Language Processing (NLP), modeling abstract human language is an unavoidable difficulty, and the exponential growth of Internet data in recent years has made it even more prominent. Neural-network-based probabilistic language models aim to learn word representations that address this problem. Such models not only exploit the information in large corpora, but also, through optimization methods that reduce time complexity, allow word representations preserving semantic and syntactic regularities to be learned efficiently. The learned word representations lay a solid foundation for other NLP tasks: they are widely used in information retrieval, sentiment analysis, machine translation, and similar applications, and have achieved remarkable performance there. Nevertheless, word representation models still leave considerable room for improvement. On this basis, this thesis carries out the following work.

First, this thesis investigates representation learning approaches and optimization strategies for lowering time complexity and improving performance, and proposes a multi-feature combined word representation learning approach that incorporates POS-tag information, position weighting factors, and paragraph vectors. The model is validated through word analogy experiments and achieves better overall accuracy than the original model.

Second, we find that conventional models lack the ability to distinguish antonyms. We study the cause of this phenomenon and validate our method on a synonym/antonym test set that we constructed.

Third, this thesis proposes and implements an online-learning multi-prototype word embedding model based on Skip-gram, which represents the different senses of a word separately. It is further improved by the multi-feature combination strategy and achieves results comparable to the state of the art.
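To make the multi-prototype idea concrete, the following is a minimal sketch of how an online extension of Skip-gram might assign each occurrence of a word to one of several sense vectors via an EM-style hard assignment over context centroids. All names, thresholds, and hyperparameters here are illustrative assumptions, not the thesis's actual implementation.

```python
# Illustrative sketch only: online sense assignment for multi-prototype
# word embeddings, in the spirit of the EM-style Skip-gram extension
# described in the abstract. DIM, MAX_SENSES, and NEW_SENSE_SIM are
# assumed hyperparameters, not values taken from the thesis.
import numpy as np

DIM = 50             # embedding dimension (assumed)
MAX_SENSES = 3       # cap on prototypes per word (assumed)
NEW_SENSE_SIM = 0.2  # spawn a new sense below this cosine similarity (assumed)

rng = np.random.default_rng(0)

# Per-word list of [sense_vector, context_centroid, count] triples.
senses = {}

def context_vector(context_words, global_vecs):
    """Average the single-prototype vectors of the context words."""
    vecs = [global_vecs[w] for w in context_words if w in global_vecs]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

def assign_sense(word, ctx_vec):
    """E-step analogue: pick the sense whose context centroid is closest
    to the current context; spawn a new sense if none is close enough."""
    slots = senses.setdefault(word, [])
    best, best_sim = None, -1.0
    for i, (_, centroid, _) in enumerate(slots):
        denom = np.linalg.norm(ctx_vec) * np.linalg.norm(centroid) + 1e-9
        sim = float(ctx_vec @ centroid) / denom
        if sim > best_sim:
            best, best_sim = i, sim
    if best is None or (best_sim < NEW_SENSE_SIM and len(slots) < MAX_SENSES):
        slots.append([rng.normal(0.0, 0.1, DIM), ctx_vec.copy(), 1])
        return len(slots) - 1
    # M-step analogue: move the chosen centroid toward this context.
    vec, centroid, n = slots[best]
    slots[best] = [vec, (centroid * n + ctx_vec) / (n + 1), n + 1]
    return best
```

In a full model, the sense vector selected by assign_sense would then be updated with the ordinary Skip-gram negative-sampling objective against the surrounding context words, so that each prototype specializes to one sense of the word.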
Keywords/Search Tags:representation learning, multi-feature combination, expectation maximization, multi-prototype word embedding