
Research On Enhanced Word Embedding Learning Model With Fusion Of Part-of-Speech And Position Information

Posted on: 2018-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: Q L Liu
Full Text: PDF
GTID: 2428330569475206
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet technology, both the size and the complexity of the data people can acquire are growing rapidly. Because the one-hot representation suffers from the curse of dimensionality and the semantic gap problem, traditional machine learning methods based on feature engineering have reached a performance bottleneck. With the success of deep learning in the natural language processing field, word embedding representation, which acts as the foundation of deep learning in this field, has been studied extensively.

At present, most research on word embedding focuses either on optimizing the network structure of models to reduce their complexity, or on using cross-lingual information, sentiment information, and other factors to enhance word embeddings. Part of speech (POS), a fundamental element of natural language, has rarely been taken into account. To make full use of the POS factor, an enhanced skip-gram word embedding model that fuses POS information and position information is proposed. An existing POS tagging tool is used to tag the corpus; the POS associations and positional relations between words are then modeled by constructing POS relationship matrices, and the constructed POS and position information is used to help predict the conditional probability of a context word given the target word, so that both kinds of information are exploited to learn better word embeddings. The weights in the matrices are updated together with the other parameters during model training.

Finally, experiments are conducted on the word analogy task and the word similarity task, with word embeddings trained on different training sets. The results show that the proposed model achieves a certain degree of improvement on both tasks. The improvement is especially significant for infrequent words, indicating that POS information and position information play an important role in word embedding training.
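The core idea described above — scaling the skip-gram score for each (target, context) pair by a trainable entry of a POS relationship matrix indexed by the relative position — can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the toy corpus, tag set, hyperparameters, and single-negative-sample training loop are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy POS-tagged corpus. The thesis uses an existing POS tagging tool;
# the sentence and tag set here are illustrative stand-ins.
tagged = [("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"),
          ("on", "ADP"), ("the", "DET"), ("mat", "NOUN")]

words = sorted({w for w, _ in tagged})
tags = sorted({t for _, t in tagged})
w2i = {w: i for i, w in enumerate(words)}
t2i = {t: i for i, t in enumerate(tags)}
V, T, D, WIN, LR = len(words), len(tags), 8, 2, 0.05

W_in = rng.normal(0, 0.1, (V, D))   # target-word embeddings
W_out = rng.normal(0, 0.1, (V, D))  # context-word embeddings
# One POS relationship matrix per relative position offset; entry [a, b]
# scales the score when the target has tag a and the context has tag b.
# Initialising to 1.0 recovers the plain skip-gram score before training;
# the entries are updated jointly with the embeddings below.
P = {off: np.ones((T, T)) for off in range(-WIN, WIN + 1) if off != 0}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for epoch in range(200):
    for i, (w, wt) in enumerate(tagged):
        for off in range(-WIN, WIN + 1):
            j = i + off
            if off == 0 or not 0 <= j < len(tagged):
                continue
            c, ct = tagged[j]
            ti, ci = w2i[w], w2i[c]
            a, b = t2i[wt], t2i[ct]
            # Cache current values so all gradients use the same state.
            pab, vin, vout = P[off][a, b], W_in[ti].copy(), W_out[ci].copy()
            dot = vin @ vout
            # Positive pair: gradient of -log sigmoid(pab * dot).
            g = sigmoid(pab * dot) - 1.0
            P[off][a, b] -= LR * g * dot
            W_in[ti] -= LR * g * pab * vout
            W_out[ci] -= LR * g * pab * vin
            # One random negative sample per positive pair.
            n = int(rng.integers(V))
            if n == ci:
                continue
            pab, vin, vneg = P[off][a, b], W_in[ti].copy(), W_out[n].copy()
            dotn = vin @ vneg
            gn = sigmoid(pab * dotn)  # gradient of -log sigmoid(-pab * dotn)
            P[off][a, b] -= LR * gn * dotn
            W_in[ti] -= LR * gn * pab * vneg
            W_out[n] -= LR * gn * pab * vin

print(W_in.shape, sorted(P))
```

After training, the POS matrices have drifted away from their all-ones initialisation, reflecting which tag pairings at which offsets were predictive in the corpus — the mechanism by which POS and position information shape the learned embeddings.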
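The two evaluation tasks mentioned above have a standard mechanical form: word similarity compares the cosine of two word vectors against human judgments, and word analogy answers "a : b :: c : ?" by a nearest-neighbour search around v(b) - v(a) + v(c). The sketch below shows only that mechanics, with random stand-in vectors; real evaluations use embeddings trained on large corpora against benchmark datasets.

```python
import numpy as np

# Hypothetical embeddings for a handful of words (random stand-ins,
# purely to demonstrate the evaluation mechanics).
rng = np.random.default_rng(1)
vocab = ["king", "queen", "man", "woman", "cat"]
E = {w: rng.normal(size=8) for w in vocab}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Word similarity: cosine between the two word vectors.
sim = cosine(E["king"], E["queen"])

# Word analogy "man : king :: woman : ?": nearest neighbour of
# v(king) - v(man) + v(woman), excluding the three query words.
q = E["king"] - E["man"] + E["woman"]
answer = max((w for w in vocab if w not in {"king", "man", "woman"}),
             key=lambda w: cosine(q, E[w]))
print(round(sim, 3), answer)
```

On benchmark sets, the similarity scores are correlated (e.g. by Spearman's rank correlation) with human ratings, and analogy accuracy is the fraction of questions where the nearest neighbour is the expected word.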
Keywords/Search Tags:word embedding, part of speech, POS relevance matrices, position, skip-gram model