
Research On Enhanced Word Embedding Learning Model With Fusion Of Part-of-Speech And Position Information

Posted on: 2018-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: Q L Liu
Full Text: PDF
GTID: 2428330569475206
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet technology, both the size and the complexity of the data people can acquire are growing rapidly. Because the one-hot representation suffers from the curse of dimensionality and the semantic gap problem, traditional machine learning methods based on feature engineering have reached a performance bottleneck. With the success of deep learning in the natural language processing field, word embedding representation, which acts as the foundation of deep learning in this field, has been studied extensively.

At present, most research on word embedding focuses either on optimizing the network structure of models to reduce their complexity, or on using cross-lingual information, sentiment information, and other factors to enhance word embeddings. Part of speech (POS), a fundamental element of natural language, has rarely been taken into account. To make full use of the POS factor, an enhanced skip-gram word embedding model that fuses POS information and position information is proposed. An existing POS tagging tool is used to tag the corpus; the POS associations and positional relations between words are then modeled by constructing POS relationship matrices, and the constructed POS and position information is used to help predict the conditional probability of a context word given the target word, so that both kinds of information are exploited to learn better word embeddings. The weights in the matrices are updated together with the other parameters during model training.

Finally, experiments are conducted on the word analogy task and the word similarity task, with word embeddings trained on different training sets. The results show that the proposed model achieves a certain degree of improvement on both tasks. The improvement is especially significant for infrequent words, indicating that POS information and position information play an important role in word embedding training.
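The core idea described above — scaling the skip-gram score for each (target, context) pair by a trainable entry of a POS relationship matrix indexed by the relative position — can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the toy corpus, tag set, hyperparameters, and single-negative-sample training loop are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy POS-tagged corpus. The thesis uses an existing POS tagging tool;
# the sentence and tag set here are illustrative stand-ins.
tagged = [("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"),
          ("on", "ADP"), ("the", "DET"), ("mat", "NOUN")]

words = sorted({w for w, _ in tagged})
tags = sorted({t for _, t in tagged})
w2i = {w: i for i, w in enumerate(words)}
t2i = {t: i for i, t in enumerate(tags)}
V, T, D, WIN, LR = len(words), len(tags), 8, 2, 0.05

W_in = rng.normal(0, 0.1, (V, D))   # target-word embeddings
W_out = rng.normal(0, 0.1, (V, D))  # context-word embeddings
# One POS relationship matrix per relative position offset; entry [a, b]
# scales the score when the target has tag a and the context has tag b.
# Initialising to 1.0 recovers the plain skip-gram score before training;
# the entries are updated jointly with the embeddings below.
P = {off: np.ones((T, T)) for off in range(-WIN, WIN + 1) if off != 0}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for epoch in range(200):
    for i, (w, wt) in enumerate(tagged):
        for off in range(-WIN, WIN + 1):
            j = i + off
            if off == 0 or not 0 <= j < len(tagged):
                continue
            c, ct = tagged[j]
            ti, ci = w2i[w], w2i[c]
            a, b = t2i[wt], t2i[ct]
            # Cache current values so all gradients use the same state.
            pab, vin, vout = P[off][a, b], W_in[ti].copy(), W_out[ci].copy()
            dot = vin @ vout
            # Positive pair: gradient of -log sigmoid(pab * dot).
            g = sigmoid(pab * dot) - 1.0
            P[off][a, b] -= LR * g * dot
            W_in[ti] -= LR * g * pab * vout
            W_out[ci] -= LR * g * pab * vin
            # One random negative sample per positive pair.
            n = int(rng.integers(V))
            if n == ci:
                continue
            pab, vin, vneg = P[off][a, b], W_in[ti].copy(), W_out[n].copy()
            dotn = vin @ vneg
            gn = sigmoid(pab * dotn)  # gradient of -log sigmoid(-pab * dotn)
            P[off][a, b] -= LR * gn * dotn
            W_in[ti] -= LR * gn * pab * vneg
            W_out[n] -= LR * gn * pab * vin

print(W_in.shape, sorted(P))
```

After training, the POS matrices have drifted away from their all-ones initialisation, reflecting which tag pairings at which offsets were predictive in the corpus — the mechanism by which POS and position information shape the learned embeddings.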
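The two evaluation tasks mentioned above have a standard mechanical form: word similarity compares the cosine of two word vectors against human judgments, and word analogy answers "a : b :: c : ?" by a nearest-neighbour search around v(b) - v(a) + v(c). The sketch below shows only that mechanics, with random stand-in vectors; real evaluations use embeddings trained on large corpora against benchmark datasets.

```python
import numpy as np

# Hypothetical embeddings for a handful of words (random stand-ins,
# purely to demonstrate the evaluation mechanics).
rng = np.random.default_rng(1)
vocab = ["king", "queen", "man", "woman", "cat"]
E = {w: rng.normal(size=8) for w in vocab}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Word similarity: cosine between the two word vectors.
sim = cosine(E["king"], E["queen"])

# Word analogy "man : king :: woman : ?": nearest neighbour of
# v(king) - v(man) + v(woman), excluding the three query words.
q = E["king"] - E["man"] + E["woman"]
answer = max((w for w in vocab if w not in {"king", "man", "woman"}),
             key=lambda w: cosine(q, E[w]))
print(round(sim, 3), answer)
```

On benchmark sets, the similarity scores are correlated (e.g. by Spearman's rank correlation) with human ratings, and analogy accuracy is the fraction of questions where the nearest neighbour is the expected word.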
Keywords/Search Tags:word embedding, part of speech, POS relevance matrices, position, skip-gram model