
Lexical Semantic Relationship Prediction Based On Word Vector

Posted on: 2019-07-12    Degree: Master    Type: Thesis
Country: China    Candidate: Z H Pan    Full Text: PDF
GTID: 2348330569495772    Subject: Engineering
Abstract/Summary:
Natural Language Processing (NLP) is a significant research direction in computer science and artificial intelligence. Human language exhibits complex relationships among vocabulary, syntactic structure, and textual meaning. As research has deepened in recent years, many researchers have focused on the semantics between words, and the word vector training method proposed by Mikolov has pointed to a new research direction. Trained with an unsupervised method, simple word vector subtraction captures verifiable lexical semantic relationships, such as the vector offset king − man ≈ queen − woman. However, Mikolov also noted that such vector offsets can answer only about 40% of the problems in SemEval-2012 Task 2. Existing work based on relation offset vectors has achieved results mainly on simple semantic relations such as tense, voice, and hyponymy, while complex relations such as whole-part relations and event relations remain to be studied.

To address these problems, this thesis proposes three prediction models based on word vectors (Word2Vec and GloVe) to mine complex relations such as whole-part relations and event relations, and also verifies their applicability to tense and voice relations. The word vectors used in this thesis are trained on Wikipedia, which ensures that the target relations are not deliberately emphasized in the training data. Depending on the order in which the relation-specific offset vectors of the training sets are sorted and clustered, a clustering-first model and a sorting-first model are proposed. The clustering-first model clusters the relation offset vectors with an unsupervised learning method, maps the clusters to relation labels, and predicts relations with a sorting algorithm. The sorting-first model first groups the relation vectors by label, uses a clustering algorithm and a negative sampling model to learn a common relation vector, and finally predicts relations with a sorting algorithm. A total of nine kinds of lexical relations are verified with the two models, with an average accuracy above 95%.

For whole-part relations with transitivity, this thesis obtains six kinds of part-whole relation-inducing word relations through an improved spectral clustering method. It adopts a segment prediction method and a negative sampling model to automatically mine candidate words for the whole-part relation; when candidate words are missing, web data are added to supplement them, and the prediction model is finally used to filter the candidates. The whole process is carried out on open corpora. The precision of a single model reaches 84%, and under a multi-model optimization strategy it rises to 90%.
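As a minimal illustration of the relation-offset idea described above (a sketch of the general technique, not the thesis's exact clustering-first or sorting-first models), the following Python example clusters offset vectors of labeled word pairs with k-means and predicts a new pair's relation by its nearest cluster centroid. It assumes pretrained GloVe vectors loaded through gensim's downloader; the word pairs and relation labels are made-up toy examples.

```python
# Illustrative sketch only: cluster relation offset vectors (w2 - w1) and
# predict the relation of a new pair by the nearest cluster centroid.
# Assumes gensim and scikit-learn are installed; the pairs below are toy examples.
import numpy as np
import gensim.downloader as api
from sklearn.cluster import KMeans
from collections import Counter

vectors = api.load("glove-wiki-gigaword-100")   # pretrained GloVe vectors

# Toy training pairs: (word1, word2, relation label)
pairs = [
    ("man", "king", "role"), ("woman", "queen", "role"),
    ("walk", "walked", "tense"), ("jump", "jumped", "tense"),
    ("wheel", "car", "part-of"), ("leaf", "tree", "part-of"),
]

offsets = np.array([vectors[w2] - vectors[w1] for w1, w2, _ in pairs])
labels = [r for _, _, r in pairs]

# Cluster the offset vectors (clustering-first flavour), then name each
# cluster by the majority relation label among its members.
k = len(set(labels))
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(offsets)
cluster_name = {
    c: Counter(l for l, a in zip(labels, km.labels_) if a == c).most_common(1)[0][0]
    for c in range(k)
}

def predict_relation(w1, w2):
    """Predict the relation of (w1, w2) from the nearest offset centroid."""
    offset = (vectors[w2] - vectors[w1]).reshape(1, -1)
    return cluster_name[int(km.predict(offset)[0])]

print(predict_relation("boy", "prince"))   # expected: "role" (toy example)
```

The thesis's own models additionally involve a sorting step and negative sampling to learn a common relation vector per class; this sketch only shows how relation offsets can be grouped and used for prediction in the simplest case.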
Keywords/Search Tags:NLP, Relationship Prediction, Word Vector