
A Study On Language Models Based On Neural Networks

Posted on: 2017-07-08  Degree: Master  Type: Thesis
Country: China  Candidate: Q Luo  Full Text: PDF
GTID: 2348330518495542  Subject: Information and Communication Engineering
Abstract/Summary:
In natural language processing, words and sentences are the main research units. Words are generally the smallest meaningful units of text processing; for example, search engines usually segment a search query into words and then use those words to retrieve related documents. A sentence is a higher-level unit than a word, and if its length is not restricted, a "sentence" may grow into a paragraph or even a chapter. Because words and sentences are the main units of text processing, learning good representations for them is very important. There are two families of methods for learning word representations: distributed methods, such as neural language models, and distributional methods, such as LSA and LDA. Sentence representations can be learned with a vector space model based on TF-IDF or with the LDA topic model, and language models based on neural networks can learn sentence representations in an unsupervised way. The main work of this paper includes the following aspects.

Firstly, this paper proposes inverse word frequency Huffman encoding for the hierarchical softmax used in neural language models. Neural language models are commonly accelerated with hierarchical softmax based on Huffman encoding, or with negative sampling. The word2vec model builds its Huffman tree from word frequencies, so the more frequent a word, the shorter its code. We argue that this is unreasonable and therefore propose inverse word frequency Huffman encoding. We also study the position-dependent weighting problem in neural language models and use position-dependent weight vectors and position-dependent weight factors to improve the word2vec model. In addition, we propose sharing the context representation and the target representation in the word2vec model, and we find that this sharing yields better word vectors.

Secondly, we propose the D-CBOW model, which learns paragraph vectors and word vectors jointly. Unlike Quoc's method, D-CBOW uses a paragraph weight vector together with the position-dependent weight vectors to merge the paragraph vector with the word vectors.

Finally, we study how to apply these methods to sentiment analysis, following the paragraph-vector work presented by Quoc at ICML 2014. The results show that position-dependent weight vectors and inverse word frequency encoding improve the paragraph vector model. We also compare different activation functions on the sentiment analysis task and find that ReLU outperforms the others.
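As a minimal sketch of the first contribution (the function name, the sample corpus, and the pure-Python construction below are illustrative assumptions, not code from the thesis), the snippet builds Huffman codes with inverse word frequencies as node weights, so that frequent words get deeper paths in the hierarchical-softmax tree instead of shorter ones:

```python
import heapq
import itertools
from collections import Counter

def inverse_frequency_huffman_codes(tokens):
    """Huffman codes in which each word is weighted by 1/frequency, so the
    most frequent words no longer receive the shortest codes -- the reverse
    of word2vec's standard frequency-based Huffman coding."""
    freqs = Counter(tokens)
    tie = itertools.count()                      # tie-breaker so the heap never compares nodes
    heap = [(1.0 / n, next(tie), word) for word, n in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:                         # merge the two lightest nodes
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))
    codes = {}
    def assign(node, prefix):
        if isinstance(node, tuple):              # internal node -> recurse
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                                    # leaf -> record the word's code
            codes[node] = prefix or "0"
    assign(heap[0][2], "")
    return codes

# With inverse-frequency weights, the most frequent word ("the") no longer
# gets one of the shortest codes.
print(inverse_frequency_huffman_codes("the the the the cat sat on the mat".split()))
```

Used inside hierarchical softmax, such codes make the model spend more binary decisions on frequent words and fewer on rare ones.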
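The two word2vec modifications mentioned above can be illustrated together. The fragment below is only a sketch (all names, dimensions, and indices are assumptions, not values from the thesis): it applies a separate element-wise weight vector to each context position and reuses a single embedding matrix for both the context and the target word.

```python
import numpy as np

vocab_size, dim, window = 10000, 100, 4
rng = np.random.default_rng(0)

# Shared context/target representation: one embedding matrix plays both roles.
shared_emb = rng.normal(scale=0.1, size=(vocab_size, dim))
# Position-dependent weight vectors: one weight vector per context slot.
position_weights = rng.normal(loc=1.0, scale=0.1, size=(window, dim))

context_ids, target_id = [12, 57, 903, 4410], 7
context_vecs = shared_emb[context_ids]                      # (window, dim)
hidden = (position_weights * context_vecs).sum(axis=0)      # weighted merge instead of a plain average
score = hidden @ shared_emb[target_id]                      # target scored with the same matrix
```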
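For the D-CBOW projection step, a rough sketch of how the paragraph vector might be merged with the context word vectors via a paragraph weight vector and position-dependent weight vectors, as described above (dimensions, initialisation, and variable names are illustrative assumptions):

```python
import numpy as np

dim, window = 100, 4
rng = np.random.default_rng(1)

paragraph_vec = rng.normal(scale=0.1, size=dim)              # one vector per paragraph
context_vecs = rng.normal(scale=0.1, size=(window, dim))     # context word vectors
paragraph_weight = rng.normal(loc=1.0, scale=0.1, size=dim)  # weight vector for the paragraph
position_weights = rng.normal(loc=1.0, scale=0.1, size=(window, dim))

# Element-wise weighted merge of paragraph and word information, rather than
# the plain averaging/concatenation of the original paragraph vector model.
hidden = paragraph_weight * paragraph_vec + (position_weights * context_vecs).sum(axis=0)
# 'hidden' would then be fed to hierarchical softmax (e.g. with inverse word
# frequency Huffman codes) or negative sampling to predict the target word.
```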
Keywords/Search Tags: neural networks, word vector, paragraph vector, weight vector, inverse word frequency encoding