Machine Learning Algorithm-based Metaphor Recognition

Posted on: 2012-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: J K Liu
Full Text: PDF
GTID: 2218330338974180
Subject: Computer application technology
Abstract/Summary:
As one of the intractable problems in the field of natural language processing (NLP), metaphor has attracted increasing attention from researchers in recent years, and researchers have come to regard it as central to the mechanisms of mind and language. A metaphor expresses one thing in terms of another on the basis of some similarity between the two. It is not only a rhetorical device but also embodies people's analogical cognition and way of thinking. Metaphor is prevalent in all natural languages, so NLP cannot avoid it; if the problem is not well resolved, it will become a bottleneck for the development of NLP and machine translation.

In recent years, machine learning methods and automatic large-scale knowledge acquisition have become popular in metaphor recognition. We take metaphor computation in Chinese text as our research subject and metaphor recognition as our research content. In this thesis, we use a range of machine learning algorithms to study nominal and verbal metaphor and to explore a wide range of metaphor recognition methods. The thesis selects 20 metaphor words and uses the "People's Daily" corpus from 2001 to 2004 to study metaphor recognition. The details are as follows:

Metaphor recognition based on classification algorithms. Building on RFR_SUM, SVM, CRFs, maximum entropy, and a semantic similarity model based on How-Net, we present recognition methods for both nominal and verbal metaphor. Classification algorithms offer a way to recognize metaphor automatically and let us study the performance and effectiveness of mainstream classification models in identifying metaphor. The results show that the RFR_SUM model is relatively stable: its recognition precision is the most stable of the five models. In addition, the recognition precision of the CRF model is slightly higher than that of SVM.
But the best model is the semantic similarity model, which combines semantic similarity calculation with the idea of the K-nearest-neighbor algorithm and thereby improves metaphor recognition precision. Finally, based on observation of the experimental outcomes of these models, an additional ensemble method based on majority voting is proposed. The ensemble method obtains a precision of 87.74% for nominal metaphor and 85.27% for verbal metaphor, which is much better than the results obtained by the five individual models.

Metaphor recognition based on clustering algorithms. In the clustering process, we use vector space similarity calculation based on TongYiCi CiLin and semantic similarity calculation based on How-Net to obtain the similarity between samples. We also adopt the idea of the K-means algorithm and optimize the method of randomly selecting the initial cluster centers. The clustering experiments comprise three schemes designed to improve metaphor recognition precision; the second uses not only short-distance information but also long-distance information, which improves the precision of the results.

Metaphor recognition based on semi-supervised learning. We present a semi-supervised learning method for metaphor recognition that combines the K-means algorithm with the RFR_SUM model. This new method uses information from both the labeled and the unlabeled sample sets, and its precision is higher than that of either the K-means clustering algorithm or the RFR_SUM classification model.

Finally, we build a small metaphor knowledge base for metaphor computing. Based on the experimental results of the metaphor study, we select feature words for the metaphor class algorithmically, sort these feature words by their RFR values, and then build our metaphor knowledge base on the Feature-RFR structure.
Furthermore, metaphor computation experiments based on our knowledge base verify that it is usable.

In short, the research in this thesis is mainly based on machine learning and knowledge acquisition. It explores experimental approaches to metaphor identification with several machine learning algorithms, avoids the shortcomings of manual knowledge bases and rule-based methods, accumulates a large amount of experimental data on machine learning algorithms for identifying metaphor, and obtains more satisfactory experimental results in metaphor recognition research. The research methods in this thesis can support research on metaphor computing, metaphor understanding, and other related natural language processing work.
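
The majority-voting ensemble mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual implementation, so the function name `majority_vote`, the label strings, and the input layout (one list of predicted labels per model) are assumptions made purely for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label predictions by simple majority voting.

    `predictions` is a list with one entry per model; each entry is a
    list of predicted labels, one per sample. This layout is an
    assumption for illustration, not the thesis's data format.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    # zip(*predictions) yields, for each sample, the labels from all models.
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count.
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical outputs of five classifiers on three samples
# ("M" = metaphorical, "L" = literal; both labels are illustrative).
model_outputs = [
    ["M", "L", "M"],
    ["M", "M", "L"],
    ["L", "L", "M"],
    ["M", "L", "L"],
    ["L", "L", "M"],
]
ensemble_labels = majority_vote(model_outputs)
```

With an odd number of models and two labels, as in this example, a tie cannot occur; with other configurations a tie-breaking rule would have to be chosen explicitly.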
Keywords/Search Tags: Metaphor Recognition, Machine Learning, Classification Algorithm, Clustering Algorithm, Semi-supervised Learning, Knowledge Acquisition