
Research On Transfer Learning Based On Latent Semantic Analysis

Posted on: 2015-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: M Chen
Full Text: PDF
GTID: 2348330518970449
Subject: Computer software and theory
Abstract/Summary:
Machine learning studies how computers can simulate human learning behavior in order to acquire new knowledge or skills and to reorganize them so that performance improves continuously. Traditional machine learning, however, assumes that the training and test data follow the same distribution, which greatly limits its practical application. Information is updated rapidly, and when a new field appears its sample data are usually scarce and its features sparse. Classifying such data with traditional machine learning leads to a large generalization error.

Transfer learning works across domains and across tasks. When little labeled data is available, learning on the target task alone cannot achieve good performance. Transfer learning identifies useful knowledge and skills from previous fields and tasks and applies them to new fields and tasks to improve the performance of the target classification task. This makes it well suited to the data sparsity problem in machine learning.

Existing transfer learning methods require the source data to be given in advance, and they consider either only the surface information of the text or only the structure of the data. To address these problems, this thesis presents a transfer learning method based on latent semantic analysis. First, keywords extracted from the target texts are submitted to a search engine, and the most relevant data are selected. Keywords associated with the source domain are then extracted as seed feature sets using a method based on latent semantic analysis. Next, an undirected graph of social media is constructed, and the subgraph containing all the seed feature sets is extracted. Using an extended Laplacian Eigenmap method, each node is mapped into a lower-dimensional space, so that every label obtains a new feature representation. Finally, an SVM classifier is used to classify the target test data. The results show that the method performs well and does not require the source domain data to be given, which reduces the burden on the problem designer.
Keywords/Search Tags: Transfer Learning, Latent Semantic Analysis, Seed Feature Sets, Laplacian Eigenmap
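
The two techniques named in the abstract can be illustrated on toy data. The following is a minimal sketch under stated assumptions, not the thesis's implementation: it uses plain truncated SVD for latent semantic analysis and the standard (not the extended) Laplacian Eigenmap, and it assumes a connected undirected graph with no isolated nodes. The function names, the toy term-document matrix, and the toy graph are hypothetical and are not taken from the thesis.

```python
import numpy as np


def lsa_term_vectors(term_doc, k):
    """Return a k-dimensional latent vector for each term (row) via truncated SVD."""
    X = np.asarray(term_doc, dtype=float)
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    # Scale the leading left singular vectors by their singular values.
    return U[:, :k] * s[:k]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def laplacian_eigenmap(adjacency, dim):
    """Embed graph nodes into `dim` dimensions with the standard Laplacian Eigenmap.

    Assumes a symmetric adjacency matrix of a connected graph
    in which every node has at least one edge.
    """
    W = np.asarray(adjacency, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the trivial eigenvector (eigenvalue 0) and map back to
    # solutions of the generalized problem L y = lambda D y.
    return eigvecs[:, 1:dim + 1] * d_inv_sqrt[:, None]


if __name__ == "__main__":
    # Hypothetical term-document counts: rows are terms, columns are documents.
    term_doc = [[1, 1, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 1, 1],
                [0, 0, 1, 1]]
    T = lsa_term_vectors(term_doc, k=2)
    print("similarity of terms 0 and 1:", cosine(T[0], T[1]))
    print("similarity of terms 0 and 2:", cosine(T[0], T[2]))

    # Hypothetical graph: two triangles joined by the single edge 2-3.
    W = np.zeros((6, 6))
    for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
        W[i, j] = W[j, i] = 1.0
    print(laplacian_eigenmap(W, dim=1))
```

In this sketch, terms that co-occur in the same documents end up close together in the latent space, and the one-dimensional graph embedding places the two triangles on opposite sides of zero. The resulting low-dimensional vectors are the kind of feature representation that could then be passed to an SVM classifier.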