
The Application And Research Of Latent Semantic Analysis In The Field Of Internet Data Mining

Posted on: 2010-12-12
Degree: Master
Type: Thesis
Country: China
Candidate: C H Tang
GTID: 2178360275982440
Subject: Computer application technology
Abstract/Summary:
Internet search has become an indispensable part of people's study and daily life. How well Internet knowledge is organized and how quickly it can be accessed, how the Internet link structure is mined, and how information systems make personalized recommendations all affect, to a large extent, users' search experience. Meanwhile, the explosive growth of information has left many Internet users lost in an ocean of knowledge. Internet data mining is therefore of great practical significance for improving people's study and life.

This study reviewed research on Internet data mining, in particular link structure mining and usage mining, and analyzed in depth the mathematical model and working principle of latent semantic analysis (LSA). Building on this research, the study presented two algorithms: an improved LSA-based HITS algorithm and an LSA-based personalized recommendation algorithm. For the improved HITS algorithm, we analyzed its description, parameter settings and implementation, time and space complexity, and experimental results. For the newly proposed personalized recommendation algorithm, we analyzed its description, system architecture, performance metrics, and experiments. The performance of the two algorithms was then evaluated through several comparative experiments. Finally, we presented the system design of our work.
The design took into account not only the implementation details of the system functions but also the scalability and maintainability of the system and the reuse of existing code.

The experimental results showed that, compared with the original algorithm, the improved LSA-based HITS algorithm achieved better recall and more practical time efficiency, and that the results it returned were generally more authoritative and of greater reference value. We also used the LSA-based personalized recommendation algorithm to mine user-to-user and resource-to-resource similarity in a low-dimensional semantic space and a low-dimensional resource space. With reasonable recommendation strategies, the personalized recommendation system built on this algorithm showed fairly good recommendation performance when the experimental samples were not very large.

In short, with the support of LSA, the time efficiency of both the improved LSA-based HITS algorithm and the LSA-based personalized recommendation algorithm was improved. Because of the statistics-based semantic support, both algorithms can process information in a low-dimensional semantic space, which improves not only the space efficiency of the algorithms but also the accuracy of information processing.
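
The idea of measuring user-to-user similarity in a low-dimensional semantic space can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes a small user-resource matrix, a plain truncated singular value decomposition, and cosine similarity, and the function name `lsa_user_similarity`, the toy matrix, and the choice `k=2` are all hypothetical.

```python
import numpy as np


def lsa_user_similarity(ratings, k):
    """Cosine similarity between users after projection to k latent dimensions.

    ratings: 2-D array, rows are users and columns are resources (assumed layout).
    k: number of singular values to keep (assumed to be chosen by the caller).
    """
    # Thin SVD: ratings = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

    # Coordinates of each user in the k-dimensional latent space.
    user_vecs = U[:, :k] * s[:k]

    # Normalize rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(user_vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero users
    unit = user_vecs / norms
    return unit @ unit.T


# Hypothetical toy data: users 0-1 share interests, as do users 2-3.
ratings = np.array(
    [
        [5, 4, 0, 0],
        [4, 5, 0, 0],
        [0, 0, 3, 2],
        [0, 0, 2, 3],
    ],
    dtype=float,
)

sim = lsa_user_similarity(ratings, k=2)
print(np.round(sim, 3))
```

On this toy matrix, users 0 and 1 end up with a latent-space similarity close to 1 and users 0 and 2 with a similarity close to 0. Applying the same function to the transposed matrix would give resource-to-resource similarity under the same assumptions.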
Keywords/Search Tags: Internet Search, Latent Semantic Analysis, Singular Value Decomposition, HITS algorithm, Personalized Recommendation