Community Of Scientific Papers And Its Application

Posted on: 2012-12-28 | Degree: Master | Type: Thesis
Country: China | Candidate: J Yang | Full Text: PDF
GTID: 2208330332486793 | Subject: Computer software and theory
Abstract/Summary:
With the rapid development of science and technology and the spread of the Internet, the problem of information overload is getting worse. The gap between the massive growth of information and people's limited capacity to process it leaves a great amount of information resources poorly used. In scientific research, the number of papers grows exponentially, and we often fail to find the paper we want because search results are accompanied by a large number of irrelevant papers and documents. How to improve the utilization of literature resources has become a major problem in scientific research. Scientific literature, as the product of scientists' hard work, is the most direct manifestation of researchers' direction and creativity, and it plays an important role in research. For scientists, the ability to obtain the required information quickly and accurately directly affects the success of their research. For the development of science, the speed with which scientific literature resources are acquired, allocated, developed, and utilized is an important factor in determining the scientific capability of a country or region.

The purpose of this paper is to provide a good information platform for researchers in the field of computer science. The platform includes literature crawling, information query, knowledge mining, and other functions. The advantage of this system is that it uses a Web crawler to organize and manage scattered literature information in a unified way, which enables fast search queries and facilitates data statistics and knowledge mining. The main content of this paper is the implementation, based on a B/S (browser/server) structure, of a literature-management and knowledge-mining system for the Chinese computer science field: the "Scientific Community" system.
The system uses the .NET framework to build a fairly standard three-tier architecture, which supports multi-angle queries against the literature database and makes the system simpler and more convenient to use.

This paper uses template-based web crawler technology to match unstructured information on the Internet. The crawler downloads paper information automatically, keeps the local database up to date, and supports basic retrieval and data statistics.

This paper presents a personalized paper recommendation algorithm based on recommendation degree. Building on the traditional vector-based recommendation algorithm, it uses an improved vector space model that takes into account the differences in expressive power among the various parts of an article. It also considers the value of the paper itself, so that the most valuable papers are displayed first. To ensure the effectiveness of the recommendations, the algorithm also incorporates the user's browsing history. The experimental results show that the algorithm improves both the precision and the fallout rate of the recommendations.

This paper uses an overlapping-community mining technique to perform community detection on a self-built collaboration network, in order to discover the potential research fields in the network. A field is described by the description of the central node of its community. In this part, the paper reviews and compares several overlapping-community detection algorithms and improves them for the practical situation of this system, solving the local-optimum problem in the mining process. When constructing the collaboration network, we emphasize the importance of the first author, which simplifies the network structure and reduces the algorithmic complexity.
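The abstract does not give the formula behind the recommendation degree, so the following Python sketch only illustrates the general idea described above: a vector space model in which terms from different parts of a paper carry different weights, combined with a per-paper value score and a profile built from the user's browsing history. The field weights, the `alpha` mixing parameter, the `value` field, and all function names are assumptions made for illustration and are not taken from the thesis.

```python
import math
from collections import Counter

# Hypothetical weights per part of a paper (assumption, not from the thesis).
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "abstract": 1.0}


def paper_vector(paper):
    """Build a term vector in which each term counts more or less by field."""
    vec = Counter()
    for field, weight in FIELD_WEIGHTS.items():
        # Naive whitespace tokenization, chosen only to keep the sketch short.
        for term in paper.get(field, "").lower().split():
            vec[term] += weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def user_profile(history):
    """Sum the vectors of all papers the user has browsed."""
    profile = Counter()
    for paper in history:
        profile.update(paper_vector(paper))
    return profile


def recommend(candidates, history, alpha=0.8, top_n=3):
    """Rank unseen papers by alpha * similarity + (1 - alpha) * value."""
    profile = user_profile(history)
    seen = {p["id"] for p in history}
    scored = []
    for paper in candidates:
        if paper["id"] in seen:
            continue  # do not recommend papers the user already browsed
        similarity = cosine(profile, paper_vector(paper))
        score = alpha * similarity + (1 - alpha) * paper.get("value", 0.0)
        scored.append((score, paper["id"]))
    # Highest score first; ties broken by paper id for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [paper_id for _, paper_id in scored[:top_n]]
```

In this sketch `value` is assumed to be a score in [0, 1] computed elsewhere (for example from citations), and `alpha` controls how much the user's interests outweigh the paper's own value. A real system would also need proper tokenization for Chinese text and term weighting such as TF-IDF.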
Keywords/Search Tags: Data Mining, Scientific Article Community, Community Mining, Document Vector, Personalized Recommendation