Research And Implementation On Open Access Journals Resource Acquisition System

Posted on: 2018-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: Z Huang
Full Text: PDF
GTID: 2348330518984803
Subject: Information Science
Abstract/Summary:
With the development of the Open Access movement, more and more journals are becoming open access. Open access journals are peer-reviewed and of great academic value, and research on acquiring their resources is the basis for using those resources effectively. For open access journals that follow the OAI-PMH protocol, resources are usually harvested through the OAI-PMH interface. For journals that do not follow the protocol, metadata is usually collected from the journal web pages. However, open access journals are organized and displayed by the journal institutions themselves, and different journals present their resources in different forms. The same journal may also present its resources differently in different periods, a situation that can be described as "a thousand journals, a thousand faces". These varied and changing resource forms make it difficult to acquire resources from open access journals that do not follow OAI-PMH. To address this problem, this thesis studies open access journal resources, the methods for acquiring them, and the implementation of an acquisition system.

First, based on extensive acquisition surveys of domestic and foreign open access journals, the thesis summarizes the characteristics of open access journal resources: small granularity, complex description, and the structure of the description carrier. It divides the resources into single resources and combined resources according to how they are organized. It then compares the main acquisition methods and how each applies to open access journals and, combining this comparison with the characteristics of open access journals, proposes a resource acquisition method for them.

Next, the thesis compares the main commercial acquisition tools and their application to open access journal resources and, after analyzing the goals of an open access journal acquisition system, argues that such a system needs to be developed. Building on the proposed acquisition method, it presents a detailed requirements analysis and the overall design of the system. The system is divided into three modules: a user interaction module, a data acquisition and web structure inspection module, and a data storage module. Its main functions include visual information collection, automatic acquisition rules, multi-threaded automatic collection, web structure inspection, and data quality testing. The implementation of the three modules and of the main function points is then described in detail, the functions are realized in code, and the system is tested for both functionality and performance.

In functional testing, the system can collect both single resources and combined resources, can accurately identify changes in web page structure, and gives the user feedback to reselect and reacquire pages whose structure has changed. This shows that the system has the basic functions of resource acquisition. In performance testing, the system and the Octopus collector acquired the same journal resources, and the results show that the system outperforms the Octopus collector in recall and accuracy. The system was also used to collect from 12 open access journal websites that do not follow the OAI-PMH protocol: 49,660 papers were collected in a total of 31,659 seconds, an average of 10.62 minutes per thousand papers. The number of papers collected by the system plus the number of dirty pages marked by the user exactly equals the number of links collected by the crawler script. This shows that the system meets the need for open access journal resource acquisition and verifies the effectiveness of the proposed acquisition method.

Finally, the thesis summarizes its main contents, notes some shortcomings of the system, and outlines future work. Acquiring open access journal resources is the first and most basic step in using them. A series of further tasks is still needed, such as data cleaning, data warehouse construction, and building a data analysis platform and data visualization.
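
The throughput figure and the completeness check reported in the abstract can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names `minutes_per_thousand` and `is_complete` are hypothetical and are not taken from the thesis. Only the paper count (49660) and the total time (31659 seconds) come from the abstract.

```python
def minutes_per_thousand(papers: int, total_seconds: float) -> float:
    """Average acquisition time, in minutes, per 1,000 papers."""
    if papers <= 0:
        raise ValueError("papers must be positive")
    seconds_per_paper = total_seconds / papers
    return seconds_per_paper * 1000 / 60


def is_complete(collected: int, dirty_pages: int, crawler_links: int) -> bool:
    """Completeness check described in the abstract: papers collected plus
    user-marked dirty pages should equal the links found by the crawler script.
    (Hypothetical helper; the thesis does not give per-category counts.)"""
    return collected + dirty_pages == crawler_links


# Figures reported in the abstract.
rate = minutes_per_thousand(papers=49660, total_seconds=31659)
print(f"about {rate:.1f} minutes per thousand papers")
```

The computed rate is roughly 10.6 minutes per thousand papers, which agrees with the reported 10.62 minutes up to rounding.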
Keywords/Search Tags: Open Access journals, Open Access journals acquisition, Web information acquisition, Metadata acquisition