
Research On Automatic Acquisition Method For Open Access Journal Papers

Posted on: 2013-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: D Q Wang
Full Text: PDF
GTID: 2248330362462675
Subject: Computer software and theory
Abstract/Summary:
When a digital resource library is built from OA (Open Access) journals on the Internet, automatically acquiring the papers published in those journals is the key step. Because OA journal sites are scattered across the network, the usefulness of OA journals is constrained. The automatic acquisition of OA journal resources is therefore an active research problem in digital resource library construction. Building on previous research, this thesis studies the following aspects.

Firstly, the traditional framework for network information acquisition cannot be directly applied to collecting OA journal issues, because the data sources targeted for acquisition are different. We propose a framework that automatically collects open access journal articles. First, we design the overall architecture and modules of the OA journal automatic acquisition system and describe the relationships between the modules; second, we describe the system’s workflow, its performance indicators, and how it works; finally, we discuss the design ideas of the main modules, the key issues encountered in the framework, and their solutions.

Secondly, on the basis of a careful analysis of the structure of a large number of OA journals on the Web, we propose a link extraction approach for issues’ tables of contents based on web page segmentation and link features. A page is first partitioned into many blocks according to the HTML tags Table and Div; the sub-blocks of a block are then merged into a semantic block based on sub-tree similarity; finally, table-of-contents links are identified according to the characteristics of links in an issue’s table of contents. Experiments show that the method can effectively extract links from issues’ tables of contents.

Finally, based on this work, experiments on the prototype system we developed are used to analyze and evaluate the system’s average running time, its extraction accuracy, and the proposed link extraction algorithm for issues’ tables of contents.
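
The three steps of the link extraction approach described in the abstract (partition by Table and Div, merge similar sub-blocks, identify table-of-contents links) might be sketched roughly as follows. This is only an illustration and not the thesis’s algorithm: the abstract does not say how sub-tree similarity is measured or which link features are used, so the names BlockLinkParser, merge_similar_blocks, looks_like_toc and extract_toc_links, the "identical child-tag sequence" similarity test, and the "same URL directory, mostly long link text" features are all assumptions made for the example.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse
import posixpath

BLOCK_TAGS = {"table", "div"}  # tags used to partition the page into blocks


class BlockLinkParser(HTMLParser):
    """Assign every <a href> to its innermost enclosing <table> or <div>."""

    def __init__(self):
        super().__init__()
        self.stack = [0]                 # ids of open blocks; 0 is the page root
        self.next_id = 1
        self.parent = {}                 # block id -> parent block id
        self.shape = defaultdict(list)   # block id -> non-block tags seen inside
        self.links = defaultdict(list)   # block id -> [(href, link text)]
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parent[self.next_id] = self.stack[-1]
            self.stack.append(self.next_id)
            self.next_id += 1
        else:
            self.shape[self.stack[-1]].append(tag)
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            self.links[self.stack[-1]].append((self._href, text))
            self._href = None
        elif tag in BLOCK_TAGS and len(self.stack) > 1:
            self.stack.pop()


def merge_similar_blocks(parser):
    """Merge sibling blocks whose tag sequences match (assumed similarity test)."""
    merged = defaultdict(list)
    for block_id, links in parser.links.items():
        key = (parser.parent.get(block_id), tuple(parser.shape[block_id]))
        merged[key].extend(links)
    return list(merged.values())


def looks_like_toc(links, min_links=3, min_words=4):
    """Assumed link features: one shared URL directory, mostly long link text."""
    if len(links) < min_links:
        return False
    dirs = {posixpath.dirname(urlparse(href).path) for href, _ in links}
    long_titles = sum(len(text.split()) >= min_words for _, text in links)
    return len(dirs) == 1 and long_titles >= len(links) / 2


def extract_toc_links(html):
    """Return hrefs from the semantic blocks that look like a table of contents."""
    parser = BlockLinkParser()
    parser.feed(html)
    parser.close()
    return [href
            for block in merge_similar_blocks(parser)
            if looks_like_toc(block)
            for href, _ in block]
```

Calling extract_toc_links on the HTML of an issue page returns the hrefs of the blocks that pass the feature check. The thresholds min_links and min_words are arbitrary placeholders that would need tuning against real OA journal pages.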
Keywords/Search Tags: Open access, OA journal, Automatic acquisition, Acquisition framework, Issues’ table of contents, Link extraction