
Web-Based Automatic Collection and Service System for Bibliographic Information

Posted on: 2009-06-11  Degree: Master  Type: Thesis
Country: China  Candidate: J X Chen  Full Text: PDF
GTID: 2208360245976613  Subject: Education Technology
Abstract/Summary:
Digital bibliographic data (hereafter referred to as book data) is the metadata of books held in digital form, and building it is one of the most important parts of resource construction for a digital library. Traditionally, this work requires people to enter book data record by record. It is laborious and demands considerable manpower, and it also causes problems for the accuracy and timeliness of the data. In fact, a large amount of ready-made book data is already available on the web, for example in the blogs of authors and readers, on publishers' websites, and in the search systems of digital libraries; each of these sources is likely to provide a lot of usable book data. A web-based automatic collection and service system for book information can collect book information automatically and, by processing, sorting, analyzing, storing, and indexing it, offer a metadata service to users.

This thesis studies the key technologies and the implementation of such a system. The specific questions are as follows: how to find the target websites and web pages that contain book data; how to extract the data from the target pages; how to transform the book data into MARC metadata; and how to use the metadata to serve users.

The system builds a knowledge base for the book subject domain, based on the characteristics of book information as it appears on the web. It combines several techniques, such as a focused web spider, noise reduction, and information extraction, to extract semi-structured book information from the web and store it in a structured database. It then converts the structured book information into MARC records using MARC generation technology, and uses web technologies and full-text search to provide information search and download services to users.

The research and development of this project provides a usable solution to the problem of closed book catalogs. Drawing on the rich book information on the web and on modern computer technologies makes book catalogs more open, and the development of digital libraries can benefit from it.
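The step of turning structured book information into MARC can be illustrated with a minimal sketch. It is not the thesis's implementation. The `BookRecord` type, the function names, and the sample values are hypothetical, and the tag mapping assumes common MARC 21 usage (020 for ISBN, 100 for the main author, 245 for the title, 260 for publication details). A system built in China may well target CNMARC instead, where the tags differ.

```python
from dataclasses import dataclass


@dataclass
class BookRecord:
    """Hypothetical structured record produced by the extraction step."""
    title: str
    author: str
    publisher: str
    year: str
    isbn: str


def to_marc_fields(book):
    """Map a BookRecord to (tag, indicators, subfields) tuples.

    The tag choices follow common MARC 21 usage and are an assumption,
    not something stated in the abstract.
    """
    return [
        ("020", "  ", [("a", book.isbn)]),
        ("100", "1 ", [("a", book.author)]),
        ("245", "10", [("a", book.title)]),
        ("260", "  ", [("b", book.publisher), ("c", book.year)]),
    ]


def render(fields):
    """Render fields as human-readable text, one line per field."""
    lines = []
    for tag, indicators, subfields in fields:
        body = "".join(f"${code}{value}" for code, value in subfields)
        lines.append(f"{tag} {indicators} {body}")
    return "\n".join(lines)


# Placeholder values for illustration only; not data from the thesis.
example = BookRecord(
    title="An Example Title",
    author="Doe, Jane",
    publisher="Example Press",
    year="2008",
    isbn="0000000000",
)
marc_text = render(to_marc_fields(example))
print(marc_text)
```

Keeping the mapping separate from the rendering means the same extracted record could be serialized to a different MARC flavor by swapping only `to_marc_fields`.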
Keywords/Search Tags: Digital Library, MARC, Focus Spider, Information Extraction, Full Text Search