| With the rapid development of the Internet, more and more companies use competitive intelligence systems (CIS) to collect, analyze, and manage the Web information they need. However, distributed Web information storage, the key technology of a CIS, faces great challenges. How to construct a distributed Web information storage system for CIS that is large-scale, efficient, extensible, and reliable is a problem that urgently needs to be solved. This dissertation explores distributed Web information storage technology in order to provide a high-availability, high-performance, and high-efficiency distributed storage service. Building on a systematic survey of related work on distributed Web information storage, this dissertation not only studies the distributed storage mechanism, Web information storage organization, and Web information version management, but also makes several innovations, described in detail below. The main contributions of this dissertation are as follows: 1. This dissertation presents a weighted Round-Robin algorithm for load balancing in distributed Web information storage, and a module based on it. The module consolidates the disk space of individual storage nodes into a single storage pool and implements automatic distribution and management through catalog management servers. The module provides users with an efficient, reliable storage service through a star topology architecture and an adaptive transfer mechanism for Web information. 2. This dissertation presents a file structure for Web information storage, the PAK-structure. The PAK-structure saves storage space on each node and improves system efficiency through data compression, sorting, and information statistics. The PAK-structure provides users with interface services for several system access modes. 3.
This dissertation presents a version management model of Web information:... |
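
The abstract names a weighted Round-Robin algorithm for load balancing across storage nodes but does not give its details. As a rough illustration only, the sketch below uses the "smooth" weighted round-robin variant, which may differ from the dissertation's own algorithm. The `StorageNode` class, the node names, and the weights are hypothetical; here a weight is assumed to reflect something like a node's free capacity.

```python
from dataclasses import dataclass


@dataclass
class StorageNode:
    """Hypothetical storage node; not taken from the dissertation."""
    name: str
    weight: int       # assumed to be proportional to capacity or free space
    current: int = 0  # running score used by the smooth variant


def pick_node(nodes):
    """Pick the next node using smooth weighted round-robin.

    Each call raises every node's score by its weight, selects the node
    with the highest score, then lowers that node's score by the total
    weight so that heavier nodes are chosen more often but not in bursts.
    """
    total = sum(node.weight for node in nodes)
    for node in nodes:
        node.current += node.weight
    chosen = max(nodes, key=lambda node: node.current)
    chosen.current -= total
    return chosen


def schedule(nodes, count):
    """Return the names of the nodes chosen for `count` consecutive writes."""
    return [pick_node(nodes).name for _ in range(count)]


if __name__ == "__main__":
    # Illustrative weights only: node A is assumed to have five times
    # the capacity of nodes B and C.
    cluster = [StorageNode("A", 5), StorageNode("B", 1), StorageNode("C", 1)]
    print(schedule(cluster, 7))
```

Over any seven consecutive picks with these weights, node A is selected five times and nodes B and C once each, so the share of writes each node receives matches its weight. A catalog management server of the kind the abstract describes could call something like `pick_node` whenever it has to place a newly collected Web page, although that wiring is an assumption here.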