Research On Diversified And Deep Aggregation Of Digital Literature Resources

Posted on: 2015-08-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Dong
Full Text: PDF
GTID: 1318330428974801
Subject: Information Science
Abstract/Summary:
Libraries have become a major part of the public cultural infrastructure and a positive force in the development of the culture industry. In both theoretical research in library and information science and the practical development of library services, knowledge services are receiving increasing attention, and how to improve the knowledge service capability of libraries has become a pivotal issue. Meanwhile, as digital literature resources accumulate substantially, how to use them efficiently has gradually become a research issue in its own right for improving that capability. Against this background, this paper addresses the topic "research on the diversified and deep aggregation of digital literature resources".
This paper takes an interdisciplinary approach that combines semantic mining, natural language processing, informetrics, and network structure analysis. It constructs a theoretical system for the diversified and deep aggregation of digital literature resources and discusses the diversity and depth of aggregation from multiple aspects. It proposes that the theory of diversified and deep aggregation of digital literature resources serves as a bridge between informetrics and information retrieval. In addition, it presents a technical procedure and methods for diversified and deep aggregation and examines their practical implications by applying them to two classical types of networks: citation networks and author networks.
This research seeks alternative approaches to conventional knowledge organization, aiming to discover the semantics of resources, to enhance the knowledge services of libraries, and to contribute an interdisciplinary approach to the study of digital library development.
Following the principle of "integrating theory with practice, and practice as the touchstone of truth", this paper proceeds systematically from the theoretical foundations, through the design of the theoretical system and the formation of the technical procedure, to an empirical analysis of classical integration examples that validates the research. The first chapter (Chapter 0) is the introduction, which summarizes the research context, the research value, the national and international state of the art, the technical procedure, and the purposes of this research. The last chapter (Chapter 6) is the conclusion. The remaining five chapters are as follows:
Chapter 1 studies the theoretical foundations of the research, answering why the research is carried out and what theory supports it. It first analyzes the concept of digital literature resources and their components. It then elaborates on the relationship between aggregation and information integration, which is the preliminary stage of aggregation, and examines the relationship between aggregation and information retrieval, the scientific domain most closely related to the current research.
After analyzing concentration and discrete distribution in informatics as the fundamental prerequisite for performing aggregation, this chapter concludes that the diversified and deep aggregation of digital literature resources forms a bridge between informetrics and information retrieval.
Chapter 2 explains what is meant by the diversified and deep aggregation of digital literature resources and frames the theories underlying this research topic. Diversified and deep aggregation is the process that connects users with the world of knowledge, which is represented by the resource warehouse. The theoretical model has four components: object categories for aggregation, networks of objects, approaches for level measurement, and purposes of aggregation. These components reflect the diversified nature of the aggregation. The progression from metadata-based aggregation to metrics-based aggregation and then to semantic aggregation is a process of gradual deepening across levels. The theoretical framework established in this chapter serves as the backbone for the subsequent study of the technical procedure.
Chapter 3 studies the technical system of the research. It constructs a general framework for the technical procedure of diversified and deep aggregation and decomposes it into three core technical issues: assessing the importance of objects, achieving resource aggregation, and revealing the semantics of the aggregated resources. It compares local indicators and global indicators for importance assessment and discusses the advantages of global indicators. Considering both vertex-based and cluster-based methods, it proposes a method of relative importance aggregation.
The technical route for discovering semantics deepens from word frequency to co-words and then to topic models. Topic models are found to be more flexible and operational.
Chapter 4 realizes the diversified and deep aggregation of core resources from a citation network, which is a classic directed binary network. The chapter systematically analyzes the problems of assessing importance with the traditional citation network and accordingly suggests improvements to the indicators and methods. It applies the technical procedure of aggregation to the citation network of XML research papers and finds that the method of relative importance aggregation performs well, revealing a rich and hierarchical result. Semantic analysis of the resources substantially reveals the semantic content of the aggregation. This result demonstrates the validity of the theoretical and technical framework.
Chapter 5 realizes the diversified and deep aggregation of author networks. Because author networks are of various types, once the procedure for aggregating authors is established, the aggregation of other objects can be ensured as well. The chapter details the problems in the data description of authors and the data cleaning strategy. It builds six kinds of networks (collaborative network, author cross-citation network, author co-citation network, bibliographic coupling network, words-based coupling network, and sources-based coupling network) and analyzes the semantic interrelationships among them. Applying the author topic model, it discovers the distribution of semantic topics over authors. It is found that the aggregations for the collaborative network, author co-citation network, and bibliographic coupling network have distinct features, whereas those for the author cross-citation network, words-based coupling network, and sources-based coupling network have limitations.
The diversified and deep aggregation of resources from author networks offers technical support from different aspects for satisfying users' requirements in knowledge discovery.
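As an illustration of two of the network types named above, the following minimal sketch derives bibliographic coupling and co-citation weights from a small citation list. It is not the dissertation's implementation: the function name `coupling_and_cocitation`, the item identifiers, and the toy data are all hypothetical, and the weights use the common textbook definitions (shared references for coupling, shared citing items for co-citation).

```python
from collections import defaultdict
from itertools import combinations


def coupling_and_cocitation(citations):
    """citations: dict mapping each citing item to the set of items it cites.

    Hypothetical helper for illustration only.
    """
    # Bibliographic coupling: two citing items are linked with a weight
    # equal to the number of references they share.
    coupling = {}
    for a, b in combinations(sorted(citations), 2):
        shared = len(citations[a] & citations[b])
        if shared:
            coupling[(a, b)] = shared

    # Co-citation: two cited items are linked with a weight equal to
    # the number of citing items whose reference lists contain both.
    cocitation = defaultdict(int)
    for refs in citations.values():
        for x, y in combinations(sorted(refs), 2):
            cocitation[(x, y)] += 1

    return coupling, dict(cocitation)


# Toy data (assumed): p1..p3 are citing items, a..c are cited items.
toy = {
    "p1": {"a", "b"},
    "p2": {"a", "b", "c"},
    "p3": {"c"},
}
coupling, cocitation = coupling_and_cocitation(toy)
print(coupling)
print(cocitation)
```

The same counting applies at the author level if each key is an author and each value is the set of authors or sources that author cites; that mapping is an assumption here, since the abstract does not specify how the author networks are constructed.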
Keywords/Search Tags: Digital Library, Digital Literature Resources, Diversified and Deep Aggregation, Semantic Topics, Relative Importance Aggregation, Informetrics, Network Structure Analysis