Comparative Research On Subject Detection Methods

Posted on: 2022-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: J L Li
GTID: 2518306326952819
Subject: Information Science
Abstract/Summary:
As scientific and technical literature continues to grow, intelligence analysts face a new challenge: understanding development trends, discovering research themes, and grasping the laws of a discipline's development from massive literature data. Analysts use different methods for topic analysis. Comparing the advantages and disadvantages of these methods in theory and in practice offers a valuable reference for people inside and outside the field who need to choose and apply topic detection methods to mine text topics.

This thesis is organized into seven chapters. Chapter 1 is the introduction, covering the background and significance of the topic, the state of research in China and abroad, the research objects and data sources, the research framework, and the innovations of the thesis. Chapter 2 is an overview of related work, setting out the theories and methods used in the empirical research and guiding the subsequent chapters. Chapters 3, 4, and 5 are empirical studies: drawing on these theories and methods, the three topic detection methods (co-word analysis, co-citation analysis, and the LDA model) are each applied to topic mining in the LIS discipline, following each method's conventional workflow, to discover research topics and understand the discipline's development trend. Chapter 6 is a comparative analysis: based on the process and results of the empirical studies, the three methods are compared from four aspects, including the method itself, data preparation, and result output, and their advantages and disadvantages, similarities and differences, and applicability are summarized to verify the effectiveness of each method. Chapter 7 summarizes the thesis, points out the limitations and deficiencies of the research, and outlines directions for future work.

The research shows the following.

(1) For LIS literature from nearly the past 10 years, co-word analysis identified 5 topics, and burst term detection reflected the discipline's attention to research topics in the context of epidemic disease. Co-citation analysis identified 13 topics, reflecting the accumulation and development of research topics from the citation perspective. The LDA model identified 19 topics, which were further divided into 11 core topics and 8 secondary topics and classified as growing, stable, or declining.

(2) Co-word analysis characterizes research topics by the cumulative frequency with which keywords co-occur, and co-citation analysis derives research topics from the frequency with which documents are cited together. Both are quantitative approaches that concentrate on core literature and overlook secondary literature. The LDA model pays more attention to the semantic features of the literature and reduces the tendency to characterize topics by cumulative counts alone.

(3) Co-word analysis and co-citation analysis require structured bibliographic information, depend heavily on standardized databases, and involve relatively coarse data processing. The LDA model places few requirements on the format of the text data, and its data processing is more rigorous.

(4) Co-word analysis and co-citation analysis use a network structure that clearly expresses the relationships between topics, but they lack a clear basis for determining the number of topics, easily produce redundant topics, and describe topics only broadly. The LDA model uses a three-layer document-topic-word structure that effectively expresses semantic relationships between words; its way of determining the number of topics is more objective, and the information available for interpreting topics is richer, but its ability to express relationships between topics is weak.

(5) At present, the LDA model can only be applied through programming, which makes it difficult for ordinary researchers to use, but it has clear advantages in data capacity, text processing capability, and running speed.

In conclusion, co-word analysis is suited to exploring local hot topics in a field with small-scale data, but it does not express topics along the time dimension. Co-citation analysis is suited to studying the knowledge structure and topic evolution of a field with medium- and large-scale data, but it is noticeably affected by citation time lag. The LDA model is suited to exploring the overall picture of a discipline's topics with large-scale data, but it expresses relationships between topics more weakly than the other two methods.
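
The difference in finding (2) between co-word and co-citation counting can be illustrated with a minimal sketch. It assumes each document is reduced to a list of keywords (for co-word analysis) or a list of cited references (for co-citation analysis). The function name `pair_counts` and all sample data are hypothetical and are not taken from the thesis; the thesis's actual tooling is not stated in the abstract.

```python
from collections import Counter
from itertools import combinations


def pair_counts(records):
    """Count how often each unordered pair of items appears in the same record."""
    counts = Counter()
    for items in records:
        # Deduplicate and sort so that (a, b) and (b, a) share one key.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: one keyword list and one reference list per document.
keyword_lists = [
    ["topic model", "LDA", "bibliometrics"],
    ["LDA", "topic model"],
    ["co-citation", "bibliometrics"],
]
reference_lists = [
    ["Ref A", "Ref B", "Ref C"],
    ["Ref A", "Ref B"],
    ["Ref B", "Ref C"],
]

# Co-word analysis: pairs of keywords that occur in the same document.
co_word = pair_counts(keyword_lists)
# Co-citation analysis: pairs of references cited by the same document.
co_citation = pair_counts(reference_lists)

print(co_word.most_common(3))
print(co_citation.most_common(3))
```

In both cases the resulting pair frequencies form a weighted network that can then be clustered into topics. Pairs that occur rarely receive low weights, which is one way to see why these frequency-based methods emphasize core literature.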
Keywords/Search Tags:LIS, co-word analysis, co-citation analysis, LDA model, topic discovery