
Research On Biomedical Hot Topic Analysis And Burst Detection Model

Posted on: 2018-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: H B Zhang
GTID: 2370330518983065
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of biomedicine, the literature in the biomedical field is growing in a disorderly way and is of uneven quality. How to identify hot topics promptly and accurately, and how to study the evolution and development trends of topics according to their burstiness, are therefore urgent problems in the field. This thesis studies hot topic analysis and burst detection in the biomedical field. Biomedical literature has a complex data structure and often integrates different types of information, such as genome, proteome, and clinical research data, which places higher demands on data preprocessing and analysis. This thesis therefore maps the subject content of the literature onto MeSH terms and proposes a data preprocessing procedure based on a Knowledge Organization System.

Traditional hot topic analysis methods identify hot topics only at the term-frequency level and ignore the life-cycle evolution of topics. This thesis therefore proposes a hot topic analysis model based on life-cycle theory. In view of the limitations of the TF-IDF algorithm, it uses an improved TF-PDF value to measure the frequency of hot concepts. The method then uses a life-cycle algorithm to model the creation, growth, maturity, and extinction of topics. The life-cycle algorithm consists of four steps: obtaining the energy value of each concept, converting energy into a life-support value, applying energy attenuation, and obtaining the life-support variant of each concept. The TF-PDF value and the life-support variant jointly determine the hot value of a concept. Finally, the similarity between hot concepts is calculated, and k-means is used to cluster hot concepts into hot topics.

To handle the diversity of bursts in biomedical data, this thesis proposes a K-state burst detection model. The model assumes that, in each time slice, the frequency of topic words in the data stream is generated randomly by a K-state automaton. Building the model consists of constructing concept-document vectors, generating the K-state automaton, defining the state transition cost, analyzing burst features, and obtaining burst topics.
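
The K-state burst detection idea can be illustrated with a minimal Python sketch. It is not the thesis's implementation, whose details the abstract does not give. It assumes a Kleinberg-style automaton in which state i emits documents containing a concept at rate base * scale**i, with a binomial emission cost and a cost of gamma * ln(n) for each level climbed. The function name `k_state_bursts` and the parameters `scale` and `gamma` are hypothetical.

```python
import math


def k_state_bursts(relevant, totals, k=3, scale=2.0, gamma=1.0):
    """Return the minimum-cost state sequence (0 = baseline) for one concept.

    relevant[t]: documents in time slice t that contain the concept.
    totals[t]:   all documents in time slice t.
    The rate model and both cost functions are assumptions of this sketch.
    """
    n = len(relevant)
    if sum(relevant) == 0:
        return [0] * n  # the concept never occurs, so nothing can burst

    base = sum(relevant) / sum(totals)
    # State i emits relevant documents at rate base * scale**i, capped below 1.
    rates = [min(base * scale ** i, 0.9999) for i in range(k)]

    def emit_cost(i, r, d):
        # Negative log-likelihood of r relevant documents out of d (binomial).
        p = rates[i]
        return -(math.log(math.comb(d, r))
                 + r * math.log(p)
                 + (d - r) * math.log(1 - p))

    def move_cost(i, j):
        # Climbing to a higher state is penalized; dropping down is free.
        return (j - i) * gamma * math.log(n) if j > i else 0.0

    # Viterbi-style dynamic programming; the automaton starts in state 0.
    cost = [0.0] + [math.inf] * (k - 1)
    back = []
    for r, d in zip(relevant, totals):
        new_cost, pointers = [], []
        for j in range(k):
            prev = min(range(k), key=lambda i: cost[i] + move_cost(i, j))
            new_cost.append(cost[prev] + move_cost(prev, j) + emit_cost(j, r, d))
            pointers.append(prev)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path backwards from the best final state.
    state = min(range(k), key=lambda j: cost[j])
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy counts: the concept spikes in the fourth and fifth time slices.
    print(k_state_bursts([2, 2, 2, 20, 22, 2, 2], [100] * 7))
```

In this toy input the two high-count slices are assigned a state above 0 and the remaining slices stay at the baseline. A larger `gamma` makes the automaton more reluctant to enter a burst state, and a larger `k` allows bursts of different intensity to be told apart.
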
Keywords/Search Tags:Knowledge Organization System, hot topic analysis, burst detection