
Research On Extracting Causal Relationships From Biomedical Literature

Posted on: 2021-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Xiao
Full Text: PDF
GTID: 2370330614960359
Subject: Computer software and theory
Abstract/Summary:
Causal graphs play an essential role in determining causality and have been applied in many domains, including biology and medicine. Traditional methods for constructing causal graphs are usually data-driven and may not deliver the desired accuracy. Given the vast number of publications that contain causal knowledge, it becomes possible to extract causal relations from the literature to help build causal graphs. To improve the quality of causality extraction, this thesis proposes two algorithms that approach the problem from two different directions.

1) A rule-based method combined with an unsupervised learning model. This method has three modules: data preprocessing, syntactic pattern matching, and causality determination. In data preprocessing, abstracts are crawled by attribute name, and sentences are then extracted and simplified. In the syntactic pattern matching module, the algorithm parses each sentence to obtain part-of-speech tags, derives triples from these tags, and performs syntactic pattern matching. In causality determination, four verb seed sets are initialized, and an unsupervised machine learning model is used to construct word vectors for the verbs in the seed sets and in the triples. By comparing the similarity between the verb in each triple and the verbs in each seed set, the algorithm overcomes the limitations of causality determination and decides whether the triple expresses a causal relation. Experimental results show that the F-score of this algorithm is 8.29% and 5.37% higher than that of Alashri's and Bui's algorithms, respectively.

2) A method based on a recurrent neural network. This method applies a new data processing strategy that improves the causal relation extraction performance of a recurrent neural network on a small corpus. It has two modules: data processing and causality extraction. Data processing consists of four steps: data cleaning (the same as in the preceding chapter), extraction of key parts, addition of part-of-speech tags, and replacement of similar words. Experimental results show that this algorithm outperforms a long short-term memory network that uses traditional data preprocessing.

The two relation extraction algorithms proposed in this thesis can accurately extract causal relationships hidden in text and can be used to help construct a causal graph. A causal graph obtained from causal knowledge can serve as the foundation, and data-based methods can then be used to validate and supplement its causal relations.
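
The causality-determination step of the first method (comparing the verb of each triple with verb seed sets through word-vector similarity) can be illustrated with a minimal sketch. This is not the thesis's implementation. The seed-set names, the toy three-dimensional vectors, the `classify_triple` helper, and the 0.8 threshold are all hypothetical; the abstract does not say which embedding model, which four seed sets, or which decision rule were actually used. Only two seed sets are shown here for brevity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings; real vectors would come from an
# unsupervised word-embedding model trained on the corpus.
VECTORS = {
    "cause":    [0.90, 0.10, 0.00],
    "induce":   [0.80, 0.20, 0.10],
    "inhibit":  [0.10, 0.90, 0.00],
    "suppress": [0.20, 0.80, 0.10],
    "trigger":  [0.85, 0.15, 0.05],
    "contain":  [0.00, 0.10, 0.90],
}

# Hypothetical seed sets (the thesis initializes four; two are assumed here).
SEED_SETS = {
    "positive_cause": ["cause", "induce"],
    "negative_cause": ["inhibit", "suppress"],
}

def classify_triple(triple, threshold=0.8):
    """Return the best-matching seed-set label for a (subject, verb, object)
    triple, or None if the verb is unknown or no seed verb is similar enough."""
    _subject, verb, _obj = triple
    vec = VECTORS.get(verb)
    if vec is None:
        return None
    best_label, best_score = None, 0.0
    for label, seeds in SEED_SETS.items():
        # Score a seed set by its most similar seed verb.
        score = max(cosine(vec, VECTORS[s]) for s in seeds)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None

if __name__ == "__main__":
    for t in [("smoking", "trigger", "inflammation"),
              ("drug", "suppress", "growth"),
              ("cell", "contain", "protein")]:
        print(t, "->", classify_triple(t))
```

Scoring each seed set by its single most similar verb is one simple design choice; averaging over the set or comparing against a set centroid would be equally plausible alternatives.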
Keywords/Search Tags: causality, literature analysis, word vector, recurrent neural network, relationship extraction