Cancer Cure Decision Support System Based on LDA Topic Modelling

Posted on: 2017-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: Samuel Kipchumba Lebo
Full Text: PDF
GTID: 2348330566456135
Subject: MSc. Software Engineering
Abstract/Summary:
Cancer patients worldwide increasingly use the internet to share their medical experiences and survivorship stories, which creates a need to extract knowledge from these stories using text mining techniques. Patient self-authored blogs provide firsthand experience that can inspire and motivate newly diagnosed patients to seek treatment. Every story is profiled with its treatments and cancer type. This work presents a gazetteer-based named entity recognizer (NER) and topic modeling using Latent Dirichlet Allocation (LDA) with a controlled vocabulary. The gazetteer uses the UMLS Metathesaurus, a local dictionary, part-of-speech (POS) tags, and local filters to recognize entities in text. These entities are used to annotate documents in the corpus and to train a Keyphrase Extraction Algorithm (KEA) model, which then generates keywords for all documents. Topics are modeled using LDA over a Bag of Keywords (BOK); the topics generated by the vocabulary-controlled LDA show clear topic separation. Using the story the patient is currently reading as the query, a similarity algorithm computes the similarity between stories, and the stories are ranked by similarity index to identify the most similar ones. The NER is tested on the BioText corpus and achieves a precision of 88.2%, a recall of 92.1%, and an F-measure of 90.1%. System performance on similar-story extraction was evaluated through a survey of real users of the system: 94% were voted similar, 3.2% not similar, and 2.8% not sure...
Keywords/Search Tags:Cancer survivors, Stories, LDA, KEA, NER
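
The abstract does not name the similarity algorithm or show how the F-measure is derived, so the following is only a minimal sketch under stated assumptions. It assumes each story is represented by an LDA topic mixture and that stories are compared with cosine similarity; the story identifiers, the topic values, and the function names are all hypothetical. The `f_measure` helper is the standard harmonic mean of precision and recall, which is consistent with the reported 88.2% / 92.1% / 90.1% figures.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic mixtures."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def rank_similar_stories(query_id, topic_mix):
    """Rank all other stories by similarity to the story being read.

    Assumption: cosine similarity stands in for the unspecified
    similarity algorithm mentioned in the abstract.
    """
    query = topic_mix[query_id]
    scores = [
        (story_id, cosine_similarity(query, mix))
        for story_id, mix in topic_mix.items()
        if story_id != query_id
    ]
    # Highest similarity index first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-story topic mixtures (not data from the thesis).
topic_mix = {
    "story_a": [0.80, 0.15, 0.05],
    "story_b": [0.75, 0.20, 0.05],
    "story_c": [0.05, 0.10, 0.85],
}

ranking = rank_similar_stories("story_a", topic_mix)
for story_id, score in ranking:
    print(story_id, round(score, 3))

# Check the reported NER scores against the F-measure formula.
print(round(f_measure(0.882, 0.921) * 100, 1))
```

With these made-up mixtures, `story_b` ranks above `story_c` for a reader of `story_a`, because its topic proportions are closer to the query's. In a real system the mixtures would come from the trained LDA model over the Bag of Keywords.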