Classifying maritime near-miss and injury report using text mining

Posted on: 2013-09-28; Degree: M.E.S; Type: Thesis
University: Lamar University - Beaumont; Candidate: Paul, Kallul; Full Text: PDF
GTID: 2458390008965226; Subject: Information Technology
Abstract/Summary:
Automatic text classification has become very important due to the high volume of electronic information. Many projects generate volumes of information that cannot be classified effectively by a manual process. This research automatically classified a corpus of near-miss incidents comprising 17,846 text descriptions of events. The project employed five classifiers (Decision Tree, Neural Network, Naïve Bayes, Naïve Bayes Kernel, and k-Nearest Neighbor) to classify the corpus along three categories: Location, Detailed Event, and High Level Event. Four text preprocessing approaches were applied, based on stop-word removal, the number of n-grams, letter-case transformation, and a low-frequency cut-off. Classification accuracy ranged from 62% for the Detailed Event category to 95% for the High Level Event category.
Keywords/Search Tags:Text, Classification, Event
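
The pipeline outlined in the abstract (preprocessing followed by a classifier) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not name the software used, so the sketch is plain Python with a hand-written multinomial Naïve Bayes standing in for one of the five classifiers. The stop-word list, the four toy reports, the label values "Deck" and "Engine Room", and the names `tokenize`, `NaiveBayes`, and `min_doc_freq` are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; the list used in the thesis is not given.
STOP_WORDS = {"the", "a", "an", "of", "on", "in", "to", "and", "was"}


def tokenize(text, lowercase=True, remove_stop_words=True, max_n=2):
    """Turn one report into features: all 1-grams up to max_n-grams."""
    if lowercase:
        text = text.lower()  # letter-case transformation
    words = re.findall(r"[A-Za-z]+", text)
    if remove_stop_words:
        words = [w for w in words if w.lower() not in STOP_WORDS]
    feats = []
    for n in range(1, max_n + 1):
        feats += ["_".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self, min_doc_freq=1, **tok_opts):
        self.min_doc_freq = min_doc_freq
        self.tok_opts = tok_opts

    def fit(self, texts, labels):
        docs = [tokenize(t, **self.tok_opts) for t in texts]
        # Low-frequency cut-off: keep features found in enough documents.
        doc_freq = Counter(f for d in docs for f in set(d))
        self.vocab = {f for f, c in doc_freq.items() if c >= self.min_doc_freq}
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        for d, y in zip(docs, labels):
            self.feat_counts[y].update(f for f in d if f in self.vocab)
        return self

    def predict(self, text):
        feats = [f for f in tokenize(text, **self.tok_opts) if f in self.vocab]
        total_docs = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for y, n_docs in self.class_counts.items():
            total = sum(self.feat_counts[y].values())
            score = math.log(n_docs / total_docs)  # log prior
            for f in feats:
                count = self.feat_counts[y][f] + 1  # add-one smoothing
                score += math.log(count / (total + len(self.vocab)))
            if score > best_score:
                best, best_score = y, score
        return best


# Invented toy reports and Location labels, not data from the thesis corpus.
train_texts = [
    "Crew member slipped on the wet deck",
    "Slipped on oily deck plate near the winch",
    "Hot pipe in engine room burned hand",
    "Oil leak found in the engine room",
]
train_labels = ["Deck", "Deck", "Engine Room", "Engine Room"]

model = NaiveBayes(min_doc_freq=1, max_n=2).fit(train_texts, train_labels)
print(model.predict("Slipped on deck"))
```

Each preprocessing choice named in the abstract maps to one switch here (`lowercase`, `remove_stop_words`, `max_n`, `min_doc_freq`), so comparing preprocessing approaches amounts to refitting the same classifier under different settings and comparing accuracy on held-out reports.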