A study of data informatics: Data analysis and knowledge discovery via a novel data mining algorithm

Posted on: 2015-10-20
Degree: Ph.D
Type: Dissertation
University: The University of Mississippi
Candidate: Balan, Shilpa
Full Text: PDF
GTID: 1478390020951130
Subject: Business Administration
Abstract/Summary:
Frequent Pattern Mining (FPM) has become extremely popular among data mining researchers because it extracts interesting and valuable patterns from large datasets. The decreasing cost of storage devices and the increasing availability of processing power make it possible for researchers to build and analyze very large datasets in various scientific and business domains. A filtering process is needed, however, to generate only the patterns that are relevant. This dissertation contributes to addressing this need. An experimental system named FPMIES (Frequent Pattern Mining Information Extraction System) was built to extract information from electronic documents automatically. Collocation analysis was used to analyze the relationships between words. Template mining was used to build the experimental system that is the foundation of FPMIES. Given the rising need for improved environmental performance, a dataset based on the green supply chain practices of three companies was used to test FPMIES. The new system was also tested by users, resulting in a recall of 83.4%. The new algorithm's combination of semantic relationships with template mining significantly improves the recall of FPMIES. The study's results also show that FPMIES is much more efficient than manual information extraction. Finally, the performance of FPMIES was compared with that of the most popular FPM algorithm, Apriori; FPMIES achieved significantly better recall and precision (76.7% and 74.6%, respectively) than Apriori (30% recall and 24.6% precision).
Keywords/Search Tags: Mining, FPMIES, Data, Recall
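
The recall and precision figures quoted in the abstract follow the standard information-extraction definitions: precision is the share of extracted items that are relevant, and recall is the share of relevant items that were extracted. The sketch below only illustrates those definitions; it is not the evaluation code used in the dissertation, and the function name `precision_recall` and the sample term sets are hypothetical.

```python
def precision_recall(extracted, relevant):
    """Return (precision, recall) for extracted items against a gold-standard set."""
    extracted, relevant = set(extracted), set(relevant)
    true_positives = len(extracted & relevant)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example data, not taken from the dissertation.
gold = {"recycling", "emissions", "packaging", "energy"}
found = {"recycling", "emissions", "logistics"}

precision, recall = precision_recall(found, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this made-up case, two of the three extracted terms are relevant and two of the four relevant terms are found, so precision and recall differ. That difference is why the abstract reports both numbers for FPMIES and for Apriori.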