
Framework and Algorithms for Extraction of Knowledge: Accelerated Radical Innovation and Spatial Interestingness Hotspots

Posted on: 2012-04-05 | Degree: Ph.D. | Type: Dissertation
University: University of Houston | Candidate: Miller, Ruth Huang | Full Text: PDF
GTID: 1468390011464083 | Subject: Computer Science
Abstract/Summary:
Data are the basic building block of computing. Extracting knowledge from the abundance of data requires substantial processing. Annotation, mining, and visualization are three transformational processes that convert these data into knowledge. Unstructured, semi-structured, and geo-spatial data have experienced unprecedented growth in volume and on-line availability with the explosion of the Internet. This growth makes it increasingly likely that the precise knowledge the user needs or wants is available somewhere, but it also makes retrieval, usage, and understanding of these data much more challenging. This dissertation examines three strategies for transforming data into knowledge. The first strategy is to collect and aggregate data from different sources into domain-specific data warehouse repositories that enable rapid knowledge retrieval and use. This strategy is used when the specific purpose has not been established in advance or the retrieval of this knowledge is time critical. The second strategy is to annotate the retrieved data with XML according to predetermined domain-specific ontologies to facilitate querying this knowledge. This strategy is best used for unstructured or semi-structured domain-specific documents. The third strategy centers on extracting knowledge from spatially annotated data. In this case, spatial context, particularly location, serves as the glue that ties together information originating from different knowledge sources.
The main contributions of this dissertation are: 1) development of a framework for finding geo-spatial hotspots, 2) development of a geo-feature pre-selection algorithm to automatically search for promising candidates, 3) development of ZIPS, an interestingness hotspot detection algorithm based on polygons, 4) experimental evaluation of the proposed algorithms in case studies involving Internet advertising, housing vacancies, and unemployment, 5) creation of a framework for agent-based, domain-specific data collection supporting the ARI Competitive Intelligence Methodology, and 6) creation of a framework for XML annotation of textual documents based on ontologies for subsequent querying.
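
As a rough illustration of the second strategy (XML annotation of text according to an ontology, followed by querying), the sketch below tags ontology terms in a sentence and then retrieves them by concept. It is not the dissertation's framework: the toy `ONTOLOGY` dictionary, the `annotate` function, and the sample sentence are all hypothetical, and simple term matching stands in for whatever annotation method the author actually used.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical toy ontology: concept tag -> surface terms that signal it.
ONTOLOGY = {
    "technology": ["fuel cell", "solar panel"],
    "organization": ["University of Houston"],
}

def annotate(text: str) -> ET.Element:
    """Wrap each ontology term found in `text` in an element named after its concept."""
    # Collect (start, end, concept) spans for all case-insensitive term matches.
    spans = []
    for concept, terms in ONTOLOGY.items():
        for term in terms:
            for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
                spans.append((m.start(), m.end(), concept))
    spans.sort()

    doc = ET.Element("document")
    doc.text = ""
    last = None  # most recently created annotation element
    pos = 0      # end of the text consumed so far
    for start, end, concept in spans:
        if start < pos:
            continue  # skip matches that overlap an earlier annotation
        gap = text[pos:start]
        if last is None:
            doc.text += gap
        else:
            last.tail = gap
        last = ET.SubElement(doc, concept)
        last.text = text[start:end]
        pos = end
    # Attach whatever plain text remains after the final annotation.
    if last is None:
        doc.text += text[pos:]
    else:
        last.tail = text[pos:]
    return doc

sample = "The University of Houston studied fuel cell adoption."
doc = annotate(sample)
print(ET.tostring(doc, encoding="unicode"))

# Querying the annotated document by concept.
print([el.text for el in doc.findall("technology")])
```

Once the text carries concept tags, a query reduces to an element lookup such as `doc.findall("technology")`, which is the practical benefit the abstract attributes to ontology-based annotation.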
Keywords/Search Tags:Framework, Data, Domain specific