Scientific research impact and data mining applications in hydrogeology

Posted on: 2005-06-20
Degree: Ph.D
Type: Dissertation
University: The Ohio State University
Candidate: Fang, Yao-chuen (Y. C.)
Full Text: PDF
GTID: 1458390008480513
Subject: Environmental Sciences
Abstract/Summary:
This dissertation uses citation data to evaluate the impact of research in hydrogeology. Beyond exploring research impact, it applies data mining techniques, one of the most useful information technologies, to textual data and to a practical hydrogeological problem.

Following the Schwartz, Fang and Ibaraki (2002) paper in Ground Water, I examined citation data from ISI to check the stability of the bibliometric data and to validate the use of this information. I looked at the citation growth patterns of highly cited papers from the 1980s and used those patterns to predict citation growth for the highly cited papers of the following decade. This exercise gave me confidence in the use of citation data and provides an overview of the evolution of science in hydrogeology.

Besides the research topic, the "innovation" of the research is another important key to creating impact. Water Resources Research papers from 1991 were selected and compared with earlier papers and with follow-on papers. The most highly cited papers of 1991 appear to be unique in that relatively few papers like them had been published previously. Moreover, these papers were sufficiently influential to produce a relatively large number of similar follow-on papers.

However, the citation pattern of some classic papers shows that the activity and impact of follow-on papers gradually decline with time. The results of this study reinforce the importance of being a pioneer in a research strand, of strategically shifting research strands, and of adopting strategies that can facilitate truly major research shifts.

Applications of data mining techniques to two types of data show the advantage of information technology. The first type of data is textual, on which I evaluated two general strategies and several variants of them. The first strategy is based on Naive Bayes, a popular text classification algorithm. The second strategy is based on Principal Direction Divisive Partitioning, an unsupervised document clustering algorithm. While both approaches perform quite well, some of the new variants that I examined, including one that combines the two approaches, yield even better results.

The other type of data is digital photo images. Statistical (texture) information from the grayscale digital images, together with spatial information and measured hydraulic conductivities for some areas of the outcrop, are the important attributes in the database. Self-Organizing Map (SOM) clustering on these attributes was applied to cluster small images extracted from the outcrop, along with 122 sampling points, and successfully predicted the hydraulic conductivities for the whole section of the outcrop.
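
As an illustration of the Naive Bayes text classification strategy named in the abstract, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It is not the dissertation's implementation: the function names, the whitespace tokenizer, the class labels, and the four toy training documents are all hypothetical and chosen only to show the mechanics.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


def train_naive_bayes(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        class_doc_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_doc_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log posterior for the given text."""
    class_doc_counts, word_counts, vocabulary = model
    total_docs = sum(class_doc_counts.values())
    vocab_size = len(vocabulary)
    best_label, best_score = None, -math.inf
    for label, n_docs in class_doc_counts.items():
        # Log prior: fraction of training documents in this class.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing keeps unseen class/word pairs from zeroing the score.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + vocab_size)
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus; labels and texts are invented for this sketch.
training = [
    ("aquifer recharge and groundwater flow", "groundwater_flow"),
    ("hydraulic conductivity of a sandy aquifer", "groundwater_flow"),
    ("contaminant transport and solute dispersion", "contaminant_transport"),
    ("solute plume migration and sorption", "contaminant_transport"),
]
model = train_naive_bayes(training)
print(classify("groundwater flow in an aquifer", model))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, which matters for abstracts or full papers far longer than these toy strings.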
Keywords/Search Tags:Data, Impact, Papers