Organizing scientific data sets: Studying similarities and differences in metadata and subject term creation

Posted on: 2013-03-01
Degree: Ph.D.
Type: Dissertation
University: The University of North Carolina at Chapel Hill
Candidate: White, Hollie C
Full Text: PDF
GTID: 1458390008482089
Subject: Information Science
Abstract/Summary:
BACKGROUND: According to Salo (2010), the metadata entered into repositories are "disorganized" and the metadata schemes underlying repositories are "arcane". This creates a challenging repository environment with regard to personal information management (PIM) and knowledge organization systems (KOSs). This dissertation research is a step towards addressing the need to study the information organization of scientific data in more detail.

METHODS: A concurrent triangulation mixed methods approach was used to study the descriptive metadata and subject term application of information professionals and scientists when working with two data sets (the bird data set and the hunting data set). Quantitative and qualitative methods were used in combination during study design, data collection, and analysis.

RESULTS: A total of 27 participants (11 information professionals and 16 scientists) took part in this study. Descriptive metadata results indicate that information professionals were more likely to use standardized metadata schemes. Scientists did not use library-based standards to organize data in their own collections. Nearly all scientists mentioned how central software was to their overall data organization processes. Subject term application results suggest that the Integrated Taxonomic Information System (ITIS) was the best vocabulary for describing scientific names, while Library of Congress Subject Headings (LCSH) was best for describing topical terms. The two groups applied 45 topical terms to the bird data set and 49 topical terms to the hunting data set. Term overlap, meaning the same terms were applied by both groups, was close to 25% for each data set (27% for the bird data set and 24% for the hunting data set). Unique terms, those applied by only one of the two groups, were more widely dispersed.

CONCLUSIONS: While there were similarities between the two groups, the differences were the most apparent. Based on this research, it is recommended that general repositories use metadata created by information professionals, while domain-specific repositories use metadata created by scientists.
Keywords/Search Tags: Data, Subject term, Information professionals, Repositories, Scientists, Scientific