An organizational memory system for capturing information from unstructured text: The Infoscan and Infoview Systems | Posted on:2000-01-28 | Degree:Ph.D | Type:Dissertation | University:Texas A&M University | Candidate:Petersen, Lawrence C | Full Text:PDF | GTID:1468390014464568 | Subject:Information Science

Abstract/Summary:

Organizational memory refers to stored information from an organization's history that can be brought to bear on present decisions (Walsh and Ungson, 1991). Information of value to an organizational memory can often be found in unstructured formats such as personal notes, memos, and messages. Information contained in informal documents can be difficult and expensive to capture. A set of knowledge management tools is needed to facilitate the acquisition of useful information from informal documents.

This research addresses the problem of how useful information can be acquired from unstructured text and stored in an accessible form in an efficient and economical manner.

In this research a logical architecture for an organizational memory system (OMS) is proposed. A prototype system is developed to demonstrate the feasibility of the architecture. The prototype system, consisting of two programs called InfoScan and InfoView, is tested on a corpus of 10,000 e-mail messages. In the test the system achieved 87% recall, 87% precision and an overall performance of 87%.

InfoScan analyses a training set of documents to develop a template consisting of key words and phrases. The template is used to locate documents related to that subject. The second program, InfoView, is a database designed to give a user an effective means of viewing the messages selected by InfoScan.

The keyword selection process is based on the concept of cue validity developed in cognitive psychology and used by Goldberg (1996) in text categorization.
Cue validity provides “…a measure of the degree to which a particular feature distinguishes instances of a concept from instances of contrasting concepts” (Goldberg, 1996).

InfoScan parses words from a training set of documents, cleans and spell checks them, tags them by speech type, and calculates a cue value for each word. Words of certain speech types and words with low cue values are eliminated, producing a list of potential key words. A human operator selects those words from the list which are most closely related to the target subject. The selected key words and phrases are then used to evaluate the rest of the documents in the corpus.

Keywords/Search Tags: Organizational memory, Information, System, Infoscan, Key words, Documents, Text, Infoview
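The cue value calculation described in the abstract can be illustrated with a minimal sketch. The abstract does not give InfoScan's formula, so the code below assumes one common formulation of cue validity: the estimated probability that a document belongs to the target subject given that it contains the word, computed from document frequencies. The function names (`tokenize`, `cue_values`, `candidate_keywords`) and the 0.8 threshold are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cue_values(target_docs, contrast_docs):
    """Estimate P(target subject | word) from document frequencies.

    Assumed formulation, not necessarily the one InfoScan uses:
    (target documents containing the word) divided by
    (all documents containing the word).
    """
    target_df = Counter()
    contrast_df = Counter()
    for doc in target_docs:
        target_df.update(set(tokenize(doc)))    # count each word once per document
    for doc in contrast_docs:
        contrast_df.update(set(tokenize(doc)))
    return {
        word: count / (count + contrast_df[word])
        for word, count in target_df.items()
    }


def candidate_keywords(values, threshold=0.8):
    """Drop words with low cue values; return the rest, highest value first."""
    kept = [(word, value) for word, value in values.items() if value >= threshold]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    target = ["Budget meeting moved to Friday", "Please review the budget draft"]
    contrast = ["Lunch on Friday?", "Please call me"]
    for word, value in candidate_keywords(cue_values(target, contrast)):
        print(f"{word}: {value:.2f}")
```

On a sample this small, function words such as "the" and "to" also receive high cue values, which suggests why the process described above also filters by speech type and leaves the final choice of key words to a human operator.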