An organizational memory system for capturing information from unstructured text: The Infoscan and Infoview Systems

Posted on:2000-01-28

Degree:Ph.D

Type:Dissertation

University:Texas A&M University

Candidate:Petersen, Lawrence C

Full Text:PDF

GTID:1468390014464568

Subject:Information Science

Abstract/Summary:

Organizational memory refers to stored information from an organization's history that can be brought to bear on present decisions (Walsh and Ungson, 1991). Information of value to an organizational memory can often be found in unstructured formats such as personal notes, memos, and messages. Information contained in such informal documents can be difficult and expensive to capture. A set of knowledge management tools is needed to facilitate the acquisition of useful information from informal documents.

This research addresses the problem of how useful information can be acquired from unstructured text and stored in an accessible form in an efficient and economical manner.

In this research a logical architecture for an organizational memory system (OMS) is proposed. A prototype system is developed to demonstrate the feasibility of the architecture. The prototype system, consisting of two programs called InfoScan and InfoView, is tested on a corpus of 10,000 e-mail messages. In the test the system achieved 87% recall, 87% precision and an overall performance of 87%.

InfoScan analyses a training set of documents on a target subject to develop a template consisting of key words and phrases. The template is used to locate documents related to that subject. The second program, InfoView, is a database designed to give a user an effective means of viewing the messages selected by InfoScan.

The keyword selection process is based on the concept of cue validity developed in cognitive psychology and used by Goldberg (1996) in text categorization. Cue validity provides “…a measure of the degree to which a particular feature distinguishes instances of a concept from instances of contrasting concepts” (Goldberg, 1996).

InfoScan parses words from a training set of documents, cleans and spell-checks them, tags each by part of speech, and calculates a cue value for each word. Words of certain parts of speech and words with low cue values are eliminated, producing a list of potential key words. A human operator selects those words from the list which are most closely related to the target subject. The selected key words and phrases are then used to evaluate the rest of the documents in the corpus.
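The cue-value filtering step described above can be illustrated with a minimal sketch. It is not InfoScan's implementation: the abstract does not give the formula, so the sketch assumes cue validity is estimated as the share of a word's document occurrences that fall in the target training set rather than in a contrast set. The contrast set, the function names (`tokenize`, `cue_validities`, `candidate_keywords`), and the 0.8 threshold are all hypothetical, and the cleaning, spell-checking, and part-of-speech steps are omitted.

```python
import re
from collections import Counter


def tokenize(text):
    """Return the set of lowercase alphabetic words in a document."""
    return set(re.findall(r"[a-z]+", text.lower()))


def cue_validities(target_docs, contrast_docs):
    """Estimate a cue value for every word seen in the target documents.

    Assumed formula (not taken from the dissertation):
        cue(w) = df_target(w) / (df_target(w) + df_contrast(w))
    where df counts the documents containing w. Raw counts are used,
    so the two sets should be of comparable size.
    """
    target_df = Counter(w for doc in target_docs for w in tokenize(doc))
    contrast_df = Counter(w for doc in contrast_docs for w in tokenize(doc))
    # Counter returns 0 for words that never appear in the contrast set.
    return {w: n / (n + contrast_df[w]) for w, n in target_df.items()}


def candidate_keywords(target_docs, contrast_docs, threshold=0.8):
    """Drop words with low cue values; return the rest, sorted."""
    scores = cue_validities(target_docs, contrast_docs)
    return sorted(w for w, s in scores.items() if s >= threshold)


# Tiny hypothetical training data: messages about a budget meeting
# versus unrelated messages.
target = ["budget meeting moved to friday", "budget report for the meeting"]
contrast = ["lunch on friday", "the game was great"]

scores = cue_validities(target, contrast)
candidates = candidate_keywords(target, contrast)
```

In this toy data, "budget" appears only in the target messages and so receives the highest possible cue value, while "friday" appears equally often in both sets and is filtered out. The surviving list corresponds to the potential key words that a human operator would then review.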

Keywords/Search Tags:

Organizational memory, Information, System, Infoscan, Key words, Documents, Text, Infoview

Related items

1	The Design And Implementation Of Working Platform Of Payment And Settlement
2	Intranet portal: Organizational memory information system
3	Research On Coverless Text Information Hiding Based On Frequent Words In Text Sets
4	Word-sense disambiguation for large text databases
5	Supporting visual access to a distributed organizational memory warehouse in the web environment
6	Research And Implementation Of A System For Detecting And Identifying Information On Added Print Documents
7	The Research On A Lucene-based Full-text Retrieval Model
8	Research And Design Of A Special Field-based Text Information Obtaining System On Web
9	Design And Implementation Of Text Information Recommendation System Based On Short Text Processing Algorithm Optimization
10	Design And Implementation Of Web Document Extraction And Offline Collection System