
An Arabic lexicon to support information retrieval, parsing, and text generation

Posted on: 1997-12-12
Degree: Ph.D.
Type: Dissertation
University: Illinois Institute of Technology
Candidate: Alsamara, Khalid Said
Full Text: PDF
GTID: 1468390014480520
Subject: Computer Science
Abstract/Summary:
We developed an Arabic lexical database to support information retrieval, text generation, and parsing. It contains information about 12,500 words in the computer sublanguage. The database has a main table containing all words and separate tables for nouns, adjectives, verbs, and particles.

The main table contains basic information for each Arabic word in a corpus of 242 abstracts: part of speech (noun, verb, particle, adjective), gender (masculine or feminine), number (singular, dual, plural), and person (1st, 2nd, 3rd).

The lexical entry for a noun contains gender (masculine or feminine), person (1st, 2nd, 3rd), and number (singular, dual, plural). It also places the noun in a number of syntactic/semantic categories: inert or derived, concrete or abstract, structured or declined, denuded or augmented, animate or inanimate, and human or nonhuman.

The lexical entry for a verb tells whether it is complete or deficient; transitive or intransitive; denuded or augmented; sound, defective, or mixed; and perfect, imperfect, or imperative.

The lexical entry for a particle tells whether it acts on nouns, on verbs, or on both. Particles that are active on nouns are classified as letters of reduction or annulment or vocatives or letters of exclusion or conjunctions. Particles that are active on verbs are specified as different kinds of letters of elusion or opening. Particles that are active on both nouns and verbs are particles of attraction.

The lexical entry for an adjective tells whether it is animate or inanimate and, for animate adjectives, whether it is human or nonhuman.

We implemented the lexical database system on Arabic Windows using the Microsoft Access DBMS and a graphical user interface (GUI). It runs on IBM PCs and compatibles. It is designed to be used by both programs and human end users, with the goal of supporting natural language processing systems, ongoing research at the Arabic Language Processing Laboratory at Illinois Institute of Technology, and future research in the Arabic language.
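
The table layout described in the abstract (one main table for all words plus one table per part of speech) can be illustrated with a minimal sketch. This is not the dissertation's actual schema: every table name, column name, and helper function below is hypothetical, SQLite stands in for Microsoft Access, and the sample entry is illustrative rather than taken from the lexicon.

```python
import sqlite3

# Hypothetical schema: all names are illustrative assumptions, and SQLite
# is used here in place of the Microsoft Access DBMS named in the abstract.
SCHEMA = """
CREATE TABLE word (
    id                 INTEGER PRIMARY KEY,
    form               TEXT NOT NULL,
    part_of_speech     TEXT NOT NULL
        CHECK (part_of_speech IN ('noun', 'verb', 'particle', 'adjective')),
    gender             TEXT CHECK (gender IN ('masculine', 'feminine')),
    grammatical_number TEXT
        CHECK (grammatical_number IN ('singular', 'dual', 'plural')),
    person             INTEGER CHECK (person IN (1, 2, 3))
);
CREATE TABLE noun (
    word_id      INTEGER PRIMARY KEY REFERENCES word(id),
    is_derived   INTEGER,  -- 0 = inert, 1 = derived
    is_abstract  INTEGER,  -- 0 = concrete, 1 = abstract
    is_declined  INTEGER,  -- 0 = structured, 1 = declined
    is_augmented INTEGER,  -- 0 = denuded, 1 = augmented
    is_animate   INTEGER,
    is_human     INTEGER
);
CREATE TABLE verb (
    word_id       INTEGER PRIMARY KEY REFERENCES word(id),
    is_complete   INTEGER,  -- 0 = deficient, 1 = complete
    is_transitive INTEGER,
    is_augmented  INTEGER,  -- 0 = denuded, 1 = augmented
    soundness     TEXT CHECK (soundness IN ('sound', 'defective', 'mixed')),
    tense         TEXT CHECK (tense IN ('perfect', 'imperfect', 'imperative'))
);
CREATE TABLE particle (
    word_id INTEGER PRIMARY KEY REFERENCES word(id),
    acts_on TEXT CHECK (acts_on IN ('nouns', 'verbs', 'both')),
    kind    TEXT  -- e.g. 'vocative', 'conjunction'
);
CREATE TABLE adjective (
    word_id    INTEGER PRIMARY KEY REFERENCES word(id),
    is_animate INTEGER,
    is_human   INTEGER  -- meaningful only when is_animate = 1
);
"""


def build_lexicon() -> sqlite3.Connection:
    """Create an empty in-memory lexicon with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_noun(conn, form, gender, number, person, **features) -> int:
    """Insert a noun into the main table and the noun table; return its id."""
    cur = conn.execute(
        "INSERT INTO word (form, part_of_speech, gender,"
        " grammatical_number, person) VALUES (?, 'noun', ?, ?, ?)",
        (form, gender, number, person),
    )
    word_id = cur.lastrowid
    # Features that are not supplied are stored as NULL (unknown).
    conn.execute(
        "INSERT INTO noun (word_id, is_abstract, is_animate, is_human)"
        " VALUES (?, ?, ?, ?)",
        (word_id, features.get("is_abstract"),
         features.get("is_animate"), features.get("is_human")),
    )
    return word_id


def lookup_noun(conn, form):
    """Join the main table with the noun table for one word form."""
    return conn.execute(
        "SELECT w.form, w.gender, w.grammatical_number, w.person,"
        " n.is_abstract, n.is_animate, n.is_human"
        " FROM word w JOIN noun n ON n.word_id = w.id WHERE w.form = ?",
        (form,),
    ).fetchone()
```

A caller, whether a program or an interactive tool, might call `build_lexicon()`, add an entry with `add_noun(conn, "حاسوب", "masculine", "singular", 3, is_abstract=0, is_animate=0, is_human=0)`, and retrieve it with `lookup_noun(conn, "حاسوب")`. Keeping the shared attributes in one table and the category-specific attributes in per-category tables mirrors the split the abstract describes; the `CHECK` constraints are an added assumption that restricts each field to the values the abstract lists.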
Keywords/Search Tags: Arabic, Information, Particles that are active