Personal identity matching

Posted on: 2017-08-20
Degree: Ph.D.
Type: Dissertation
University: Florida Institute of Technology
Candidate: Al-Shuaili, Mazin Hamed
Full Text: PDF
GTID: 1478390014496189
Subject: Computer Science
Abstract/Summary:
A person's name is still the most commonly used attribute for identifying an individual, especially in border-control measures, criminal investigations, and intelligence analyses. Personal identity matching through an individual's name, however, is not a trivial task. Name matching raises numerous problems, especially when it takes place across languages, a common occurrence in border control and security investigations. For example, people's names are generally composed of out-of-vocabulary words, which are known to pose significant challenges for cross-language information retrieval.

In this work, we propose, implement, and demonstrate two novel algorithms for cross-language personal name matching. The first algorithm uses sound techniques to create a multidimensional vector representation of names, from which it computes a degree of similarity. It compares names written in the same language or in different languages (i.e., cross-language). The second algorithm builds on the first and measures the similarity between full names, taking the full-name structure into account. The proposed algorithm solves multiple issues associated with personal name identification, including a) the transliteration of names, since a name can be transliterated with a variety of spellings; b) cross-language issues, as personal names are out of vocabulary (OOV); and c) the structure of the full name, as the order of single names plays an important role in identifying a person. We evaluate the algorithms on transliterated names and across languages, using Arabic and English as examples. Both algorithms achieve significant results compared to existing algorithms.

In addition to the two algorithms, we propose and demonstrate a novel technique that automatically maps characters from different languages into English without human intervention and without prior knowledge of the language. This technique provides a statistical and phonetic model that the first algorithm uses to compare names in different language scripts (cross-language). The method also generates Soundex codes for the source language based on English Soundex codes. We implement this technique for five languages: Arabic, Russian, Urdu, Hindi, and Persian. Five Soundex tables are provided as a result.
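
For readers unfamiliar with the English Soundex codes mentioned above, the following is a minimal sketch of the standard American Soundex scheme. It is not the dissertation's character-mapping technique or its generated tables, which the abstract does not describe; the function name `soundex` and the table name `SOUNDEX_CODES` are chosen here for illustration only.

```python
# Standard American Soundex (illustrative sketch, not the dissertation's method).
# Consonant groups that share a digit are treated as sounding alike.
SOUNDEX_CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for ch in letters:
        SOUNDEX_CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code of a Latin-script name."""
    # Keep only the letters A-Z; everything else is ignored.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    code = letters[0]
    # The first letter's own digit takes part in duplicate suppression.
    prev = SOUNDEX_CODES.get(letters[0], "")

    for ch in letters[1:]:
        if ch in "HW":
            # H and W are skipped and do not separate equal codes.
            continue
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        # Vowels carry no digit, so they reset prev and separate equal codes.
        prev = digit
        if len(code) == 4:
            break

    # Pad short codes with zeros to a fixed length of four.
    return code.ljust(4, "0")


if __name__ == "__main__":
    for n in ("Mohammed", "Muhammad", "Robert", "Rupert"):
        print(n, soundex(n))
```

In this sketch, two common English transliterations of the same Arabic name, "Mohammed" and "Muhammad", receive the same code, which illustrates why sound-based codes are a natural starting point for matching names with variant spellings.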
Keywords/Search Tags: Personal, Matching, Name, Language