Implementation of Self-organizing Maps with Python

Posted on: 2019-07-09
Degree: M.S
Type: Thesis
University: University of Rhode Island
Candidate: Yuan, Li
Full Text: PDF
GTID: 2478390017989058
Subject: Computer Science
Abstract/Summary:
Self-Organizing Maps (SOMs), a type of artificial neural network, have been studied extensively since the 1980s and have been implemented in C, Fortran, R [1], and Python [2]. Python is an efficient high-level language that has been widely used in machine learning for years, but most SOM-related packages written in Python only perform model construction and visualization. The POPSOM package, written in R, goes beyond model construction and visualization: for example, it evaluates the model's quality with statistical methods and plots the marginal probability distributions of the neurons and the data as 2-D graphs. To bring these advantages to Python users, it is worthwhile to migrate the POPSOM package to Python. This study describes that implementation in detail.

The package implementation involves three major tasks: 1) Migrate the POPSOM package from R to Python. 2) Refactor the source code from the procedural programming paradigm to the object-oriented programming paradigm. 3) Improve the package by adding normalization options to the model construction function. In addition to the model construction written in Python, embedded Fortran code is used to accelerate model construction significantly.

Once the program was complete, its correctness had to be verified. The best way to do so is to compare the output of the Python-based program with the output generated by the R-based program. For the model construction function, the SOM algorithm initializes the weight vectors of the neurons randomly at the very beginning and then selects training vectors randomly during training, so we cannot expect the same input (data set) to produce exactly the same output (neurons). Instead, to convince ourselves that the program works properly, we modified the program slightly to remove the random factors from the algorithm. In that case, the same data set generates the same neurons. Beyond model construction, model visualization and the other functions that take neurons as their input should then return the same results. The details of this verification are presented in the following chapters.
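
The verification idea described above can be illustrated with a minimal sketch. It is not the POPSOM code: the function name `train_som`, its parameters, the Gaussian neighborhood, the linear radius schedule, and the fixed initialization pattern are all assumptions chosen for illustration. The sketch shows only the two sources of randomness named in the abstract (weight initialization and training-vector selection) and how switching them off makes the output reproducible.

```python
import math
import random


def train_som(data, xdim, ydim, train=100, alpha=0.3,
              deterministic=False, seed=None):
    """Train a tiny SOM and return its neurons as a list of weight vectors.

    Hypothetical helper for illustration; names and defaults are assumptions,
    not the POPSOM API.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    n_neurons = xdim * ydim

    # Random factor 1: weight initialization.
    # In deterministic mode a fixed (assumed) pattern replaces random weights.
    if deterministic:
        neurons = [[(i + j) / (n_neurons + dim) for j in range(dim)]
                   for i in range(n_neurons)]
    else:
        neurons = [[rng.random() for _ in range(dim)]
                   for _ in range(n_neurons)]

    # Grid position of each neuron, used by the neighborhood function.
    coords = [(i % xdim, i // xdim) for i in range(n_neurons)]
    radius0 = max(xdim, ydim) / 2.0

    for step in range(train):
        # Random factor 2: training-vector selection.
        # In deterministic mode the data set is traversed cyclically.
        if deterministic:
            x = data[step % len(data)]
        else:
            x = data[rng.randrange(len(data))]

        # Best matching unit: the neuron closest to the training vector.
        bmu = min(range(n_neurons),
                  key=lambda i: sum((w - v) ** 2
                                    for w, v in zip(neurons[i], x)))

        # Neighborhood radius shrinks linearly, with a floor of 0.5.
        radius = max(radius0 * (1.0 - step / train), 0.5)
        for i in range(n_neurons):
            d2 = ((coords[i][0] - coords[bmu][0]) ** 2
                  + (coords[i][1] - coords[bmu][1]) ** 2)
            h = math.exp(-d2 / (2.0 * radius ** 2))
            # Move the neuron toward the training vector, scaled by h.
            neurons[i] = [w + alpha * h * (v - w)
                          for w, v in zip(neurons[i], x)]
    return neurons
```

With `deterministic=True`, two runs on the same data set return identical neurons, which is the property that allows a Python port to be compared element by element against a reference implementation modified in the same way.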
Keywords/Search Tags: POPSOM package, Model construction, Implementation