Design And Implementation Of Online Data Mining Competition System

Posted on: 2014-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Cai
Full Text: PDF
GTID: 2268330392962829
Subject: Software engineering
Abstract/Summary:
According to data from the United States Internet Data Center, the quantity of data is growing rapidly: Internet data grows by about 50% a year, roughly doubling every two years. The growth is not confined to the Internet; many other fields, such as scientific research, sports, and industry, are also producing and accumulating large amounts of data. In the past, limited technology made it hard to handle data at this scale. The development of data mining has changed that and helps people deal with huge masses of data. Data mining will clearly play an important role in business competition and scientific research in the future.

This project aims to provide a system through which students and other learners can conveniently learn about data mining. In practice, it is not easy to find a benchmark for evaluation, nor for students to find real data sets to analyze. Through this system, we can release our data, together with a data description, as part of a competition, and students and other learners can get the data and analyze it themselves. Participants can then submit the results of their own research, and the system scores and ranks them. On the other hand, hiring a data mining expert to study a company’s own data is very expensive, so a company can save a lot of money by releasing its data through a hosted competition. The company can reward participants for excellent work. In this way, companies solve their problems at lower cost and participants earn rewards, which is a win-win for both sides.

We build the system as a website. More concretely, it runs on Linux, a popular operating system, with Nginx as the HTTP server and MySQL as the database. We choose PHP as the server-side programming language, and most of our code is written in PHP. Based on the LNMP framework (Linux, Nginx, MySQL, PHP), we use Ajax to implement the interaction between the client side and the server side. In the background, we process the data submitted by users with Python scripts. The system provides the general functions common to such systems, such as user management and forum management. Competition management is the most important part of the system; it mainly covers hosting a competition and making submissions to one. Participants may make a great many submissions at the same time, for example on the day before a deadline. When that happens, the system processes submissions asynchronously instead of handling each one immediately, which keeps the system stable. To achieve this, we use a message queue to hold submissions and deal with them later. Finally, we run multiple processes that handle the queued submissions concurrently, which improves the performance of the system.
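The queue-and-workers idea described above can be sketched in Python, the language the abstract names for background processing. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not say which message queue or scoring metric is used, so the standard library's `multiprocessing.Queue` stands in for the message queue, and `score_submission` (simple accuracy), `worker`, and `process_all` are hypothetical names. The sketch also assumes a Linux host, as in the abstract, and selects the `fork` start method explicitly.

```python
import multiprocessing as mp

# Assumption: a Linux host (as in the abstract), where the "fork" start method exists.
ctx = mp.get_context("fork")

STOP = None  # sentinel that tells a worker to exit


def score_submission(predictions, truth):
    """Hypothetical metric: fraction of predictions that match the ground truth."""
    correct = sum(p == t for p, t in zip(predictions, truth))
    return correct / len(truth)


def worker(tasks, results, truth):
    """Take submissions off the queue and score them until the sentinel arrives."""
    while True:
        item = tasks.get()
        if item is STOP:
            break
        user, predictions = item
        results.put((user, score_submission(predictions, truth)))


def process_all(submissions, truth, n_workers=2):
    """Enqueue submissions, score them in parallel, and return a ranked list."""
    tasks, results = ctx.Queue(), ctx.Queue()
    workers = [ctx.Process(target=worker, args=(tasks, results, truth))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for sub in submissions:
        tasks.put(sub)  # a web tier would return to the user right after this step
    for _ in workers:
        tasks.put(STOP)  # one sentinel per worker
    # Drain the result queue before joining so no worker blocks on a full pipe.
    scored = [results.get() for _ in submissions]
    for w in workers:
        w.join()
    # Highest score first; ties broken by user name.
    return sorted(scored, key=lambda r: (-r[1], r[0]))


if __name__ == "__main__":
    truth = [1, 0, 1, 1]
    demo = [("alice", [1, 0, 1, 1]), ("bob", [1, 1, 1, 0])]
    print(process_all(demo, truth))
```

The point of the design is that accepting a submission only costs a queue `put`, so a burst of uploads before a deadline does not tie up the web server; the scoring work is absorbed by however many worker processes are running.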
Keywords/Search Tags: Data mining, online judge system, LNMP framework, Ajax