
Design and Development of a Classification Training Corpus Management System

Posted on: 2013-01-25  Degree: Master  Type: Thesis
Country: China  Candidate: R Xie  Full Text: PDF
GTID: 2248330395974225  Subject: Software engineering
Abstract/Summary:
With the growing popularity of the Internet, the amount of information it contains has expanded rapidly, including structured, semi-structured, and unstructured information. To deal with the problems and challenges that this information explosion brings to large-scale text information extraction, there is an urgent need for technology that helps people quickly find the information they really need within vast amounts of data. How to have a computer automatically determine the category of a text from its contents, that is, automatic classification, has become a research topic for many people.

The classification training corpus management system is a management system for classification learning corpora, based on Web crawler technology. The work covers the establishment and maintenance of the back-end database and the development of the front-end application. The system provides login management, classification management, corpus addition, and other functional modules. The back-end database must ensure consistency, completeness, and security, while the front-end application must be easy to use, with a complete and friendly interface.

The system is developed on the J2EE framework and the .NET development platform, the latter being a set of software components used to build Web server applications and Windows desktop applications. The system uses a three-tier architecture, the database is MySQL, and the programming languages are C#, ASP, and Java. The B/S (browser/server) structure makes the system easier to maintain and also speeds up development. In addition, the system was designed with a connection interface to the open source Web crawler Heritrix, so that corpus files "crawled" from the network are received and then entered into the system for processing.

This thesis highlights the development of the corpus management module, which covers: adding, deleting, and modifying corpus classifications; adding, modifying, and deleting corpus entries; uploading ontology model files; uploading the thesaurus; and developing the interface to the automatic corpus acquisition module. The work required mastering ASP and Java Web programming techniques, as well as basic database design and the use of MySQL. The main topics are the system architecture based on the B/S architecture, MVC three-tier development technology, the development of the corpus management system, the development of the interface to the Web crawler-based automatic corpus acquisition module, and the development of the interface to the page body extraction module...
Keywords/Search Tags: Automatic classification, classification training corpus management, Web crawler, SVM classification, MVC architecture