Research And Design Of Machinery-Text Acquisition And Classification

Posted on: 2013-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: S H Wei
GTID: 2248330362472840
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and the continuous advance of the information technology industry, more and more information accumulates on the network. How to find useful information in this vast volume has become a hot research topic in the field. This study arises from the need for automatic generation of conceptual designs in specific domains, which belongs to a special research project of the Shaanxi Provincial Department of Education. It takes the mechanical field as its research object and explores how to pick out useful industry information from the mass of online information and classify it further, in order to meet the demands of the machinery industry.

This study focuses on two aspects: the theme crawler and text classification. The theme crawler performs the first classification, extracting mechanical texts from the Internet. The text classifier performs the second classification, assigning each machinery text to one of ten subcategories. The main work is as follows:
(1) A professional vocabulary containing a total of 20,000 words about machinery is built under the guidance of mechanical experts. It lays the groundwork for the topic description of the theme crawler, Chinese word segmentation, text description, and text classification.
(2) A mechanical theme crawler is designed. An appropriate topic description and crawling strategy are selected to guide its work, and the texts that the calculation judges suitable are downloaded to the library. After the crawler has run, the texts in the library belong to the mechanical field.
(3) A Naive Bayes classifier is designed. Experimental validation shows that its classification results are not satisfactory, and the causes of the low classification accuracy are then analyzed.
(4) The classifier is improved according to these causes. By introducing the grey relational grade calculation and an improved weight calculation method, an improved Bayesian text classifier is designed.
(5) Based on the above, the design and implementation of information collection and classification of machinery...
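
The baseline classifier in item (3) can be illustrated with a minimal sketch. The abstract does not give the thesis's implementation, so the code below is only a textbook multinomial Naive Bayes with add-one smoothing, written under assumptions: documents are already segmented into word lists, and the class name `NaiveBayesTextClassifier`, the two category labels, and the toy training texts are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Textbook multinomial Naive Bayes with add-one (Laplace) smoothing.

    Hypothetical illustration only; not the classifier built in the thesis.
    """

    def fit(self, documents, labels):
        # documents: list of token lists; labels: one category per document.
        self.class_doc_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.term_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, tokens):
        # log P(class) + sum of log P(token | class) for every known token.
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_doc_counts.items():
            score = math.log(doc_count / self.total_docs)
            total_terms = sum(self.term_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # skip words never seen during training
                count = self.term_counts[label][token]
                score += math.log((count + 1) / (total_terms + vocab_size))
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Toy, made-up training data with two hypothetical subcategories.
train_docs = [
    ["gear", "shaft", "bearing"],
    ["gear", "gearbox", "shaft"],
    ["lathe", "cutting", "tool"],
    ["milling", "cutting", "tool"],
]
train_labels = ["transmission_parts", "transmission_parts", "machining", "machining"]

clf = NaiveBayesTextClassifier().fit(train_docs, train_labels)
prediction_a = clf.predict(["gear", "bearing"])
prediction_b = clf.predict(["cutting", "tool"])
```

In this sketch every word contributes equally to the score. Item (4) describes replacing that uniform treatment with an improved weight calculation, whose details the abstract does not state.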
Keywords/Search Tags: focused crawler, professional vocabulary, text classification, grey correlation relative degree, Bayesian classifier
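
The grey relational grade named in the abstract and keywords can be sketched as follows. The abstract does not say how the grade enters the improved classifier, so this shows only the commonly used Deng formulation, assuming equal-length, already normalized numeric sequences and a distinguishing coefficient of 0.5. The function name `grey_relational_grade` and the sample vectors are hypothetical.

```python
def grey_relational_grade(reference, candidates, rho=0.5):
    """Grey relational grade of each candidate sequence to the reference.

    Standard Deng formulation, shown as an assumption; the thesis may
    use a variant. Sequences must share one length and be normalized.
    """
    if not candidates:
        return []
    # Absolute difference between reference and candidate at each position.
    deltas = [
        [abs(r - c) for r, c in zip(reference, candidate)]
        for candidate in candidates
    ]
    all_deltas = [d for row in deltas for d in row]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every candidate is identical to the reference.
        return [1.0] * len(candidates)
    grades = []
    for row in deltas:
        # Grey relational coefficient at each position, then its mean.
        coefficients = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coefficients) / len(coefficients))
    return grades


# Made-up feature vectors: the first candidate equals the reference.
reference = [1.0, 0.5, 0.0]
candidates = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]
grades = grey_relational_grade(reference, candidates)
```

A grade closer to 1 means the candidate sequence follows the reference more closely, which is what would let such a grade serve as a similarity or weighting signal.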