The Key Technology And System Implementation On Grain Information Automatic Acquisition

Posted on: 2014-10-12
Degree: Master
Type: Thesis
Country: China
Candidate: R H Geng
Full Text: PDF
GTID: 2268330425958721
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Grain is the lifeblood of a country and is critical to mankind. With the rapid development of the Internet, more and more open-source information on grain is being published online, and the network will gradually become one of the main sources of grain information. Faced with this intricate body of online grain information, how government and grain enterprises can obtain grain information quickly and accurately, extract valuable intelligence from vast amounts of data, improve the efficiency of grain intelligence analysis, and deliver grain intelligence services is a major problem to be solved. The emergence of Web text mining technology has injected new vitality into the study of grain information service systems, so research on grain intelligence analysis based on Web mining technology is a significant topic.

This thesis introduces grain information analysis technology in depth, covering information acquisition, information preprocessing, and information extraction, and studies two key technologies of a grain information analysis system: feature selection and feature weighting.

(1) For feature selection, the thesis analyzes the shortcomings of the original chi-square statistic method in depth and proposes an improved SF-WF-AI algorithm. The algorithm effectively removes terms that are negatively correlated with a category. At the same time, it raises the weights of low-frequency terms in the designated category and lowers the weights of terms that are widespread in other categories but rarely appear in the designated category. Experimental results show that the improved method achieves better classification results.

(2) The quality of the feature weight algorithm has a large impact on classification results. The traditional TFIDF algorithm treats the text set as a whole and ignores the effect that distribution information among and inside classes has on term weights. This thesis introduces a new improvement based on the degree of aggregation inside a class and the distribution among classes, and puts forward a new PC-DI term weight algorithm. Comparison shows that the method performs better than other improved algorithms and that the improved algorithm is feasible.

Finally, we designed a prototype grain information acquisition system and gave a logical framework for grain information analysis. The empirical results demonstrate its basic functions.
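For orientation, the two baselines that the abstract starts from can be sketched as follows. This is a minimal illustration of the classic chi-square statistic and the standard TF-IDF weight only; it is not the SF-WF-AI or PC-DI algorithm, whose formulas the abstract does not give. The toy corpus, the function names (`contingency`, `chi_square`, `association_sign`, `tf_idf`), and the whitespace-style tokenization are all assumptions made for the example.

```python
import math
from collections import Counter


def contingency(docs, labels, term, category):
    """Count documents by (contains term?, belongs to category?)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_cat = label == category
        if has_term and in_cat:
            a += 1          # term present, in category
        elif has_term:
            b += 1          # term present, other category
        elif in_cat:
            c += 1          # term absent, in category
        else:
            d += 1          # term absent, other category
    return a, b, c, d


def chi_square(docs, labels, term, category):
    """Classic chi-square statistic between a term and a category."""
    a, b, c, d = contingency(docs, labels, term, category)
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def association_sign(docs, labels, term, category):
    """+1, 0 or -1: direction of the association that chi-square squares away."""
    a, b, c, d = contingency(docs, labels, term, category)
    diff = a * d - b * c
    return (diff > 0) - (diff < 0)


def tf_idf(docs):
    """Standard TF-IDF over the whole text set, with no class information."""
    n = len(docs)
    df = Counter(term for tokens in docs for term in set(tokens))
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        weights.append(
            {t: (tf[t] / len(tokens)) * math.log(n / df[t]) for t in tf}
        )
    return weights


# Hypothetical toy corpus: two "grain" documents and two "finance" documents.
docs = [
    ["wheat", "price", "rises"],
    ["wheat", "harvest", "report"],
    ["stock", "market", "falls"],
    ["stock", "price", "index"],
]
labels = ["grain", "grain", "finance", "finance"]

# "wheat" (only in grain) and "stock" (never in grain) receive the same score.
print(chi_square(docs, labels, "wheat", "grain"),
      chi_square(docs, labels, "stock", "grain"))
print(association_sign(docs, labels, "stock", "grain"))

# TF-IDF gives "wheat" and "price" the same weight in the first document,
# although only "wheat" is specific to one class.
weights = tf_idf(docs)
print(weights[0]["wheat"], weights[0]["price"])
```

On this toy data the chi-square score cannot tell a term that is typical of a category from one that never occurs in it, and TF-IDF cannot tell a class-specific term from one spread across classes. These are the kinds of shortcomings that the improved feature selection and term weight algorithms described above are meant to address.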
Keywords/Search Tags: Grain information, Feature selection, Chi-Square statistic, Feature weight, Automatic summarization