Research Of Text Representation Based On VSM Model And Soft Sensor Based On Bayesian Networks

Posted on: 2015-02-25; Degree: Master; Type: Thesis
Country: China; Candidate: Y Z Gu; Full Text: PDF
GTID: 2268330425984664; Subject: Control Science and Engineering
Abstract/Summary:
With the rapid development of information technology, the volume of information is growing quickly, and a great deal of state information is stored in these vast resources. Such state information indicates possible changes in the state of society, events, and the environment; used with clear direction and purpose, it shapes society and makes people's life and work considerably more convenient. However, the rapid growth of information resources also creates a conflict between the rapid spread of false information and people's ability to obtain reliable information. How to effectively predict the reliability of complex and changeable state information has therefore become a focus of attention.

First, because state information mostly takes the form of network texts, it must be transformed into a structure that computers can process. This transformation is the foundation and prerequisite for predicting the reliability of state information. For this structural transformation of network texts, namely text representation, the TF-IDF weight based on the vector space model (VSM) used in text classification is studied first. Then, to address the insufficient representation performance of the TF-IDF weighting algorithm, information gain and a text global weight are introduced to add to the weight, respectively, the distribution of keywords among text classes and the text-theme information carried by keywords, yielding a more effective text representation. On this basis, a network-text representation system based on the improved TF-IDF weighting algorithm is built to achieve fast and efficient text representation at scale.

Second, the Bayesian network is a well-suited model for data mining and uncertain knowledge representation, and it has been applied successfully in areas such as target recognition, uncertain knowledge reasoning, and prediction.
At the same time, unlike hard sensors, which process vectorial physical data, the proposed approach works on network texts composed of one-dimensional linear character sequences. On the basis of their transformed structure, this paper proposes a reliability prediction model for state information based on Bayesian networks, namely a soft sensor based on Bayesian networks. It can help people accurately locate text sets for different kinds of state information, supports effective reliability prediction and assessment of state information, and provides reliable state information to users.

Finally, this paper tests the improved TF-IDF feature weighting algorithm on open-source text sets and analyzes its effect. In addition, on open-source text sets, the text data distribution established by the Bayesian-network soft sensor model is compared with the real text data distribution to validate how well the model captures the text data distribution. The effectiveness of soft-sensor-based reliability prediction for state information is also demonstrated by comparing the soft sensor based on Bayesian networks with statistical analysis. Lastly, texts collected by the author from a portal website are used to test the soft sensor in practical application.
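
The abstract does not give the formula of the improved TF-IDF weight, so the following is only a minimal sketch of one plausible reading: the classical TF-IDF weight of a term is multiplied by that term's information gain with respect to the class labels. The multiplicative combination, the function names (`entropy`, `information_gain`, `tfidf_ig_vectors`), and the toy corpus are all assumptions for illustration; the text global weight mentioned in the abstract is not modeled here.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

def information_gain(term, docs, labels):
    """IG of the binary feature 'term occurs in the document' w.r.t. the class label."""
    n = len(docs)
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]
    h_class = entropy(list(Counter(labels).values()))
    h_cond = sum(
        len(part) / n * entropy(list(Counter(part).values()))
        for part in (with_term, without_term)
        if part
    )
    return h_class - h_cond

def tfidf_ig_vectors(docs, labels):
    """Represent each tokenized document as a vector of TF * IDF * IG weights.

    Assumption: the three factors are simply multiplied; the thesis may
    combine them differently.
    """
    n = len(docs)
    vocab = sorted({t for d in docs for t in d})
    df = Counter(t for d in docs for t in set(d))        # document frequency
    ig = {t: information_gain(t, docs, labels) for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        length = len(d) or 1                              # guard against empty documents
        vectors.append([tf[t] / length * math.log(n / df[t]) * ig[t] for t in vocab])
    return vocab, vectors

# Hypothetical toy corpus: two classes, two documents each.
docs = [
    ["market", "stock", "rise"],
    ["stock", "fall", "market"],
    ["match", "goal", "team"],
    ["team", "win", "match"],
]
labels = ["finance", "finance", "sport", "sport"]

vocab, vectors = tfidf_ig_vectors(docs, labels)
for term, weight in zip(vocab, vectors[0]):
    print(f"{term:>6}: {weight:.4f}")
```

In this toy corpus, "stock" occurs in every finance document and in no sport document, so its information gain is maximal and it outweighs the rarer term "rise" in the first document, even though plain TF-IDF would rank "rise" higher. This is the kind of class-distribution information the abstract says the plain TF-IDF weight lacks.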
Keywords/Search Tags: vector space model, feature vector, Bayesian networks, soft sensor, reliability