
Study of a Web Survey System Based on Data Mining

Posted on: 2008-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Chen
GTID: 2167360242488936
Subject: Statistics
Abstract/Summary:
Among the many research approaches in sociology, the survey is the most popular method for collecting data on a given subject. With the widespread use of the internet among organizations, colleges, and individuals, web surveys have become more prevalent than ever before. Compared with traditional research methods, a web survey simplifies the whole process: questionnaires can be distributed in a short time, and data can be obtained at low cost and processed directly on a computer. This paper mainly discusses data problems in web surveys and imputation algorithms for data missing due to nonresponse, on the basis of which a web survey system was built and a corresponding empirical study was carried out.
Drawing on sampling survey theory, the sources and types of data problems in web surveys are analyzed, and the problems are quantified using statistical theory. The goals and methods of data preprocessing (pretreatment) in web surveys are then discussed. Guided by decision tree classification theory and rough set theory in data mining, an imputation algorithm for missing data based on the ID3 algorithm and the ROUSTIDSA algorithm based on rough sets are studied in depth. The shortcomings of both algorithms in imputing data missing from web surveys are analyzed, and a k-similar matrix imputation algorithm (k-SM algorithm) for missing data, based on rough sets, is proposed accordingly. The algorithm improves on the ROUSTIDSA algorithm.
It takes missing values of the decision attribute and their imputation into account, and effectively resolves the decision conflicts that arise after imputation.
On the basis of the above analysis, and considering the characteristics of web surveys and investigators' requirements for a web survey system, a general-purpose web survey system was developed using advanced design and development tools. The system (http://www.netsurvey.cn) supports creating, managing, and publishing questionnaires, descriptive statistical analysis, and so forth. In addition, the imputation algorithm for missing data based on the ID3 algorithm and the k-SM algorithm were both implemented in it.
Finally, an empirical study was carried out with the Netsurvey system in two steps. The first step applied both the web survey method and the traditional method and studied their effect on respondents. The second step studied the results of the two data preprocessing methods in the web survey system. The results show that the web survey is superior to the print survey in nonresponse rate. For respondents, on the other hand, there is no difference between the two methods across question types (entry box, single-choice, and multiple-choice items) or across items of differing nature (behavior and attitude items); that is, respondents express the same ideas whichever method is used. Analysis of the responses to entry box items also indicates that even a slight variation in item style has an obvious influence on respondents under both survey methods. Furthermore, the imputation results of the ID3-based algorithm are inferior to those of k-SM, and its time complexity is higher.
Web survey research in China is still at an early stage, and most of it consists of descriptive and qualitative studies.
This paper covers both qualitative analysis and quantitative empirical study, and the author expects it to play an active role in the application and development of web surveys in China.
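The abstract does not spell out how missing answers are filled from a rough-set viewpoint, so the following is only a minimal sketch of the general idea, not the k-SM or ROUSTIDSA algorithm itself. It assumes that two questionnaire records count as similar when they agree on every item both respondents answered, and that a missing answer is filled only when all similar records give the same value; a conflict leaves the cell empty. The names similar and fill_missing and the sample answers are hypothetical.

```python
from typing import List, Optional

# One questionnaire record; None marks a missing answer (nonresponse).
Record = List[Optional[str]]


def similar(a: Record, b: Record) -> bool:
    """Assumed similarity rule: records agree wherever both are answered."""
    return all(x is None or y is None or x == y for x, y in zip(a, b))


def fill_missing(records: List[Record]) -> List[Record]:
    """Fill a missing cell only when all similar records agree on its value.

    Similarity is always judged on the original records, so earlier fills
    do not influence later ones. Conflicting candidates leave the cell None.
    """
    filled = [list(r) for r in records]
    for i, rec in enumerate(records):
        for j, value in enumerate(rec):
            if value is not None:
                continue
            candidates = {
                other[j]
                for k, other in enumerate(records)
                if k != i and other[j] is not None and similar(rec, other)
            }
            if len(candidates) == 1:
                filled[i][j] = candidates.pop()
    return filled


if __name__ == "__main__":
    survey = [
        ["yes", "daily", None],
        ["yes", "daily", "agree"],
        ["no", None, "disagree"],
        ["no", "weekly", "disagree"],
    ]
    for row in fill_missing(survey):
        print(row)
```

In this sketch a cell with conflicting candidates simply stays missing; handling such decision conflicts is the part that the thesis attributes to its k-SM algorithm, whose details are not given here.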
Keywords/Search Tags: Web Survey, ID3 Algorithm, k-Similar Matrix, Data Pretreatment, Print Survey