Data Cleansing Techniques In Text Mining Applications

Posted on: 2009-03-30  Degree: Master  Type: Thesis
Country: China  Candidate: M Li  Full Text: PDF
GTID: 2208360245979538  Subject: Computer technology
Abstract/Summary:
Current research on Web text mining screens information within a relevant domain by using correlated feature values, but this approach cannot filter out information that is obviously wrong or obviously irrelevant to the researcher. This thesis applies data cleaning techniques to address such dirty data in text mining. The thesis is divided into four parts. Part 1: Introduction. We survey the state of research on text mining and data cleaning worldwide, along with the open problems in both areas. Part 2: Theoretical background of data cleaning and text mining. We first introduce the concept and classification of data quality, then briefly introduce the basics of text mining and some related technical standards, which lay the foundation for the following parts. Part 3: Methods for data cleaning and text mining. Building on research into structured-data cleaning, we propose a data cleaning method for semi-structured data and clean similar and duplicate records in a Web-based customer database. Part 4: Conclusion and outlook. We summarize the work and suggest directions for future research.
Keywords/Search Tags: text mining, data cleaning, data mining