
Application Research On Web Text Mining Based-on XML

Posted on: 2010-11-03
Degree: Master
Type: Thesis
Country: China
Candidate: H W Ma
GTID: 2178360275477665
Subject: Computer application technology
Abstract/Summary:
With the development of the Internet, the information available on the WWW is growing rapidly. The WWW provides people with massive amounts of information, but it also creates a contradiction: on the one hand, people need to acquire information from the WWW quickly and effectively; on the other hand, the volume of information is huge, its structure is complicated, and processing it is difficult. Web mining technology offers a way to resolve this contradiction. At present, research on Web mining is still at a developing stage and requires much further work on theory, implementation methods, and technology. Web mining applies traditional data mining technology to the Web environment: it discovers implicit, previously unknown, potentially valuable, and non-trivial patterns in massive sets of Web documents and in records of users' browsing activity. This is the subject studied in this dissertation. The main content is as follows. The dissertation analyzes the basic concepts, methods, and techniques of data mining, Web text mining, and XML. By studying the processing of semi-structured data and its key techniques (e.g. data extraction methods, transformation algorithms, and data mining methods), it puts forward a practical scheme for XML-based Web text mining. First, the semi-structured data of interest on the WWW is extracted, cleaned, and format-converted to obtain usable XML-format data. Next, SQL Server 2005 Integration Services (SSIS) and SQL Server Analysis Services (SSAS) are used to implement data transformation, loading, and mining. Finally, a compact GUI is designed and developed with Visual Studio .NET and DMX (Data Mining Extensions) so that users can browse the mining results. The dissertation's approach is realized by designing an XML-based system for collecting and analyzing plastics market information.
The prototype system predicts the future price trend of plastic materials from historical data collected from the Web, and offers a practical Web mining scheme for extracting, analyzing, and mining data of interest on the Web.
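
The extract, clean, and convert-to-XML step described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation, which uses SSIS and SSAS. It only assumes, for illustration, that prices appear in an HTML table with three cells per row (material, date, price). The names `TableRowParser`, `html_to_price_xml`, `SAMPLE_HTML`, and the `<prices>`/`<record>` element names are hypothetical, and the sample values are made up.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows (<th> only) produce no cells
                self.rows.append(self._row)
            self._row = None


def html_to_price_xml(html):
    """Extract price rows from HTML, drop unusable ones, return XML text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()

    root = ET.Element("prices")
    for row in parser.rows:
        if len(row) != 3:
            continue  # cleaning: skip rows with an unexpected shape
        material, date, price_text = row
        try:
            price = float(price_text.replace(",", ""))
        except ValueError:
            continue  # cleaning: skip rows whose price is not numeric
        record = ET.SubElement(root, "record", material=material, date=date)
        record.text = f"{price:.2f}"
    return ET.tostring(root, encoding="unicode")


# Made-up sample input, standing in for a fetched Web page.
SAMPLE_HTML = """
<table>
  <tr><th>Material</th><th>Date</th><th>Price</th></tr>
  <tr><td> PVC </td><td>2009-03-01</td><td>6,350</td></tr>
  <tr><td>PP</td><td>2009-03-01</td><td>n/a</td></tr>
  <tr><td>ABS</td><td>2009-03-02</td><td>12800.5</td></tr>
</table>
"""

if __name__ == "__main__":
    print(html_to_price_xml(SAMPLE_HTML))
```

XML text produced this way could then be handed to a loading and mining stage, such as the SSIS and SSAS stage the abstract mentions.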
Keywords/Search Tags: XML, Data mining, Web mining, Web text mining, SQL Server