| Data used for data analysis and data mining may contain hundreds of attributes, most of which are irrelevant to the mining task. Although domain experts can pick out the useful attributes, doing so can be difficult and time-consuming, especially when the behavior of the data is not well understood. Omitting relevant attributes degrades the quality of the resulting model, while irrelevant or redundant attributes increase the volume of data and slow down the mining process. The main subject of this paper is how to select the attributes relevant to the knowledge discovery task, thereby improving the efficiency of knowledge discovery and solving the focus problem. This paper uses domain knowledge and an attribute reduction algorithm based on rough set theory to support the knowledge discovery process and solve the focus problem of KDD. The main content and contributions are as follows: 1. We study the relationship between domain knowledge and knowledge discovery, the role of domain knowledge in the various stages of data mining, and the classification, representation, storage, and usage of the domain knowledge that supports the knowledge discovery process and helps solve the focus problem of KDD. 2. To improve classification accuracy, we study a discretization algorithm based on rough set theory and conditional entropy. 3. We give an improved version of the general discernibility matrix algorithm; the new algorithm removes duplicate and superfluous elements from the discernibility matrix, which reduces the time complexity and improves the running efficiency of the algorithm. 4. We study several heuristic algorithms based on attribute importance, focusing on the reduction algorithm based on information entropy, and give a domain-knowledge attribute reduction algorithm that incorporates the user's preferences and facilitates interaction with domain experts. 5. Based on the preceding reduction algorithms, we build a knowledge discovery focus system; applied to the analysis of aluminium electrolysis production data, the system achieved good results. |
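
To make the entropy-based reduction in item 4 more concrete, the following is a minimal sketch of a greedy heuristic reduction driven by conditional entropy. It is not the algorithm of this paper. The function names `conditional_entropy` and `greedy_reduct`, the `preferred` parameter (a simple stand-in for user preference, used here only to break ties), and the toy attribute names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, attrs, decision):
    """Return H(decision | attrs) for a list of dict-shaped records."""
    # Group records into equivalence classes induced by the chosen attributes.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        groups[key][row[decision]] += 1
    n = len(rows)
    h = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        for c in counts.values():
            h -= (size / n) * (c / size) * log2(c / size)
    return h


def greedy_reduct(rows, conditions, decision, preferred=()):
    """Greedily add the attribute that lowers H(decision | reduct) the most.

    `preferred` is a hypothetical way to encode user preference: preferred
    attributes are tried first, so they win when several attributes tie.
    """
    target = conditional_entropy(rows, conditions, decision)
    candidates = [a for a in preferred if a in conditions]
    candidates += [a for a in conditions if a not in preferred]
    reduct = []
    # Stop once the subset carries as much information as the full set.
    while conditional_entropy(rows, reduct, decision) > target + 1e-12:
        best = min(
            (a for a in candidates if a not in reduct),
            key=lambda a: conditional_entropy(rows, reduct + [a], decision),
        )
        reduct.append(best)
    return reduct


# Hypothetical toy records; the attribute names are invented for illustration.
rows = [
    {"temp": "high", "voltage": "high", "shift": "day", "state": "abnormal"},
    {"temp": "high", "voltage": "low", "shift": "night", "state": "abnormal"},
    {"temp": "low", "voltage": "high", "shift": "day", "state": "normal"},
    {"temp": "low", "voltage": "low", "shift": "night", "state": "normal"},
]
reduct = greedy_reduct(rows, ["temp", "voltage", "shift"], "state")
```

In this toy table the decision depends only on `temp`, so the greedy search stops after selecting that single attribute. A greedy search of this kind returns a subset that preserves the conditional entropy of the full attribute set, but it is not guaranteed to be a minimal reduct.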