With the rapid development of modern science and technology, discovering the useful knowledge contained in the ocean of information in a timely manner has become an urgent problem for each of us. Frequent sequential pattern mining aims to discover sequences that appear with high frequency in sequential data (temporal data, spatial data, and so on) and to treat these sequences as new knowledge patterns. Although frequent sequential pattern mining has become an effective means of knowledge discovery, releasing the frequent sequences directly may leak private data. To solve this problem, we propose a differentially private frequent sequence mining (FSM) algorithm, PFS (differentially Private Frequent Sequences mining algorithm). In PFS, to address the problems caused by long records that may exist in the database, we design three coping strategies: database sampling, transaction length reduction, and threshold decrease. By applying these three strategies, PFS effectively controls the amount of noise required by differential privacy and thus provides high data utility and data privacy simultaneously. Experimental results show that PFS substantially outperforms state-of-the-art techniques. Meanwhile, to demonstrate that the transaction truncating strategy introduced in the record length limitation method is widely applicable, we apply the transaction truncating mechanism to frequent itemset mining; the resulting differentially private frequent itemset mining algorithm, DAT (Differentially private Apriori algorithm based on transaction Truncating), also performs well.
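
To illustrate why limiting record length reduces the noise that differential privacy requires, the following minimal sketch truncates each transaction and then releases noisy single-item counts with the Laplace mechanism. It is not the PFS or DAT algorithm itself. The function names, the parameters `max_len`, `epsilon`, and `universe`, and the rule of keeping the first `max_len` distinct items are all assumptions made for illustration. The sketch relies only on the standard fact that, once every transaction holds at most `max_len` items, adding or removing one transaction changes the vector of item counts by at most `max_len` in L1 norm.

```python
import random
from collections import Counter


def truncate(transactions, max_len):
    """Keep at most max_len distinct items per transaction.

    Assumption: the first max_len distinct items are kept; a real
    method may choose which items to keep more carefully.
    """
    return [list(dict.fromkeys(t))[:max_len] for t in transactions]


def laplace(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_item_counts(transactions, universe, max_len, epsilon, rng=None):
    """Return a noisy support count for every item in `universe`.

    After truncation, one transaction contributes to at most max_len
    counts, so the L1 sensitivity is max_len and the Laplace noise
    scale is max_len / epsilon.
    """
    rng = rng or random.Random()
    counts = Counter(item for t in truncate(transactions, max_len) for item in t)
    scale = max_len / epsilon
    # Noise is added for every item in the universe, including absent ones.
    return {item: counts[item] + laplace(scale, rng) for item in universe}


if __name__ == "__main__":
    db = [["a", "b", "c", "d"], ["a", "c"], ["b"]]
    print(noisy_item_counts(db, universe="abcd", max_len=2, epsilon=1.0))
```

Without truncation, the noise scale would have to grow with the longest record in the database, so a single very long record would degrade the accuracy of every released count. Truncation trades a small loss of information in long records for a much smaller noise scale.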