E-commerce is a data-driven industry that generates large volumes of user-related data. These data arrive at high speed and are highly variable, and their attributes may change over time, which can lead to poor decisions. Concept drift is also one of the main problems in data stream mining, and it calls for mining models that adapt well to concept drift in e-commerce data streams. Therefore, this thesis proposes a variable sliding window frequent pattern mining algorithm based on concept drift detection and a double-layer variable sliding window frequent pattern mining algorithm based on concept drift type detection. The work of this thesis consists of the following three parts:

(1) This thesis proposes a variable sliding window frequent pattern mining algorithm based on concept drift detection, called VSW-CDD. Because a fixed-size sliding window cannot adapt to the ever-changing nature of data streams, this thesis designs a variable-size window based on sliding window technology. During mining, both the mining-result variables and the variables that cause concept drift are monitored to determine whether concept drift has occurred in the data stream. While the concept of the data stream remains unchanged, the window expands; when concept drift occurs, the window shrinks. Experiments show that the proposed algorithm detects concept drift in the data stream promptly and adapts to new concepts by adjusting the window size. In addition, the algorithm mines the latest frequent patterns in the data stream and performs well on user-click datasets from e-commerce websites. Compared with other algorithms, it also achieves better recall and adaptation.

(2) This thesis proposes a nested double-layer variable sliding window frequent pattern mining algorithm based on concept drift type detection, called DLVSW-CDTD. At present, most
algorithms for handling concept drift focus on a single type of drift, which makes it difficult for them to adapt to application scenarios in which different types of drift coexist. Therefore, building on the VSW-CDD algorithm, this thesis introduces a double-layer nested variable sliding window to distinguish the types of concept drift, and combines it with an attenuation model to adapt to different types of concept drift during mining. The experimental results show that the DLVSW-CDTD algorithm not only detects different types of concept drift in data streams but also adapts to each type in a targeted manner. It also achieves some improvement in time complexity and memory consumption. In addition, the performance of the algorithm does not change with the window size, and its overall stability is good.

(3) Based on the VSW-CDD and DLVSW-CDTD algorithms, this thesis designs and implements a prototype e-commerce data mining system based on concept drift detection. The system is built with the Django framework and a Vue front end, and comprises four modules: user information management, data file management, frequent pattern mining, and result display. Users can flexibly choose a mining model according to their needs and obtain the corresponding frequent pattern results. Use and testing of the system show good practicality and stability, and also demonstrate the effectiveness and practicality of the algorithms proposed in this thesis.
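
The expand-and-shrink behaviour of a variable sliding window can be illustrated with a minimal Python sketch. This is not the VSW-CDD algorithm itself; it is a simplified illustration under stated assumptions: the stream arrives in batches of transactions, only single frequent items are mined, and drift is flagged when the Jaccard distance between the frequent items of the current window and those of the new batch exceeds a threshold. The names `frequent_items`, `jaccard_distance`, and `process_batch`, and the parameters `min_support`, `drift_threshold`, and `max_size`, are hypothetical and do not come from the thesis.

```python
from collections import Counter


def frequent_items(transactions, min_support):
    """Return the single items whose relative support is >= min_support."""
    n = len(transactions)
    if n == 0:
        return set()
    # Count each item at most once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    return {item for item, c in counts.items() if c / n >= min_support}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def process_batch(window, batch, min_support=0.5, drift_threshold=0.5,
                  max_size=1000):
    """Update the window with a new batch and report whether drift was flagged.

    Assumed drift test (illustrative only): compare the frequent items of
    the current window with those of the incoming batch.
    """
    batch = list(batch)
    old = frequent_items(window, min_support)
    new = frequent_items(batch, min_support)
    # An empty window has no concept yet, so it can never signal drift.
    drift = bool(window) and jaccard_distance(old, new) > drift_threshold
    if drift:
        # Concept changed: shrink the window to the newest data only.
        window = batch
    else:
        # Concept stable: expand the window, capped at max_size transactions.
        window = (list(window) + batch)[-max_size:]
    return window, drift


if __name__ == "__main__":
    stream = [
        [("a", "b"), ("a", "b"), ("a", "c")],
        [("a", "b"), ("a", "b", "d")],
        [("x", "y"), ("x", "y"), ("x",)],
    ]
    window = []
    for batch in stream:
        window, drift = process_batch(window, batch)
        print(len(window), drift)
```

In this sketch the window grows while consecutive batches share most of their frequent items and collapses to the newest batch once they diverge. A real detector would need a statistically grounded drift test and would mine full itemsets rather than single items.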