| The rapid development of computer technology has promoted social progress and economic development and changed the way people live and work. However, it has also given network criminals new venues and new means for crime, so both technology and law must be used to curb network crime. Log files record the operations of the operating system, application programs, and users, and they preserve traces of intruders' behavior. They have therefore become an important source of evidence in computer forensics. The main problem addressed in this thesis is how to find the correlations within a large volume of easily damaged log files and use them to reconstruct intrusion scenes. Firstly, to address the difficulties caused by the huge volume of log records and the differing attributes of different logs, a redundant-data cleaning technique based on an EventID classification model is proposed, which applies a different format standard to each log type. Based on the EventID classification model and the formatting process, this method reduces the amount of data. Secondly, the value ranges of the attributes are too large, which makes the space complexity of the frequent pattern tree (FPTree) too high and reduces processing efficiency. In addition, different attributes can share the same attribute value, which makes the meaning of the association rules ambiguous. To address these problems, this thesis proposes PFP_Growth, a log mining algorithm based on FP_Growth. In PFP_Growth, the attributes are preprocessed before the frequent pattern tree is built, so that the meaning of each attribute value is clear. The frequent pattern tree is then built according to a demand-oriented set of constraint attribute groups. Thirdly, a formatting-rule matching method based on simulated attacks is proposed. First, formatting rules are generated by simulating attacks. 
Then, rules are matched by comparing attribute prefixes against the formatted rules. Finally, a scene reconstruction method based on the rule database and attribute tracking is proposed. The corresponding intrusion steps are identified through rule matching, and the intrusion scene is reconstructed by correlating attributes and combining them with registry changes. In the experiments, the efficiency of FP_Growth and PFP_Growth is compared for different numbers of log records and different numbers of attribute constraints. The experimental results show that PFP_Growth is effective for mining large volumes of data and for attribute-related mining, and that the monitoring program has little influence on system performance. |
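
The attribute preprocessing step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's PFP_Growth implementation: it counts itemsets by brute force instead of building a frequent pattern tree, and the `attribute=value` tagging scheme, the function names, and the sample log records are all assumptions made for illustration. It shows only why tagging values with their attribute name removes the ambiguity when two attributes share a value, and how a constraint attribute set limits what is mined.

```python
from collections import Counter
from itertools import combinations


def tag_record(record, constraint_attrs):
    """Keep only the constrained attributes and tag each value with its
    attribute name, so equal values of different attributes stay distinct.
    (Hypothetical preprocessing scheme, assumed for illustration.)"""
    return frozenset(
        f"{attr}={record[attr]}" for attr in constraint_attrs if attr in record
    )


def frequent_itemsets(records, constraint_attrs, min_support, max_size=2):
    """Count itemsets of up to max_size tagged items and keep the frequent ones.
    Brute-force counting stands in for the FP-tree here."""
    counts = Counter()
    for record in records:
        items = sorted(tag_record(record, constraint_attrs))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Made-up log records: the value "4625" appears as both an EventID and a Port.
logs = [
    {"EventID": "4625", "User": "admin", "Source": "10.0.0.5", "Port": "4625"},
    {"EventID": "4625", "User": "admin", "Source": "10.0.0.5", "Port": "22"},
    {"EventID": "4624", "User": "admin", "Source": "10.0.0.5", "Port": "22"},
]

# "Source" is excluded by the constraint attribute set.
result = frequent_itemsets(logs, ["EventID", "User", "Port"], min_support=2)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

Without the tagging, the bare value `4625` would be counted three times across the `EventID` and `Port` columns. With it, `EventID=4625` has support 2 and `Port=4625` has support 1, so only the former is treated as frequent.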