| A significant challenge for modern society and businesses is how to extract valuable information from the vast amounts of data produced by the rapid development of information technology, human production, daily life, and learning, among other fields. One of the most active areas of data mining research is high utility itemset mining. It aims to discover patterns of high importance in a transaction database and to mine results that satisfy the user’s needs using the expected utility measure. This allows the combination of mined patterns to be more targeted. This work studies two issues in high utility itemset mining algorithms: negative utility values and the quantitative information associated with high utility itemsets. The main contents of the research are as follows: (1) Most previous high utility itemset mining algorithms consider only the positive utility of items in the transaction database. However, practical applications also involve items with negative utility values. If items with negative utility values appear in the database, such a mining algorithm incorrectly prunes high utility itemsets, so that only an incomplete set of high utility itemsets is found. The IHUMN algorithm is proposed to address negative utility values in transaction databases. The algorithm makes full use of memory resources through an improved utility-list buffer structure and an early filtering strategy, which avoid spending memory on constructing utility lists for low utility itemsets. The algorithm also introduces a transitive extension pruning method that accounts for negative utility values. It adds the utility of an itemset to that of its transitive extension itemsets and compares the sum with the minimum utility threshold to eliminate low utility itemsets, which prevents low utility itemsets from being misjudged as high utility itemsets. In 
addition, the algorithm compresses the initial list using a utility-list coverage strategy, which reduces memory consumption. Experiments show that the proposed algorithm prunes a large number of low utility itemsets and performs well in terms of running time, memory consumption, and number of visited nodes. (2) While high utility itemset mining algorithms can mine useful patterns, they do not provide the associated quantitative information. The quantitative information of itemsets can give users a more accurate basis for decision-making and produce more commercial value in real-life applications. The HUQI-LC algorithm is proposed to address the lack of quantitative information in high utility itemset mining. By introducing a length constraint, the algorithm limits the length of the itemsets, filters out ineligible itemsets, and mines high utility quantitative itemsets that better match the user’s needs. The algorithm uses the revised transaction weighted utility (RTWU) and the revised remaining utility to calculate upper bounds on the utility of itemsets, and this more stringent upper bound is used to prune itemsets whose RTWU value is less than the minimum utility threshold. The revised remaining utility further shrinks the algorithm’s search space, which improves its efficiency. In addition, the algorithm uses a new utility-list structure to store and read itemset information, which improves the algorithm’s data processing and calculation. Experiments on various datasets demonstrate that the pruning method of HUQI-LC can improve the algorithm’s running time and decrease its memory consumption. |
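
The pruning problem caused by negative utility values can be illustrated with a toy example. The sketch below is not the IHUMN or HUQI-LC algorithm. It only assumes the standard definitions of itemset utility and transaction weighted utility (TWU), and shows that TWU stops being an upper bound once an item has negative utility. The data, the function names, and the "positive-only" variant of TWU (a remedy commonly used in the literature, not necessarily the RTWU definition used in this work) are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy data: unit profit per item; item "c" has a negative utility.
PROFIT: Dict[str, int] = {"a": 5, "b": 2, "c": -4}

# Each transaction maps an item to its purchased quantity.
DB: List[Dict[str, int]] = [
    {"a": 1, "c": 2},
    {"a": 2, "b": 1, "c": 1},
    {"b": 3},
]


def item_utility(item: str, tx: Dict[str, int]) -> int:
    """Utility of one item in one transaction: quantity * unit profit."""
    return PROFIT[item] * tx[item]


def utility(itemset: FrozenSet[str], db: List[Dict[str, int]]) -> int:
    """Exact utility of an itemset: summed over transactions containing it."""
    return sum(
        sum(item_utility(i, tx) for i in itemset)
        for tx in db
        if itemset.issubset(tx)
    )


def twu(itemset: FrozenSet[str], db: List[Dict[str, int]]) -> int:
    """Classic TWU: sum of full transaction utilities (negatives included)."""
    return sum(
        sum(item_utility(i, tx) for i in tx)
        for tx in db
        if itemset.issubset(tx)
    )


def positive_only_twu(itemset: FrozenSet[str], db: List[Dict[str, int]]) -> int:
    """Assumed remedy: count only positive item utilities in each transaction."""
    return sum(
        sum(max(item_utility(i, tx), 0) for i in tx)
        for tx in db
        if itemset.issubset(tx)
    )


if __name__ == "__main__":
    x = frozenset({"a"})
    min_util = 10  # hypothetical minimum utility threshold
    print("utility:", utility(x, DB))
    print("TWU:", twu(x, DB), "pruned:", twu(x, DB) < min_util)
    print("positive-only TWU:", positive_only_twu(x, DB))
```

With this data, the itemset {a} has utility 15 and is therefore a high utility itemset for a threshold of 10, but its classic TWU is only 5, so TWU-based pruning would discard it. The positive-only bound is 17, which stays above the true utility and keeps the itemset in the search space.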