IJMTES – VIRTUOUS KNACK FOR MINING HIGH UTILITY ITEM SET VIA FP GROWTH ALGORITHM

Journal Title : International Journal of Modern Trends in Engineering and Science

Author’s Name : R Nandhini

Volume 03 Issue 06 2016

ISSN no: 2348-3121

Page no: 112-115

Abstract – High utility item set mining is an emerging topic in data mining. It refers to discovering all item sets whose utility meets a user-specified minimum utility threshold. Setting this threshold, however, is a tedious process, and discovering every item set that satisfies it is difficult. To overcome these problems, the proposed method uses frequent item set mining, which finds sets of items that frequently appear together in a database. A variety of algorithms have been proposed for mining frequent item sets. The proposed method implements the PFP growth algorithm, which combines a pre-processing phase that improves the utility-privacy trade-off with a novel splitting algorithm that transforms the database. To improve the utility-privacy trade-off, long transactions are split rather than truncated: the database is transformed by dividing each long transaction into multiple subsets, each of which meets the maximal length constraint. The method achieves high time efficiency, high data utility and a high degree of privacy.
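
The splitting idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's splitting algorithm, which the abstract does not specify. It assumes the simplest possible strategy (cutting each long transaction into consecutive chunks), and the names `split_transaction`, `transform_database`, `frequent_itemsets`, `max_len` and `min_support` are hypothetical, introduced only for illustration.

```python
from collections import Counter
from itertools import combinations


def split_transaction(items, max_len):
    """Split one transaction into consecutive chunks of at most max_len items.

    Assumption: naive consecutive chunking, chosen only for illustration.
    """
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    items = list(items)
    return [items[i:i + max_len] for i in range(0, len(items), max_len)]


def transform_database(database, max_len):
    """Replace every long transaction by its chunks; short ones pass through."""
    transformed = []
    for transaction in database:
        transformed.extend(split_transaction(transaction, max_len))
    return transformed


def frequent_itemsets(database, min_support, size):
    """Count itemsets of the given size and keep those meeting min_support."""
    counts = Counter()
    for transaction in database:
        for itemset in combinations(sorted(set(transaction)), size):
            counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Toy database: the first transaction exceeds the length limit of 3.
database = [
    ["a", "b", "c", "d", "e"],
    ["a", "b"],
    ["b", "c", "d"],
]
split_db = transform_database(database, max_len=3)
pairs_before = frequent_itemsets(database, min_support=2, size=2)
pairs_after = frequent_itemsets(split_db, min_support=2, size=2)
```

In this toy run every transaction in `split_db` has at most three items, but the pair `("c", "d")` is frequent in the original database and no longer frequent after splitting, because the naive cut separates the two items. This is the kind of information loss a carefully designed splitting step would aim to limit, while truncation would discard the items `d` and `e` of the first transaction entirely.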

Keywords – Utility Mining, Data Mining, High Utility Item Set, Closed High Utility Item Set, Minimum Utility Threshold, FP tree
