EFFICIENT SUPPORT COUNTING OF CANDIDATE ITEMSETS FOR ASSOCIATION RULE MINING

Li-Xuan Lin, Don-Lin Yang, Chia-Han Yang, Jungpin Wu

2008

Abstract

Association rule mining has attracted great attention in recent years due to its broad applications. Influential algorithms fall into two categories: (1) the candidate-generation-and-test approach, such as Apriori, and (2) the pattern-growth approach, such as FP-growth. However, they all suffer from two problems: they scan the database multiple times, and they require a minimum support threshold to be set so that infrequent candidates can be pruned for efficiency. Reading the database multiple times is a critical problem for distributed data mining. Newer methods such as the FSE algorithm have been proposed, but they still take too much space. We propose an efficient approach that uses a transformation method to count the support of candidate itemsets. We record every itemset that appears at least once in the transaction database, so users do not need to determine the minimum support in advance. Our approach reaches the same goal as the FSE algorithm with better space utilization. Experiments show that our approach is effective and efficient on various datasets.
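
To make the idea of recording every occurring itemset concrete, the following Python sketch counts the support of all itemsets in a single pass over the transactions and applies a threshold only afterwards. It is a naive illustration, not the transformation method proposed in the paper, whose details the abstract does not give. The function names `count_all_itemsets` and `frequent_itemsets` and the toy data are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def count_all_itemsets(transactions):
    """Count the support of every itemset that occurs in at least one transaction.

    Hypothetical helper for illustration: it enumerates all non-empty subsets
    of each transaction, so the cost grows exponentially with transaction length.
    """
    support = Counter()
    for transaction in transactions:
        # Sort and deduplicate so each itemset has one canonical tuple form.
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                support[itemset] += 1
    return support


def frequent_itemsets(support, min_count):
    """Filter by a minimum support count chosen after counting; no rescan needed."""
    return {itemset: n for itemset, n in support.items() if n >= min_count}


# Toy transaction database (assumed data, not taken from the paper).
transactions = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "c"],
]

support = count_all_itemsets(transactions)
for itemset, count in sorted(support.items()):
    print(itemset, count)
```

Because every itemset that occurs is stored with its count, a call such as `frequent_itemsets(support, 2)` can be repeated with different thresholds without reading the transactions again. The price is memory, which is the trade-off the abstract raises when comparing against the FSE algorithm.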



Paper Citation


in Harvard Style

Lin L., Yang D., Yang C. and Wu J. (2008). EFFICIENT SUPPORT COUNTING OF CANDIDATE ITEMSETS FOR ASSOCIATION RULE MINING. In Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT, ISBN 978-989-8111-53-1, pages 180-185. DOI: 10.5220/0001888101800185


in Bibtex Style

@conference{icsoft08,
author={Li-Xuan Lin and Don-Lin Yang and Chia-Han Yang and Jungpin Wu},
title={EFFICIENT SUPPORT COUNTING OF CANDIDATE ITEMSETS FOR ASSOCIATION RULE MINING},
booktitle={Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT},
year={2008},
pages={180-185},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001888101800185},
isbn={978-989-8111-53-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT
TI - EFFICIENT SUPPORT COUNTING OF CANDIDATE ITEMSETS FOR ASSOCIATION RULE MINING
SN - 978-989-8111-53-1
AU - Lin L.
AU - Yang D.
AU - Yang C.
AU - Wu J.
PY - 2008
SP - 180
EP - 185
DO - 10.5220/0001888101800185