Classification Model using Contrast Patterns

Hiroyuki Morita, Mao Nishiguchi

2013

Abstract

A frequent pattern that occurs in a database can be an interesting explanatory variable. For instance, in market basket analysis, frequent patterns are used to derive association rules from historical purchasing data. Moreover, specific kinds of frequent patterns, such as emerging patterns and contrast patterns, are a promising way to estimate classes in a classification problem. A classification model using emerging patterns, Classification by Aggregating Emerging Patterns (CAEP), has been proposed (Dong et al., 1999), and several applications have been reported. It is a simple and effective method, but for some practical data it can be computationally costly to enumerate large emerging patterns, and some cases may be left unpredicted. We think that there are two major reasons for this. One is the emerging patterns themselves: they are powerful when constructing a predictive model, but they are not able to cover frequent transactions. Because of this, some transactions are not estimated, and the accuracy of the estimation becomes poor. The other reason is the normalization method. In CAEP, scores for each class are normalized by dividing by the median. It is a simple method, but the score distribution is sometimes biased. Instead, we propose the use of the z-score for normalization. In this paper, we propose a new CAEP-based classification model, Classification by Aggregating Contrast Patterns (CACP). The main idea is to use contrast patterns instead of emerging patterns and to improve the normalization method. Our computational experiments show that our method, CACP, performs better than the existing CAEP method on real data.
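The difference between the two normalization schemes mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the class labels "A" and "B", and all score values are hypothetical, and the aggregation step that produces the raw per-class scores is assumed to have already run. The sketch only shows how dividing by the median and taking a z-score can rank the same raw scores differently when one class has a skewed score distribution.

```python
from statistics import mean, median, pstdev


def normalize_by_median(score, training_scores):
    """Divide a raw class score by the median training score of that class."""
    base = median(training_scores)
    return score / base if base else 0.0


def normalize_by_zscore(score, training_scores):
    """Standardize a raw class score with the class's training mean and std."""
    mu = mean(training_scores)
    sigma = pstdev(training_scores)  # population standard deviation
    return (score - mu) / sigma if sigma else 0.0


def predict(raw_scores, training_scores, normalize):
    """Return the class with the highest normalized score, plus all scores."""
    normalized = {
        label: normalize(raw_scores[label], training_scores[label])
        for label in raw_scores
    }
    best = max(normalized, key=normalized.get)
    return best, normalized


# Hypothetical aggregated scores of training instances, per class.
# Class "A" has one large outlier, so its distribution is skewed.
training_scores = {
    "A": [1.0, 2.0, 3.0, 4.0, 50.0],
    "B": [8.0, 9.0, 10.0, 11.0, 12.0],
}

# Hypothetical raw scores of one unseen instance for each class.
raw_scores = {"A": 6.0, "B": 11.0}

median_label, median_scores = predict(raw_scores, training_scores, normalize_by_median)
zscore_label, zscore_scores = predict(raw_scores, training_scores, normalize_by_zscore)

print("median normalization:", median_label, median_scores)
print("z-score normalization:", zscore_label, zscore_scores)
```

With these made-up numbers, median normalization favors class "A" while z-score normalization favors class "B", because the outlier in class "A" shifts its mean and spread but not its median. Which behavior is preferable on real data is a question for the paper's experiments, not for this sketch.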

References

  1. Bay, S. D. and Pazzani, M. J. (1999). Detecting change in categorical data: Mining contrast sets. In Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining, pages 302-306. ACM Press.
  2. Dong, G., Zhang, X., Wong, L., and Li, J. (1999). CAEP: Classification by aggregating emerging patterns. In Arikawa, S. and Furukawa, K., editors, Discovery Science, volume 1721 of Lecture Notes in Computer Science, pages 30-42. Springer Berlin Heidelberg.
  3. Morita, H. and Hamuro, Y. (2013). A classification model using emerging patterns incorporating item taxonomy. In Gaol, F. L., editor, Recent Progress in Data Engineering and Internet Technology, volume 156 of Lecture Notes in Electrical Engineering, pages 187-192. Springer Berlin Heidelberg.
  4. Takizawa, A., Koo, W., and Katoh, N. (2010). Discovering distinctive spatial patterns of snatch theft in Kyoto City with CAEP. Journal of Asian Architecture and Building Engineering, 9(1):103-110.
  5. Uno, T., Asai, T., Uchida, Y., and Arimura, H. (2003). LCM: An efficient algorithm for enumerating frequent closed item sets. In Proceedings of Workshop on Frequent Itemset Mining Implementations (FIMI03).


Paper Citation


in Harvard Style

Morita H. and Nishiguchi M. (2013). Classification Model using Contrast Patterns. In Proceedings of the 15th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-8565-59-4, pages 334-339. DOI: 10.5220/0004569403340339


in Bibtex Style

@conference{iceis13,
author={Hiroyuki Morita and Mao Nishiguchi},
title={Classification Model using Contrast Patterns},
booktitle={Proceedings of the 15th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2013},
pages={334-339},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004569403340339},
isbn={978-989-8565-59-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 15th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Classification Model using Contrast Patterns
SN - 978-989-8565-59-4
AU - Morita H.
AU - Nishiguchi M.
PY - 2013
SP - 334
EP - 339
DO - 10.5220/0004569403340339