# Similarity Assessment as a Dual Process Model of Counting and Measuring

### Bert Klauninger, Horst Eidenberger

#### Abstract

Based on recent findings from the field of human similarity perception, we propose a dual process model (DPM) of taxonomic and thematic similarity assessment that can be utilised in machine learning applications. Taxonomic reasoning is related to predicate-based measures (counting), whereas thematic reasoning is mostly associated with metric distances (measuring). We suggest a procedure that combines both processes into a single similarity kernel. For each feature dimension of the observational data, an optimal measure is selected by a greedy algorithm: a set of candidate measures is tested, and the one that improves the classification performance of the whole model is retained. These measures are combined into a single SVM kernel by means of generalisation (converting distances into similarities) and quantisation (applying predicate-based measures to interval-scale data). We then demonstrate how to apply our model to a classification problem of MPEG-7 features from a test set of images. Evaluation shows that the performance of the DPM kernel is superior to that of the standard SVM kernels. This supports our theory that the DPM comes closer to human similarity judgment than any single measure, and it motivates our suggestion to employ the DPM not only in image retrieval but also in related tasks.
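The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: generalisation is taken to be an exponential decay of the absolute difference, quantisation is a fixed threshold followed by an agreement count, the per-dimension similarities are averaged, and a leave-one-out most-similar-neighbour classifier stands in for the SVM. All function names and the toy data are hypothetical.

```python
import math

def metric_similarity(a, b):
    """Generalisation (assumed form): map an absolute distance to a similarity in (0, 1]."""
    return math.exp(-abs(a - b))

def predicate_similarity(a, b, threshold=0.5):
    """Quantisation (assumed form): binarise both values, then score agreement as 1.0 or 0.0."""
    return 1.0 if (a >= threshold) == (b >= threshold) else 0.0

# Candidate measures per feature dimension: "measuring" vs. "counting".
CANDIDATES = {"measuring": metric_similarity, "counting": predicate_similarity}

def combined_similarity(x, y, chosen):
    """Average the per-dimension similarities, using one measure name per dimension."""
    return sum(CANDIDATES[m](a, b) for m, a, b in zip(chosen, x, y)) / len(chosen)

def loo_accuracy(X, labels, chosen):
    """Leave-one-out accuracy of a most-similar-neighbour classifier (stand-in for the SVM)."""
    hits = 0
    for i, x in enumerate(X):
        best = max((j for j in range(len(X)) if j != i),
                   key=lambda j: combined_similarity(x, X[j], chosen))
        hits += labels[best] == labels[i]
    return hits / len(X)

def greedy_select(X, labels, default="measuring"):
    """Visit each dimension once; keep a candidate measure only if overall accuracy improves."""
    chosen = [default] * len(X[0])
    best_acc = loo_accuracy(X, labels, chosen)
    for d in range(len(chosen)):
        for name in CANDIDATES:
            trial = chosen[:d] + [name] + chosen[d + 1:]
            acc = loo_accuracy(X, labels, trial)
            if acc > best_acc:
                chosen, best_acc = trial, acc
    return chosen, best_acc

# Hypothetical toy data: two feature dimensions, two classes.
X = [(0.1, 0.2), (0.4, 0.9), (0.6, 0.1), (0.9, 0.8)]
labels = [0, 0, 1, 1]
baseline = loo_accuracy(X, labels, ["measuring", "measuring"])
chosen, accuracy = greedy_select(X, labels)
print(chosen, accuracy)
```

Because a change is kept only when accuracy strictly improves, the selected combination never scores below the all-metric baseline on the data used for selection. In a real setting the combined similarity would be supplied to an SVM as a precomputed kernel and evaluated on held-out data.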


#### Paper Citation

#### in Harvard Style

Klauninger B. and Eidenberger H. (2016). **Similarity Assessment as a Dual Process Model of Counting and Measuring**. In *Proceedings of the 5th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM,* ISBN 978-989-758-173-1, pages 141-147. DOI: 10.5220/0005655801410147

#### in Bibtex Style

@conference{icpram16,

author={Bert Klauninger and Horst Eidenberger},

title={Similarity Assessment as a Dual Process Model of Counting and Measuring},

booktitle={Proceedings of the 5th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},

year={2016},

pages={141-147},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0005655801410147},

isbn={978-989-758-173-1},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the 5th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM

TI - Similarity Assessment as a Dual Process Model of Counting and Measuring

SN - 978-989-758-173-1

AU - Klauninger B.

AU - Eidenberger H.

PY - 2016

SP - 141

EP - 147

DO - 10.5220/0005655801410147