Non-interactive Privacy-preserving k-NN Classifier

Hilder V. L. Pereira, Diego F. Aranha

Abstract

Machine learning tasks typically require large amounts of sensitive data to be shared, which is notoriously intrusive in terms of privacy. Outsourcing this computation to the cloud requires the server to be trusted, introducing an unrealistic security assumption and a high risk of abuse or data breaches. In this paper, we propose privacy-preserving versions of the k-NN classifier that operate over encrypted data by combining order-preserving encryption and homomorphic encryption. In our experiments, the privacy-preserving variant achieves the same accuracy as the conventional k-NN classifier, but at a considerable performance cost. This penalty nevertheless remains acceptable for practical use in sensitive applications, given the additional security properties the approach provides. In particular, the cloud server need not be trusted beyond correct execution of the protocol: it runs the algorithm over encrypted data and encrypted classes. As a result, the cloud server never learns the real dataset values, the number of classes, the query vectors, or their classification.
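
To make the idea of voting over encrypted classes concrete, the following is a minimal sketch and not the protocol from the paper. It assumes a toy Paillier cryptosystem (additively homomorphic) built from two small, fixed Mersenne primes, which is insecure and for illustration only. It also assumes that each class label j is packed as base**j with base = k + 1, so that a single ciphertext carries the whole tally and its size does not reveal the number of classes. The indices of the k nearest neighbours are taken as given; how they are found over order-preserving ciphertexts is not shown. All function names are hypothetical. Requires Python 3.8 or later for the modular inverse via pow.

```python
import math
import random


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key pair from two fixed Mersenne primes (insecure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Enc(m) = (1 + m*n) * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, sk, c):
    """Dec(c) = L(c^lam mod n^2) * mu mod n, with L(u) = (u - 1) // n."""
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n


def add(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)


def decode_votes(total, base, num_classes):
    """Read per-class vote counts as the base-`base` digits of the tally."""
    votes = []
    for _ in range(num_classes):
        total, digit = divmod(total, base)
        votes.append(digit)
    return votes


# Client side: encrypt the packed class label of every training point.
n, sk = keygen()
k = 3
base = k + 1                      # each digit can hold up to k votes
labels = [2, 0, 2, 1, 2]          # made-up labels for five training points
enc_labels = [encrypt(n, base ** label) for label in labels]

# Server side: combine the encrypted labels of the k nearest neighbours.
# The indices are assumed to be already known; the server sees only ciphertexts.
neighbour_idx = [0, 2, 3]
enc_tally = enc_labels[neighbour_idx[0]]
for i in neighbour_idx[1:]:
    enc_tally = add(n, enc_tally, enc_labels[i])

# Client side: decrypt the tally and take the majority class.
num_classes = 3
votes = decode_votes(decrypt(n, sk, enc_tally), base, num_classes)
predicted = max(range(num_classes), key=votes.__getitem__)
print(votes, predicted)
```

In this sketch only the client, who holds the secret key and knows how many classes exist, can unpack the tally; the server performs nothing but modular multiplications on ciphertexts.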


Paper Citation


in Harvard Style

Pereira, H. V. L. and Aranha, D. F. (2017). Non-interactive Privacy-preserving k-NN Classifier. In Proceedings of the 3rd International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-209-7, pages 362-371. DOI: 10.5220/0006187703620371


in Bibtex Style

@conference{icissp17,
author={Hilder V. L. Pereira and Diego F. Aranha},
title={Non-interactive Privacy-preserving k-NN Classifier},
booktitle={Proceedings of the 3rd International Conference on Information Systems Security and Privacy - Volume 1: ICISSP},
year={2017},
pages={362-371},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006187703620371},
isbn={978-989-758-209-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Information Systems Security and Privacy - Volume 1: ICISSP
TI - Non-interactive Privacy-preserving k-NN Classifier
SN - 978-989-758-209-7
AU - Pereira, Hilder V. L.
AU - Aranha, Diego F.
PY - 2017
SP - 362
EP - 371
DO - 10.5220/0006187703620371