IMPROVING THE PERFORMANCE OF THE SUPPORT VECTOR MACHINE IN INSURANCE RISK CLASSIFICATION - A Comparative Study

Mlungisi Duma, Bhekisipho Twala, Tshilidzi Marwala, Fulufhelo V. Nelwamondo

2011

Abstract

The support vector machine is a classification technique used for complex linear and non-linear problems. It has been shown that, in the insurance domain, the performance of the technique decreases significantly as the amount of missing data grows. Furthermore, the technique shows weak resilience as the quality of the data deteriorates. When dealing with missing data, the support vector machine uses the mean-mode strategy to replace missing values. In this paper, we propose the autoassociative network and the genetic algorithm as alternative strategies to improve the classification performance and increase the resilience of the technique. A comparative study is conducted to determine which of the techniques better helps the support vector machine improve its performance and sustain its resilience. Training data with completely observed values is used to construct the support vector machine, and testing data with missing values is used to measure its accuracy. The results show that both models help increase resilience, with the autoassociative network showing the better overall performance improvement.
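
The mean-mode baseline named in the abstract can be illustrated with a minimal sketch. It assumes that missing entries are encoded as None and that one fill value per column is learned from the complete training data (mean for numeric columns, mode for categorical ones). The function names and the toy insurance columns are hypothetical and are not taken from the paper.

```python
from statistics import mean, mode


def fit_mean_mode(rows, categorical):
    """Learn one fill value per column from fully observed training rows.

    `categorical` is the set of column indices treated as categorical
    (filled with the mode); all other columns are filled with the mean.
    """
    fills = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        fills.append(mode(column) if j in categorical else mean(column))
    return fills


def impute(rows, fills):
    """Replace each missing entry (None) with the fill value of its column."""
    return [
        [fills[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy data: columns are (age, vehicle_type).
train = [[25, "sedan"], [40, "suv"], [31, "sedan"]]   # completely observed
test = [[None, "suv"], [52, None]]                    # contains missing values

fills = fit_mean_mode(train, categorical={1})
imputed = impute(test, fills)
```

The imputed test rows would then be passed to the trained classifier. Every missing value in a column receives the same fill regardless of the other attributes in its row, which is one plausible reason to consider model-based alternatives such as those the abstract proposes.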



Paper Citation


in Harvard Style

Duma M., Twala B., Marwala T. and Nelwamondo F. V. (2011). IMPROVING THE PERFORMANCE OF THE SUPPORT VECTOR MACHINE IN INSURANCE RISK CLASSIFICATION - A Comparative Study. In Proceedings of the International Conference on Neural Computation Theory and Applications - Volume 1: NCTA, (IJCCI 2011) ISBN 978-989-8425-84-3, pages 340-346. DOI: 10.5220/0003673803400346


in Bibtex Style

@conference{ncta11,
author={Mlungisi Duma and Bhekisipho Twala and Tshilidzi Marwala and Fulufhelo V. Nelwamondo},
title={IMPROVING THE PERFORMANCE OF THE SUPPORT VECTOR MACHINE IN INSURANCE RISK CLASSIFICATION - A Comparative Study},
booktitle={Proceedings of the International Conference on Neural Computation Theory and Applications - Volume 1: NCTA, (IJCCI 2011)},
year={2011},
pages={340-346},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003673803400346},
isbn={978-989-8425-84-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Neural Computation Theory and Applications - Volume 1: NCTA, (IJCCI 2011)
TI - IMPROVING THE PERFORMANCE OF THE SUPPORT VECTOR MACHINE IN INSURANCE RISK CLASSIFICATION - A Comparative Study
SN - 978-989-8425-84-3
AU - Duma M.
AU - Twala B.
AU - Marwala T.
AU - Nelwamondo F. V.
PY - 2011
SP - 340
EP - 346
DO - 10.5220/0003673803400346