TOWARDS A COMBINED APPROACH TO FEATURE SELECTION

Camelia Vidrighin Bratu, Rodica Potolea

2008

Abstract

Feature selection is an important step in any data mining process. In this paper we take improved prediction accuracy as the main goal of a feature selection method. We focus on an existing 3-step formalism consisting of a generation procedure, an evaluation function and a validation procedure. Our performance evaluations show that no single 3-tuple (generation procedure, evaluation function, validation procedure) achieves the best performance on every dataset and with every learning algorithm. Moreover, the experimental results suggest that a combined approach to the feature selection problem is feasible. So far we have experimented with combining several generation procedures, but we believe that evaluation functions can also be combined successfully.
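To make the 3-step formalism concrete, the following is a minimal Python sketch, not the authors' implementation. Everything specific in it is an assumption chosen for illustration: greedy forward and backward search as the generation procedures, leave-one-out 1-nearest-neighbour accuracy as the evaluation function, held-out accuracy as the validation procedure, and "keep the best-scoring subset" as one possible way of combining generation procedures. The toy data and all function names are hypothetical.

```python
from functools import partial
from typing import Callable, FrozenSet, List, Sequence

Row = Sequence[float]
Subset = FrozenSet[int]
Evaluator = Callable[[Subset], float]
Generator = Callable[[int, Evaluator], Subset]


def sq_dist(a: Row, b: Row, subset: Subset) -> float:
    """Squared Euclidean distance restricted to the features in `subset`."""
    return sum((a[f] - b[f]) ** 2 for f in subset)


def loo_1nn_accuracy(X: List[Row], y: List[int], subset: Subset) -> float:
    """Evaluation function (assumed): leave-one-out accuracy of 1-NN."""
    if not subset:
        return 0.0
    hits = 0
    for i, xi in enumerate(X):
        others = [j for j in range(len(X)) if j != i]
        nearest = min(others, key=lambda j: sq_dist(xi, X[j], subset))
        hits += y[nearest] == y[i]
    return hits / len(X)


def forward_generation(n_features: int, evaluate: Evaluator) -> Subset:
    """Generation procedure (assumed): greedy forward search from the empty set."""
    current: Subset = frozenset()
    score = evaluate(current)
    while len(current) < n_features:
        candidates = [current | {f} for f in range(n_features) if f not in current]
        best = max(candidates, key=evaluate)
        if evaluate(best) <= score:  # stop when no candidate improves the score
            break
        current, score = best, evaluate(best)
    return current


def backward_generation(n_features: int, evaluate: Evaluator) -> Subset:
    """Generation procedure (assumed): greedy backward elimination from the full set."""
    current: Subset = frozenset(range(n_features))
    score = evaluate(current)
    while len(current) > 1:
        candidates = [current - {f} for f in sorted(current)]
        best = max(candidates, key=evaluate)
        if evaluate(best) < score:  # stop when every removal hurts the score
            break
        current, score = best, evaluate(best)
    return current


def combined_selection(n_features: int, evaluate: Evaluator,
                       generators: List[Generator]) -> Subset:
    """Hypothetical combination: run every generator, keep the best-scoring
    subset, and prefer the smaller subset when scores are tied."""
    subsets = [g(n_features, evaluate) for g in generators]
    return max(subsets, key=lambda s: (evaluate(s), -len(s)))


def validate(X_train: List[Row], y_train: List[int],
             X_test: List[Row], y_test: List[int], subset: Subset) -> float:
    """Validation procedure (assumed): 1-NN accuracy on held-out data."""
    hits = 0
    for x, label in zip(X_test, y_test):
        nearest = min(range(len(X_train)),
                      key=lambda j: sq_dist(x, X_train[j], subset))
        hits += y_train[nearest] == label
    return hits / len(X_test)


# Toy data (made up): feature 0 separates the classes, features 1 and 2 are noise.
X_train = [[0.0, 5.0, 1.0], [0.1, -4.0, 9.0], [0.2, 3.0, -7.0], [0.3, -6.0, 2.0],
           [1.0, 4.5, 8.5], [1.1, -5.5, 1.5], [1.2, 2.5, -6.5], [1.3, -3.5, 2.5]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
X_test = [[0.15, 100.0, -100.0], [1.15, -100.0, 100.0]]
y_test = [0, 1]

evaluate = partial(loo_1nn_accuracy, X_train, y_train)
selected = combined_selection(3, evaluate, [forward_generation, backward_generation])
accuracy = validate(X_train, y_train, X_test, y_test, selected)
print(sorted(selected), accuracy)
```

Because each generation procedure only needs the evaluation function as a callable, further procedures can be added to the list without touching the rest of the sketch. Combining evaluation functions, which the abstract mentions as a possibility, is not attempted here.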



Paper Citation


in Harvard Style

Vidrighin Bratu C. and Potolea R. (2008). TOWARDS A COMBINED APPROACH TO FEATURE SELECTION. In Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT, ISBN 978-989-8111-53-1, pages 134-139. DOI: 10.5220/0001878401340139


in Bibtex Style

@conference{icsoft08,
author={Camelia Vidrighin Bratu and Rodica Potolea},
title={TOWARDS A COMBINED APPROACH TO FEATURE SELECTION},
booktitle={Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT},
year={2008},
pages={134-139},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001878401340139},
isbn={978-989-8111-53-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT
TI - TOWARDS A COMBINED APPROACH TO FEATURE SELECTION
SN - 978-989-8111-53-1
AU - Vidrighin Bratu C.
AU - Potolea R.
PY - 2008
SP - 134
EP - 139
DO - 10.5220/0001878401340139