Classifying Incomplete Vectors using Decision Trees

Bhekisipho Twala, Raj Pillay, Ramapulana Nkoana

Abstract

This paper addresses the problem of classifying incomplete vectors using decision trees. The central proposal is that, in supervised learning, the classification of incomplete vectors can be improved in probabilistic terms. The approach is based on the a priori probability of each attribute value, estimated from the instances at the relevant node of the tree that have specified values. It first exploits the total probability and Bayes’ theorems and then the probit and logit model probabilities. The proposed approach, developed in three versions, is evaluated on 21 machine learning datasets in terms of its tolerance of incomplete test data. Experimental results are reported that show the effectiveness of the proposed approach in comparison with multiple imputation and the fractioning of instances strategy.
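
To make the total probability step concrete, the following is a minimal sketch and not the authors' implementation. It assumes a test vector reaches a node whose split attribute is missing, estimates the a priori probability of each attribute value from the node's instances that have a specified value, and then averages the class distributions of the branches by those priors. The function names, the attribute "colour", and the per-branch class probabilities are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def value_priors(node_instances, attribute):
    """Estimate P(attribute = v) from the instances at a node whose
    value for `attribute` is specified (None marks a missing value)."""
    known = [inst[attribute] for inst in node_instances
             if inst.get(attribute) is not None]
    total = len(known)
    if total == 0:
        return {}  # no specified values at this node, so no priors
    counts = Counter(known)
    return {value: count / total for value, count in counts.items()}


def class_probabilities(node_instances, attribute, branch_class_probs):
    """Total probability theorem: P(c) = sum over v of P(v) * P(c | v),
    where P(c | v) is the class distribution reached via branch v."""
    priors = value_priors(node_instances, attribute)
    result = {}
    for value, p_value in priors.items():
        for label, p_label in branch_class_probs[value].items():
            result[label] = result.get(label, 0.0) + p_value * p_label
    return result


# Hypothetical node: three instances with a known colour, one without.
node = [
    {"colour": "red"},
    {"colour": "red"},
    {"colour": "blue"},
    {"colour": None},
]

# Hypothetical class distributions at the end of each branch.
branch = {
    "red": {"yes": 0.9, "no": 0.1},
    "blue": {"yes": 0.2, "no": 0.8},
}

probs = class_probabilities(node, "colour", branch)
print(probs)
```

In this toy case the priors are 2/3 for "red" and 1/3 for "blue", so the incomplete vector is assigned a probability of about 0.67 for class "yes". The Bayes' theorem, probit and logit variants mentioned in the abstract are not sketched here, since the abstract does not give enough detail to reproduce them.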
