# AN HYBRID APPROACH TO FEATURE SELECTION FOR MIXED CATEGORICAL AND CONTINUOUS DATA

### Gauthier Doquire, Michel Verleysen

#### Abstract

This paper proposes a feature selection algorithm for mixed data, that is, data described by both categorical and continuous features. The algorithm ranks the categorical and the continuous features independently, then recombines the two rankings according to the accuracy of a classifier. Both ranking procedures use the popular mutual information criterion. The proposed algorithm thus avoids any similarity measure between samples described by both continuous and categorical attributes; such measures can be ill-suited to many real-world problems. It effectively detects the most useful features of each type, and its effectiveness is demonstrated experimentally on four real-world data sets.
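The rank-then-recombine idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything beyond the outline in the abstract is an assumption made for illustration: continuous features are binned so that a simple plug-in mutual information estimate applies to both types, the classifier is a small Laplace-smoothed naive Bayes, the recombination is a greedy merge of the two rankings with ties broken by mutual information, and all function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y), in nats, for two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def discretize(values, bins=4):
    """Equal-width binning (an assumption; the abstract names no estimator)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against constant columns
    return [min(int((v - lo) / width), bins - 1) for v in values]


def nb_accuracy(columns, names, y, train, test):
    """Hold-out accuracy of a Laplace-smoothed naive Bayes on discrete columns."""
    classes = sorted(set(y))
    prior = Counter(y[i] for i in train)
    counts = {f: Counter((columns[f][i], y[i]) for i in train) for f in names}
    levels = {f: len(set(columns[f])) for f in names}

    def log_score(i, c):
        s = math.log(prior[c] + 1)
        for f in names:
            s += math.log((counts[f][(columns[f][i], c)] + 1)
                          / (prior[c] + levels[f]))
        return s

    hits = sum(max(classes, key=lambda c: log_score(i, c)) == y[i] for i in test)
    return hits / len(test)


def merge_rankings(columns, scores, cat_rank, cont_rank, y, train, test):
    """Greedy merge (assumed): take the head of whichever ranking gives the
    higher accuracy when added; ties are broken by mutual information."""
    cat, cont, selected = list(cat_rank), list(cont_rank), []
    while cat or cont:
        heads = [q for q in (cat, cont) if q]

        def gain(queue):
            name = queue[0]
            acc = nb_accuracy(columns, selected + [name], y, train, test)
            return (acc, scores[name])

        selected.append(max(heads, key=gain).pop(0))
    return selected


# Hypothetical toy data: one informative and one uninformative feature per type.
y = [0, 0, 0, 0, 1, 1, 1, 1]
categorical = {"colour": list("rrrrbbbb"), "shape": list("stststst")}
continuous = {"size": [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],
              "noise": [0.5, 2.5, 0.6, 2.4, 0.4, 2.6, 0.5, 2.5]}

# Rank each feature type independently with mutual information.
binned = {name: discretize(col) for name, col in continuous.items()}
cat_scores = {n: mutual_information(c, y) for n, c in categorical.items()}
cont_scores = {n: mutual_information(c, y) for n, c in binned.items()}
cat_rank = sorted(cat_scores, key=cat_scores.get, reverse=True)
cont_rank = sorted(cont_scores, key=cont_scores.get, reverse=True)

# Recombine the two rankings according to classifier accuracy.
columns = {**categorical, **binned}
scores = {**cat_scores, **cont_scores}
train, test = [0, 2, 4, 6], [1, 3, 5, 7]
merged = merge_rankings(columns, scores, cat_rank, cont_rank, y, train, test)
print(merged)
```

On this toy data the two informative features (`colour` and `size`) are selected before the two uninformative ones. Note that the sketch never computes a distance between two mixed-type samples: each feature is scored against the class label on its own, and the classifier only sees per-feature counts.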
