
Figure 3: Second KAN layer plot after training on the Click-Through Rate dataset (Aden and Wang, 2012).
6 CONCLUSION AND FUTURE WORK
This paper presents AMAKAN, a fully interpretable version of the Adaptive Multiscale Deep Neural Network architecture for tabular data classification, obtained by replacing the last two dense layers of the original model with Kolmogorov–Arnold Network layers. This enhancement exploits the inherent flexibility of Kolmogorov–Arnold Networks, which replace fixed activations with learnable spline-based functions, thereby offering explicit insight into the feature transformations at every stage of the network.
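As a rough illustration of the idea (not the paper's implementation), a KAN-style layer can be sketched in a few lines: every input-output edge carries its own learnable univariate function, instead of a single scalar weight followed by a fixed activation. The sketch below uses a fixed Gaussian basis as a stand-in for the B-splines discussed here; the class name, basis choice, and parameter shapes are illustrative assumptions.

```python
import numpy as np

class KANLayer:
    """Simplified Kolmogorov-Arnold layer sketch: each (input, output) edge
    owns a learnable univariate function, parameterized here as a linear
    combination of fixed Gaussian basis functions (a stand-in for the
    B-spline parameterization used in KANs)."""

    def __init__(self, in_dim, out_dim, n_basis=8, rng=None):
        rng = rng or np.random.default_rng(0)
        self.centers = np.linspace(-1.0, 1.0, n_basis)   # basis centers on [-1, 1]
        self.width = 2.0 / (n_basis - 1)                 # basis bandwidth
        # one coefficient vector per (input, output) edge
        self.coef = rng.normal(0.0, 0.1, size=(in_dim, out_dim, n_basis))

    def _basis(self, x):
        # x: (batch, in_dim) -> basis responses (batch, in_dim, n_basis)
        d = x[..., None] - self.centers
        return np.exp(-((d / self.width) ** 2))

    def forward(self, x):
        # evaluate every edge's univariate function and sum over input dims
        phi = self._basis(x)                             # (B, I, K)
        return np.einsum("bik,iok->bo", phi, self.coef)  # (B, O)

layer = KANLayer(in_dim=4, out_dim=2)
out = layer.forward(np.zeros((3, 4)))
print(out.shape)  # (3, 2)
```

Because the nonlinearity lives in the per-edge coefficients rather than in a fixed activation, each edge's learned function can later be evaluated and plotted directly, which is the source of the interpretability discussed below.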
Empirical evaluations conducted over several datasets demonstrate that the proposed modified architecture outperforms the original Adaptive Multiscale Deep Neural Network in the majority of test cases, achieving higher F1-weighted scores. The improvement stems from the adaptivity of the Kolmogorov–Arnold Network layers, which efficiently learn complicated nonlinear mappings of the input data and thus provide a more precise feature representation than standard dense layers.
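For reference, the F1-weighted metric used in these evaluations averages per-class F1 scores weighted by class support. A minimal sketch, written in plain NumPy and intended to match the usual definition (as in scikit-learn's `f1_score(average='weighted')`); the function name and example labels are illustrative:

```python
import numpy as np

def f1_weighted(y_true, y_pred):
    """Weighted-average F1: per-class F1 scores weighted by class support."""
    classes = np.unique(y_true)
    total, score = len(y_true), 0.0
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))   # true positives for class c
        fp = np.sum((y_pred == c) & (y_true != c))   # false positives
        fn = np.sum((y_pred != c) & (y_true == c))   # false negatives
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        score += (np.sum(y_true == c) / total) * f1  # weight by class support
    return score

y_true = np.array([0, 0, 0, 1, 1, 2])
y_pred = np.array([0, 0, 1, 1, 1, 2])
print(round(f1_weighted(y_true, y_pred), 3))  # → 0.833
```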
One of the most important advantages of Kolmogorov–Arnold Networks is their increased interpretability. In contrast to dense layers, which are built on linear transformations followed by fixed activation functions, Kolmogorov–Arnold Networks employ learnable, visualizable spline functions that provide explicit descriptions of complicated nonlinear transformations. This makes the contribution of every feature to the predictions visible with considerably greater transparency.
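The visualization step can be sketched concretely: since each edge's learned transformation is just a linear combination of basis functions, it can be evaluated on a grid and plotted (as in Figure 3). The snippet below is a hypothetical illustration using a Gaussian basis rather than the actual spline parameterization; `edge_function` and all values are assumed names, not the paper's API.

```python
import numpy as np

def edge_function(coef, centers, width, xs):
    """Evaluate one edge's learned univariate function on a grid of points.

    coef: (n_basis,) learned coefficients of a single input->output edge.
    Returns the function values, ready to be plotted against xs."""
    basis = np.exp(-(((xs[:, None] - centers) / width) ** 2))  # (len(xs), n_basis)
    return basis @ coef                                        # (len(xs),)

centers = np.linspace(-1.0, 1.0, 8)
coef = np.zeros(8)
coef[4] = 1.0                       # toy coefficients: a bump near x = 1/7
xs = np.linspace(-1.0, 1.0, 101)
ys = edge_function(coef, centers, 2.0 / 7.0, xs)
# Plotting xs against ys reveals the learned feature transformation,
# which is exactly what the KAN layer plots in the paper display.
```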
Looking ahead, several promising directions for future work emerge. One is extending this hybrid architecture to regression problems, which would further establish the general usefulness and applicability of the method beyond classification. Another is applying other interpretability techniques in place of Kolmogorov–Arnold Networks for the final two layers, which may yield novel insights and tools and a deeper understanding of feature relations and interactions in the data.
Overall, the integration of Adaptive Multiscale Attention mechanisms with Kolmogorov–Arnold Networks represents a significant step towards fully interpretable, efficient, and effective deep learning models tailored to the specific needs of tabular data analysis. This hybrid solution is an attractive direction for future research, addressing the essential trade-off between model interpretability and performance in real-world machine learning applications.
REFERENCES
Aden and Wang, Y. (2012). KDD Cup 2012, Track 2. https://kaggle.com/competitions/kddcup2012-track2.
Antal, B. and Hajdu, A. (2014). Diabetic Retinopathy Debrecen. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5XP4P.
Arık, S. Ö. and Pfister, T. (2021). TabNet: Attentive interpretable tabular learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 6679–6687.
Aversano, L., Bernardi, M. L., Cimitile, M., Iammarino, M., and Verdone, C. (2023). A data-aware explainable deep learning approach for next activity prediction. Engineering Applications of Artificial Intelligence, 126.
Baldi, P., Sadowski, P., and Whiteson, D. (2014). Searching for exotic particles in high-energy physics with deep learning. Nature Communications, 5:4308.
Bozorgasl, Z. and Chen, H. (2024). Wav-KAN: Wavelet Kolmogorov-Arnold networks. arXiv preprint arXiv:2405.12832.
Breiman, L. (2001). Random forests. Machine learning,
45:5–32.
Chen, T. and Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794.
De Carlo, G., Mastropietro, A., and Anagnostopoulos, A. (2024). Kolmogorov-Arnold graph neural networks. arXiv preprint arXiv:2406.18354.
Dentamaro, V., Giglio, P., Impedovo, D., Pirlo, G., and Ciano, M. D. (2024). An interpretable adaptive multiscale attention deep neural network for tabular data. IEEE Transactions on Neural Networks and Learning Systems, pages 1–15.
Dentamaro, V., Impedovo, D., and Pirlo, G. (2018). LICIC: Less important components for imbalanced multiclass classification. Information, 9(12):317.
Dentamaro, V., Impedovo, D., and Pirlo, G. (2021). An
analysis of tasks and features for neuro-degenerative
disease assessment by handwriting. In International
Conference on Pattern Recognition, pages 536–545.
Springer.
AMAKAN: Fully Interpretable Adaptive Multiscale Attention Through Kolmogorov-Arnold Networks