Intrinsic-Dimension Analysis for Guiding Dimensionality Reduction in Multi-Omics Data

Valentina Guarino, Jessica Gliozzo, Ferdinando Clarelli, Béatrice Pignolet, Kaalindi Misra, Elisabetta Mascia, Giordano Antonino, Silvia Santoro, Laura Ferré, Miryam Cannizzaro, Melissa Sorosina, Roland Liblau, Massimo Filippi, Ettore Mosca, Federica Esposito, Giorgio Valentini, Elena Casiraghi

2023

Abstract

Multi-omics data are of paramount importance in biomedicine, providing a comprehensive view of the processes underlying disease. They are high-dimensional and are hence affected by the so-called "curse of dimensionality", which ultimately leads to unreliable estimates. This calls for effective Dimensionality Reduction (DR) techniques to embed the high-dimensional data into a lower-dimensional space. Although effective DR methods have been proposed, the high dimension of the initial dataset often requires unsupervised Feature Selection (FS) techniques to be applied beforehand. Unfortunately, both unsupervised FS and DR techniques require the dimension of the lower-dimensional space to be provided. This is a crucial choice, for which no well-accepted solution has been defined yet. The Intrinsic Dimension (ID) of a dataset is defined as the minimum number of dimensions that allow representing the data without information loss. Therefore, the ID of a dataset is related to its informativeness and complexity. In this paper, after proposing a blocking ID estimation to leverage state-of-the-art (SOTA) ID estimation methods, we present our DR pipeline, whose subsequent FS and DR steps are guided by the ID estimate.
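
To make the notion of an ID estimate concrete, the following is a minimal sketch of one generic estimator, the TwoNN maximum-likelihood estimator, which uses only the ratio between each point's second and first nearest-neighbour distances. It is an illustration under stated assumptions and is not the blocking ID estimation or the pipeline described in the abstract; the function name `twonn_id` and the synthetic data are hypothetical.

```python
import numpy as np


def twonn_id(X):
    """Generic TwoNN maximum-likelihood estimate of intrinsic dimension.

    Illustrative only: this is NOT the blocking estimator from the paper.
    X is an (n_samples, n_features) array.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances via the Gram-matrix identity.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)   # clamp tiny negative rounding errors
    np.fill_diagonal(d2, np.inf)  # exclude each point's distance to itself
    # After partitioning at index 1, columns 0 and 1 hold the smallest
    # and second-smallest squared distances of every row.
    nearest = np.partition(d2, 1, axis=1)[:, :2]
    r1 = np.sqrt(nearest[:, 0])
    r2 = np.sqrt(nearest[:, 1])
    keep = r1 > 0                 # drop exact duplicate points
    mu = r2[keep] / r1[keep]
    # The ratios mu follow a Pareto law whose exponent is the ID,
    # which gives the MLE: n / sum(log(mu)).
    return keep.sum() / np.sum(np.log(mu))


if __name__ == "__main__":
    # Hypothetical data: a 2-dimensional linear subspace embedded in 50 features.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(1000, 2))
    X = latent @ rng.normal(size=(2, 50))
    print(f"estimated ID: {twonn_id(X):.2f}")
```

In a pipeline guided by the ID, an estimate of this kind could serve as the target dimension passed to the DR step, so that the embedding size is derived from the data rather than chosen by hand.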

Paper Citation


in Harvard Style

Guarino V., Gliozzo J., Clarelli F., Pignolet B., Misra K., Mascia E., Antonino G., Santoro S., Ferré L., Cannizzaro M., Sorosina M., Liblau R., Filippi M., Mosca E., Esposito F., Valentini G. and Casiraghi E. (2023). Intrinsic-Dimension Analysis for Guiding Dimensionality Reduction in Multi-Omics Data. In Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - Volume 3: BIOINFORMATICS; ISBN 978-989-758-631-6, SciTePress, pages 243-251. DOI: 10.5220/0011775200003414


in Bibtex Style

@conference{bioinformatics23,
author={Valentina Guarino and Jessica Gliozzo and Ferdinando Clarelli and Béatrice Pignolet and Kaalindi Misra and Elisabetta Mascia and Giordano Antonino and Silvia Santoro and Laura Ferré and Miryam Cannizzaro and Melissa Sorosina and Roland Liblau and Massimo Filippi and Ettore Mosca and Federica Esposito and Giorgio Valentini and Elena Casiraghi},
title={Intrinsic-Dimension Analysis for Guiding Dimensionality Reduction in Multi-Omics Data},
booktitle={Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - Volume 3: BIOINFORMATICS},
year={2023},
pages={243--251},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011775200003414},
isbn={978-989-758-631-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - Volume 3: BIOINFORMATICS
TI - Intrinsic-Dimension Analysis for Guiding Dimensionality Reduction in Multi-Omics Data
SN - 978-989-758-631-6
AU - Guarino V.
AU - Gliozzo J.
AU - Clarelli F.
AU - Pignolet B.
AU - Misra K.
AU - Mascia E.
AU - Antonino G.
AU - Santoro S.
AU - Ferré L.
AU - Cannizzaro M.
AU - Sorosina M.
AU - Liblau R.
AU - Filippi M.
AU - Mosca E.
AU - Esposito F.
AU - Valentini G.
AU - Casiraghi E.
PY - 2023
SP - 243
EP - 251
DO - 10.5220/0011775200003414
PB - SciTePress