Authors: Mikhail Hushchyn 1 ; Philippe Charpentier 2 and Andrey Ustyuzhanin 3

Affiliations: 1 Yandex School of Data Analysis, Yandex Data Factory and Moscow Institute of Physics and Technology, Russian Federation ; 2 CERN, Switzerland ; 3 Yandex School of Data Analysis, Yandex Data Factory, Moscow Institute of Physics and Technology, National Research University Higher School of Economics (HSE) and NRC Kurchatov Institute, Russian Federation

Keyword(s): Structured Data Analysis and Statistical Methods, Machine Learning, Information Extraction, Hybrid Data Storage Systems, Data Management, LHCb.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Computational Intelligence ; Evolutionary Computing ; Information Extraction ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Machine Learning ; Soft Computing ; Structured Data Analysis and Statistical Methods ; Symbolic Systems

Abstract: This paper presents how machine learning algorithms and statistical methods can be applied to data management in hybrid data storage systems. Basically, a hybrid data storage system uses two different storage types. Keeping rarely used data on cheap, slow storage (type one) and frequently used data on fast, expensive storage (type two) helps achieve an optimal performance/cost ratio for the system. We use classification algorithms to estimate the probability that the data will be used often in the future. Then, using risk analysis, we decide where the data should be stored. We show how to estimate the optimal number of replicas of the data using regression algorithms and a Hidden Markov Model. Based on the probability, the risks, and the optimal number of data replicas, our system finds the optimal data distribution in the hybrid data storage system. We present simulation results of our method for the LHCb hybrid data storage.
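The placement step described in the abstract (a predicted probability of frequent use, followed by risk analysis) can be illustrated with a minimal sketch. The abstract does not give the loss function or cost values, so the function name `choose_storage` and the two cost parameters below are hypothetical, and the rule shown is a generic expected-loss comparison rather than the authors' exact procedure.

```python
def choose_storage(p_hot, cost_miss_on_slow, cost_waste_on_fast):
    """Pick a storage type by comparing expected losses (illustrative only).

    p_hot              -- classifier's estimated probability that the data
                          will be used often in the future (0.0 to 1.0)
    cost_miss_on_slow  -- assumed penalty for keeping frequently used data
                          on slow storage (hypothetical units)
    cost_waste_on_fast -- assumed penalty for keeping rarely used data
                          on fast storage (hypothetical units)
    """
    if not 0.0 <= p_hot <= 1.0:
        raise ValueError("p_hot must be a probability in [0, 1]")

    # Expected loss if the data is placed on slow storage but turns out hot.
    risk_slow = p_hot * cost_miss_on_slow
    # Expected loss if the data is placed on fast storage but turns out cold.
    risk_fast = (1.0 - p_hot) * cost_waste_on_fast

    # Choose the placement with the smaller expected loss; ties go to slow,
    # the cheaper storage type.
    return "fast" if risk_fast < risk_slow else "slow"


if __name__ == "__main__":
    print(choose_storage(0.9, cost_miss_on_slow=10.0, cost_waste_on_fast=1.0))
    print(choose_storage(0.05, cost_miss_on_slow=10.0, cost_waste_on_fast=1.0))
```

With these assumed costs, data with a high predicted probability of frequent use goes to fast storage, and data with a low probability goes to slow storage. In practice the costs would have to be derived from the storage system's real prices and access latencies.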

CC BY-NC-ND 4.0

Paper citation in several formats:
Hushchyn, M.; Charpentier, P. and Ustyuzhanin, A. (2015). Distributed Data Replication and Access Optimization for LHCb Storage System - A Position Paper. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - KDIR; ISBN 978-989-758-158-8; ISSN 2184-3228, SciTePress, pages 537-540. DOI: 10.5220/0005647105370540

@conference{kdir15,
author={Mikhail Hushchyn and Philippe Charpentier and Andrey Ustyuzhanin},
title={Distributed Data Replication and Access Optimization for LHCb Storage System - A Position Paper},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - KDIR},
year={2015},
pages={537-540},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005647105370540},
isbn={978-989-758-158-8},
issn={2184-3228},
}

TY - CONF

JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - KDIR
TI - Distributed Data Replication and Access Optimization for LHCb Storage System - A Position Paper
SN - 978-989-758-158-8
IS - 2184-3228
AU - Hushchyn, M.
AU - Charpentier, P.
AU - Ustyuzhanin, A.
PY - 2015
SP - 537
EP - 540
DO - 10.5220/0005647105370540
PB - SciTePress