Democratizing the Access to Geospatial Data: The Performance
Bottleneck of Unified Data Interfaces
Matthias Pohl¹ (ORCID 0000-0002-6241-7675), Arne Osterthun¹ (ORCID 0000-0001-6455-9119), Joshua Reibert¹ (ORCID 0000-0002-5626-7869), Dennis Gehrmann², Christian Haertel² (ORCID 0009-0001-4904-5643), Daniel Staegemann² (ORCID 0000-0001-9957-1003) and Klaus Turowski² (ORCID 0000-0002-4388-8914)
¹German Aerospace Center (DLR) - Institute of Data Science, Jena, Germany
²Faculty of Computer Science, Otto von Guericke University, Magdeburg, Germany
Keywords:
Benchmark, Array File Formats, Geospatial Data, Raster Data Management, OGC.
Abstract:
The exponential growth of geodata collected through satellites, weather stations, and other measuring systems
presents significant challenges for efficient data management and analysis. High-resolution datasets from
Earth observation missions like ESA Sentinel are vital for climate research, weather forecasting, and environ-
mental monitoring, yet their multidimensional nature and temporal depth increasingly strain traditional spatial
data models. This research evaluates the performance characteristics of various geospatial data access and
processing technologies through systematic benchmarking. The study compares file formats (NetCDF, Zarr)
and interface standards (OpenEO, OGC WCS) on a Rasdaman database instance to determine optimal con-
figurations for interactive data exploration and analysis workflows. Performance tests reveal a clear hierarchy
in processing efficiency. Direct RasQL queries consistently outperform both OGC WCS and OpenEO API
interfaces across all test scenarios. Array file formats demonstrate superior query processing speeds, likely
attributable to reduced database technology overhead. The findings provide a robust foundation for select-
ing appropriate geodata processing technologies based on specific use case requirements, data volumes, and
performance needs. This research contributes to enhancing the efficiency of multidimensional geospatial data
handling, particularly for time-critical applications that require interactive visualization and analysis capabili-
ties.
1 INTRODUCTION
The continuous collection of geodata by satellites,
weather stations, and other measuring systems gen-
erates ever-growing amounts of data. In particular,
Earth observation satellites such as the ESA Sen-
tinel mission generate high-resolution data sets that
are crucial for climate research, weather forecasting,
and environmental monitoring. The efficient manage-
ment and analysis of this multidimensional geospatial
data presents complex challenges for scientists and
data analysts. Traditional models for analyzing spatial
data are increasingly reaching their limits, especially
due to the ever-growing temporal dimension of data
series. While interactive visualizations are an indis-
pensable tool for intuitive data exploration and rapid
insight generation, they are highly dependent on sys-
tem latency (Merticariu et al., 2016; Polte et al., 2009;
Stonebraker et al., 1993). This performance depen-
dency can significantly impair analytical capabilities.
To overcome these challenges, various specialized
file formats and interface standards have been devel-
oped. In the field of geospatial data analysis, the in-
terface standards OpenEO (Open Earth Observation)
and OGC WCS (Open Geospatial Consortium - Web
Coverage Service) have become particularly estab-
lished. These interfaces integrate various technolo-
gies and web services to provide comprehensive so-
lutions for managing and analyzing large amounts of
geospatial data. The choice of a suitable format and
the appropriate interface is crucial for the efficient sci-
entific use of geodata (Merticariu et al., 2016; Polte
et al., 2009; Stonebraker et al., 1993). The models
must ensure both efficient access to spatio-temporal
sections and a high processing speed for interactive
analyses. An evaluation of the available file formats
and interface standards is therefore essential for the
development of efficient geodata analyses.
This research work aims to identify a suitable in-
terface for accessing, processing, and exploring mul-
tidimensional geospatial data using a performance
benchmarking test (Osterthun and Pohl, 2025a). To
this end, we analyze the file formats NetCDF and
Zarr as well as the geodata interface standards
OpenEO and OGC WCS on a Rasdaman database
instance. As the analysis approach is generalizable,
the research is not limited to this choice.
lation enables the most efficient handling and analy-
sis of the data. The knowledge gained will provide
a solid foundation for selecting suitable geodata pro-
cessing technologies.
2 BACKGROUND
2.1 Array-File Formats
This section gives an overview of NetCDF and
Zarr, which are established file formats for the pro-
cessing of multidimensional and scientific data.
2.1.1 NetCDF
NetCDF (Network Common Data Form, https://www.unidata.ucar.edu/software/netcdf/) is a soft-
ware package that supports the creation, access, and
exchange of array-oriented scientific data. It is also
an established standard for the exchange of data in
the scientific community and is used in a variety of
disciplines such as climate research, oceanography,
and meteorology. The NetCDF technology encom-
passes a file format for defining the structure and
storage of data in NetCDF, a classic and extended
data model that defines the organization and access of
NetCDF files, and a software component for working
efficiently with NetCDF data.
A file in the classic NetCDF format is a single
file consisting of a header and a data area. The
header contains all information about dimensions, at-
tributes, and variables, but does not include the actual
variable data. It holds the names, types, and proper-
ties of the dimensions and variables, as well as the
offsets that indicate the position of the variable data
in the file. Since the header is only as large as nec-
essary to store the existing dimensions, variables, and
attributes, a NetCDF file requires little additional stor-
age space. This makes the files compact but has the
disadvantage that any change that requires an exten-
sion of the header (e.g., adding new variables) re-
quires the data to be moved. The data area con-
sists of fixed-size data and record data. The fixed-
size data contains the data for variables without an
unlimited dimension and is stored contiguously. The
record data contains the data for variables with an
unlimited dimension, organized as a series of fixed-
size records.
The classic NetCDF data model consists of vari-
ables, dimensions, and attributes. This type of data or-
ganization was introduced with the very first NetCDF
version and continues to form the core of all NetCDF
files. Variables in NetCDF files can be of one of six
types: char, byte, short, int, float, or double. The axes
of the data arrays in NetCDF files are described by
dimensions. Each dimension has a name and a fixed
length. An unlimited dimension has a length that can
be extended whenever more data is written. NetCDF
files can contain, at most, one unlimited dimension.
Attributes are used to tag variables or files with short
notes or additional metadata. Attributes are always
scalars or 1D arrays and can be associated with either
a variable or the entire file. Although there are no hard
limits, it is recommended that attributes be kept small.
In the extended (NetCDF-4) data model, groups are
used to organize data hierarchically, similar to direc-
tories in a Unix file system.
NetCDF files can contain groups, variables, di-
mensions, and attributes. In this way, a group acts as
a container for the classic NetCDF dataset. NetCDF
files can also contain many groups that are organized
hierarchically. Since each group acts as a complete
NetCDF dataset, it is possible to have variables with
the same name in two or more different groups within
the same NetCDF file. Dimensions have a special
scope. They can be seen by all variables in their group
and all descendant groups. This allows the user to de-
fine dimensions in a top-level group and use them in
many sub-groups.
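As a minimal illustration of this data model, the following Python sketch creates such a file with the netCDF4 package; all names and sizes are illustrative and not taken from the benchmark setup.

from netCDF4 import Dataset

# Illustrative file following the data model described above.
with Dataset("example.nc", "w", format="NETCDF4") as ds:
    ds.createDimension("time", None)   # at most one unlimited dimension
    ds.createDimension("lat", 721)
    ds.createDimension("lon", 1440)
    t2m = ds.createVariable("t2m", "f4", ("time", "lat", "lon"))
    t2m.units = "K"                    # attribute attached to a variable
    ds.title = "ERA5 2m temperature"   # attribute attached to the file
    # Extended model: groups act as containers; dimensions defined in a
    # parent group are visible to variables in descendant groups.
    soil = ds.createGroup("soil")
    soil.createVariable("stl1", "f4", ("time", "lat", "lon"))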
2.1.2 Zarr
Zarr (https://zarr.dev/) is a modern file format, similar to NetCDF,
designed for the efficient storage and processing of
large, multidimensional arrays. It allows data to be
stored in a distributed and compressed form, mak-
ing it particularly suitable for use in cloud environ-
ments and parallel processing systems. A key differ-
ence to traditional file formats such as NetCDF lies
in the way Zarr organizes data into chunks. Zarr
differs in its flexible structure, which allows chunks
to be stored as separate files. This distributed or-
ganization enables the efficient storage and process-
ing of large amounts of data on cloud storage sys-
tems or distributed file systems by providing paral-
lel access to many small files. Zarr’s design also
makes it easier to access data directly in cloud en-
vironments which often requires additional tools or
customization with NetCDF. In particular, for cloud-
based or distributed applications, Zarr offers certain
advantages over conventional file formats due to its
distributed chunking architecture. Zarr can be consid-
ered a cloud-optimized evolution of NetCDF file for-
mats, as it utilizes a similar data model but is specif-
ically designed for distributed and scalable environ-
ments. Within a .zgroup container, the .zarray file
describes how the binary data is encoded, specifying
the array's shape, chunk layout, data type, and com-
pression. The .zattrs file contains user-defined
metadata with additional information about the ar-
ray. Alongside these metadata files, the individual
chunks of the array are stored as separate objects la-
beled by their grid indices (e.g., 0.0, 0.1, 1.0).
We focus on version 2 of the specification, which
has been widely adopted and proven practical in nu-
merous projects. However, version 3 of the Zarr
specification is already available.
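The layout described above can be produced with a few lines of zarr-python (version 2.x); the store, array, and attribute names below are purely illustrative.

import zarr  # zarr-python 2.x, matching the v2 specification

# Illustrative layout: on disk this creates example.zarr/.zgroup,
# example.zarr/t2m/.zarray, example.zarr/t2m/.zattrs, and one file
# per chunk (e.g., t2m/0.0.0), which can be read in parallel.
root = zarr.open_group("example.zarr", mode="w")
t2m = root.create_dataset("t2m", shape=(744, 721, 1440),
                          chunks=(24, 180, 360), dtype="f4")
t2m.attrs["units"] = "K"   # stored in t2m/.zattrs
print(t2m.nchunks)         # number of separately stored chunk files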
Figure 1: Overview of file format specifications. (a) Zarr: a .zgroup container with .zarray and .zattrs metadata files and chunk files 0.0, 0.1, 1.0, 1.1, 2.0, 2.1. (b) NetCDF: a file containing a group with subgroups, dimensions, variables, and attributes.
2.2 Rasdaman
Rasdaman (Raster Data Manager, http://www.rasdaman.org/) is a Linux-based
database management system designed specifically
for storing and analyzing multidimensional, raster-
based geospatial data. Rasdaman stores arrays within
a standard database by partitioning them into smaller
segments known as tiles. Each tile is stored as a
BLOB (binary large object) in a relational table. This
approach enables conventional relational database
systems to manage arrays of unlimited size, utiliz-
ing PostgreSQL as the underlying system for meta-
data management. The overall focus of Rasdaman is
on the efficient management and processing of large
amounts of geospatial raster data. Rasdaman uses a
proprietary query language called RasQL (Rasdaman
Query Language), which is specifically optimized for
querying raster data (Rasdaman, 2024).
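For illustration, a RasQL query can be submitted from Python via the rasdapy package. The following is a minimal sketch assuming rasdapy's DBConnector/QueryExecutor interface; the connection parameters and the coverage name era5_t2m are hypothetical.

from rasdapy.db_connector import DBConnector
from rasdapy.query_executor import QueryExecutor

# Connect to a local rasdaman instance (default rasmgr port 7001).
connector = DBConnector("localhost", 7001, "rasadmin", "rasadmin")
executor = QueryExecutor(connector)
connector.open()
try:
    # RasQL addresses arrays by integer grid indices (here: time,
    # latitude, longitude), not by coordinates or timestamps.
    result = executor.execute_read(
        "select avg_cells(c[0:335, 100:131, 200:240]) from era5_t2m as c"
    )
    print(result)
finally:
    connector.close()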
Rasdaman implements the geodata interface stan-
dards OpenEO and OGC WCS, which are examined
in the following sections.
2.2.1 openEO
OpenEO (https://openeo.org/) is an open application programming inter-
face (API) for standardized geodata processing. The
goal of OpenEO is to create a unified and simpli-
fied way to access and process various Earth obser-
vation platforms and services. Developers and sci-
entists can thus write scripts to process Earth Obser-
vation (EO) data independently of the data provision
platforms used (e.g., Google Earth Engine, Sentinel
Hub, or Open Data Cube). The API supports vari-
ous programming languages such as Python, R, and
JavaScript (Gomes et al., 2020). The architecture of
OpenEO follows a modular three-layer design, com-
prising a client layer, a processing layer, and a back-
end layer. The client layer includes the various client
APIs that enable users to access and process EO data.
The processing layer is represented by the core API,
which acts as a central logic layer and standardizes the
requests of the client APIs. Finally, the backend layer
connects the core API to the various backend services
that perform the actual data processing and storage
via the driver APIs. Various scenarios of a possible
OpenEO architecture are shown schematically in
Figure 2 (Schramm et al., 2021).
The client layer contains the entry points for de-
veloping Earth Observation analyses. Packages and
modules in the R, Python, and JavaScript program-
ming languages are available for this purpose, en-
abling researchers to process EO data and create anal-
yses (Schramm et al., 2021).
The processing layer is represented by the core
API. This is responsible for standardizing the requests
from the client APIs and creating a uniform inter-
face for accessing various data processing services.
Figure 2: openEO architecture (client layer with user application, core API, backend layer with database).
This facilitates the integration of different technolo-
gies for processing Earth observation data. OpenEO
implements its core API as a collection of microser-
vices that adhere to the RESTful architectural style,
which is based on web services utilizing HTTP meth-
ods. The openEO processing layer is organized into
processes and connects these processes in process
graphs. A process describes a specific operation ap-
plied to EO data, such as computing the Normal-
ized Difference Vegetation Index (NDVI) or filtering
data over a specific time span. A process graph is a
chain of processes that are executed as a workflow.
Each node in the graph represents a process, and the
edges represent the data flows and the order of execu-
tion. The Core API translates the locally implemented
source code into a uniform JavaScript Object Notation
(JSON) format that both humans and machines can
read, which is sent to a compatible backend via HTTP
request. To be compatible with the API, a backend
must provide an OpenEO endpoint that translates the
received standardized process chain into the local re-
quirements. The backend then executes the process
chain, and the resulting data is sent back to the client.
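As a sketch of this client-side workflow, the openEO Python client can be used to build such a process graph and send it to a backend. The endpoint URL and collection identifier below are hypothetical, and the name of the temporal dimension depends on the backend's data cube model.

import openeo

# Hypothetical backend URL and collection identifier.
connection = openeo.connect("https://openeo.example.org")
cube = connection.load_collection(
    "ERA5",
    spatial_extent={"west": 5.9, "south": 47.3, "east": 15.0, "north": 55.1},
    temporal_extent=["2020-01-01", "2020-01-15"],
    bands=["2m_temperature"],
)
# Reduce the temporal dimension with the 'mean' process.
mean_cube = cube.reduce_dimension(dimension="t", reducer="mean")
print(mean_cube.to_json())        # the standardized JSON process graph
mean_cube.download("t2m_mean.nc") # executed synchronously on the backend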
The backend layer includes the driver APIs, which
interface between the core API and the backend ser-
vices. This layer ensures that the data processing
and storage technologies are correctly addressed via
OpenEO endpoints and that the requested data is pro-
cessed. There are concrete implementations of the
OpenEO endpoints for the following backend ser-
vices: GeoPySpark (Geotrellis), Open Data Cube,
Apache Airflow / eoDataReaders, Rasdaman.
2.2.2 OGC WCS
The OGC Web Coverage Service (WCS, https://www.ogc.org/publications/standard/wcs/) provides
a standardized interface that allows users to request
and use geospatial data. The main function of the
WCS is to provide large-scale data such as satellite
images, elevation models, or climate data. Users can
access specific parts of this data that are relevant to
their analysis. This distinguishes the WCS from other
services of the Open Geospatial Consortium (OGC)
(Baumann, 2010). The architecture of OGC WCS is
divided into three main components: core, extensions,
and application profiles, which together form a flexi-
ble and extensible framework for accessing geospatial
data. The WCS Core provides the basic functionality
of the architecture and defines the fundamental op-
erations required for accessing and processing geo-
data. It is based on the specifications of the OGC
Web Services Common (OWS Common), which pro-
vides general mechanisms for querying and commu-
nicating between OGC Earth Observation Services,
such as WMS, WFS, and WCS. This includes the def-
inition of requests and responses in a standard for-
mat, supporting various protocols (such as HTTP and
SOAP), and utilizing standards like XML. A central
component of WCS Core is the so-called Core Oper-
ations (comparable to the microservices of OpenEO),
which include the standard services GetCapabilities,
DescribeCoverage, and GetCoverage. At the center
of the WCS core component is the Coverage Imple-
mentation Schema, which defines a standardized ap-
proach for modeling and processing geospatial raster
data using OGC WCS. Coverages that represent the
basic geospatial data are organized here by a refer-
ence grid structure that allows for a precise descrip-
tion of the data structure and its geospatial extent.
This structure provides the foundation for managing
multidimensional geospatial data, encompassing tem-
poral, spatial, and other dimensions. WCS uses Geog-
raphy Markup Language (GML), an XML-based for-
mat developed by OGC, to model and represent ge-
ographic information. GML is used to describe the
metadata and geospatial information, such as bound-
ing boxes or coordinate reference systems, associated
with the coverages. This ensures that the data is inter-
operable and can be easily processed in different GIS
applications. The data model of the core describes
how the coverages are structured and organized. WCS
supports a variety of file formats for encoding and
delivering this data. The extensions extend the basic
functionality of the WCS to support specific applica-
tion requirements. These extensions are divided into
two main categories: service extensions and protocol
binding extensions.
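A GetCoverage request can be issued as a plain HTTP call. The following sketch uses the WCS 2.0 key-value-pair encoding with a hypothetical endpoint and coverage identifier; the axis labels Lat, Long, and ansi follow common rasdaman conventions but depend on the actual coverage model.

import requests

# Hypothetical rasdaman OWS endpoint and coverage identifier.
params = {
    "service": "WCS",
    "version": "2.0.1",
    "request": "GetCoverage",
    "coverageId": "era5_t2m",
    # One 'subset' key-value pair per trimmed axis; requests repeats
    # the key for each list element.
    "subset": [
        "Lat(47.3,55.1)",
        "Long(5.9,15.0)",
        'ansi("2020-01-01T00:00:00Z","2020-01-14T23:00:00Z")',
    ],
    "format": "application/netcdf",
}
response = requests.get("https://ows.example.org/rasdaman/ows",
                        params=params, timeout=300)
response.raise_for_status()
with open("t2m_subset.nc", "wb") as f:
    f.write(response.content)

Unlike the RasQL sketch above, the subsets here are expressed in real coordinates and timestamps, which is precisely the convenience the unified interfaces trade against raw performance.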
3 EXPERIMENTS
The basis for the experiments is explained below. The
decision in favor of Rasdaman as a database system
was primarily due to its native support for both OGC
WCS and OpenEO through ready-made endpoints, as
well as the similar request processing in the back-
ground. Thus, Rasdaman provides a good basis for
comparison when testing the performance of the two
interfaces. The choice of NetCDF as the primary
file format for the implementation was mainly due
to its native support in Rasdaman for importing and
exporting multidimensional geospatial data. Since
Zarr is also designed for storing scientific data in a
multi-dimensional format, the conversion from Zarr
to NetCDF is usually straightforward.
3.1 Benchmark Process
The architecture of the benchmarks consists of several
distinct layers, each serving a specific purpose in data
management and access. At the foundational level,
the file system interacts directly with the hardware’s
operational drivers, facilitating file storage and orga-
nization on physical devices. Above this, the storage
engine defines the logical structure of the data, op-
timizing how it is stored, indexed, and retrieved. It
employs various data models tailored to specific ac-
cess patterns and includes features like data compres-
sion, encryption, and replication. The middle layer is
the query engine, which interprets and executes user
queries, optimizing access paths through techniques
like query rewriting and cost-based optimization to
ensure efficient data retrieval. Next is the applica-
tion interface layer, providing a standardized mech-
anism for applications to interact with the query en-
gines through APIs and tools that simplify data ac-
cess. At the top, the application environment provides
user-friendly interfaces and analytical tools that en-
able easy interaction with data, empowering users to
access, manipulate, and visualize information for in-
formed decision-making.
3.1.1 Application Stacks
With this multi-layered architecture, data manage-
ment is streamlined for the purpose of the benchmark
experiment. The structure is used to define the fol-
lowing application stacks. Figure 3 shows how the
technologies align across these layers.
The application stacks are constructed for the pur-
pose of the benchmarks according to the investigated
unified data interfaces openEO and OGC WCS as
well as the array data file formats Zarr and NetCDF.
The chosen programming environment for these ap-
plication stacks is Python, which is a versatile and
widely used language in data science and analytics.
While other languages like Julia and R are also preva-
lent in the Earth system science (ESS) domain, Python
was selected as the primary environment due to its
rich ecosystem and community support.
For Rasdaman version 10.4, three stacks are
shown in Figures 3a to 3c, all utilizing the
Rasdaman BLOB storage system. Two of these
stacks integrate with a REST API to connect to ope-
nEO and OGC Web Coverage Service (WCS). As a
control instance, the Rasdaman query language rasQL
is used via the rasdapy Python package, enabling effi-
cient interaction with Rasdaman.
Alternative stacks for integrating NetCDF and
Zarr files have been created. Within the Python en-
vironment, xarray (https://xarray.dev/) serves as a key tool for modeling
multidimensional data, offering an intuitive interface.
It supports multiple file formats, including Zarr and
NetCDF, which enhances its versatility. To utilize the
capabilities of xarray, various packages are needed for
effective data file access and manipulation (see Fig-
ures 3d and 3e).
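To illustrate stacks (d) and (e), the same logical dataset can be opened through either file format with xarray; the paths and variable names below are illustrative.

import xarray as xr

# Stack (d): NetCDF read through xarray's h5netcdf engine.
ds_netcdf = xr.open_dataset("era5_t2m.nc", engine="h5netcdf")
# Stack (e): the same logical dataset stored as a Zarr (v2) store.
ds_zarr = xr.open_zarr("era5_t2m.zarr")
# Both expose the identical xarray data model to the queries.
print(ds_netcdf["t2m"].sizes, ds_zarr["t2m"].sizes)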
3.1.2 Procedure
The application stacks are implemented on the desig-
nated infrastructure, as outlined in Section 3.2.3. The
workload is largely determined by the submission of
specific queries and the data sets used. Initially, the
data undergoes preprocessing, which includes inte-
gration into the Rasdaman database for optimal query
execution, as well as preparation in NetCDF and Zarr
formats for efficient data handling. For benchmark-
ing, a series of queries is submitted through the appli-
cation stacks to access the data. Query runtimes are
measured, with each submission repeated 100 times
to calculate the average performance metrics for ac-
curate comparison.
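A minimal sketch of this measurement loop is shown below; the query callable stands in for any of the queries submitted through an application stack, and no cache control is shown, so repeated runs may benefit from operating system caching.

import statistics
import time

def benchmark(run_query, repetitions=100):
    """Execute a query callable repeatedly and return its mean
    runtime in seconds."""
    runtimes = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        runtimes.append(time.perf_counter() - start)
    return statistics.mean(runtimes)

# Example usage with any of the application stacks, e.g.:
#   mean_runtime = benchmark(lambda: submit_q1())
# where submit_q1() issues query Q1 through rasdapy, openEO,
# OGC WCS, or one of the file-format stacks.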
3.2 Setup
3.2.1 Data
In the field of earth system science, multidimensional
spatial data is essential for understanding complex en-
vironmental processes. The ERA-5 reanalysis dataset
(C3S, 2018), available through the Copernicus Cli-
mate Change Service (C3S) in GRIB (https://community.wmo.int/en/activity-areas/wis/grib-edition-1) and NetCDF
formats, offers a wealth of climate and weather in-
formation from 1940 to the present. This extensive
dataset includes atmospheric, ocean-wave, and land-
surface parameters. The benchmark specifically fo-
cuses on two critical variables, 2-meter temperature
and soil temperature, across multiple dimensions of
time, latitude, longitude, and, where relevant, altitude.
Spatial dimensions are crucial in user queries, espe-
cially for scientific users who often seek data related
to these dimensions. Additionally, it is essential to
recognize that the dataset has an hourly time resolu-
tion.
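For reference, ERA-5 subsets of this kind can be retrieved from the Copernicus Climate Data Store, e.g., with the cdsapi Python client. The request below is a sketch: the dataset and variable identifiers follow the CDS catalogue at the time of writing and may change.

import cdsapi  # assumes CDS credentials configured in ~/.cdsapirc

client = cdsapi.Client()
client.retrieve(
    "reanalysis-era5-single-levels",
    {
        "product_type": "reanalysis",
        "variable": "2m_temperature",
        "year": "2020",
        "month": "01",
        "day": [f"{d:02d}" for d in range(1, 15)],
        "time": [f"{h:02d}:00" for h in range(24)],
        "format": "netcdf",
    },
    "era5_t2m.nc",
)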
3.2.2 Data Queries
The queries in this workload are designed to mimic
common data access patterns typically seen during
compute tasks. These tasks often involve temporal
and spatial aggregate operations, such as calculating
monthly temperature averages.
Figure 3: Application stacks for comparison. (a) RasQL: Python (3.10.12) / rasdapy / Rasdaman (10.4) / Rasdaman BLOB / XFS. (b) openEO: Python (3.10.12) / openEO / Rasdaman (10.4) / Rasdaman BLOB / XFS. (c) OGC WCS: Python (3.10.12) / OGC WCS / Rasdaman (10.4) / Rasdaman BLOB / XFS. (d) NetCDF: Python (3.10.12) / xarray (2024.11.0) / h5netcdf (1.4.1) with hdf5plugin (5.0.0) / NetCDF4 / XFS. (e) Zarr: Python (3.10.12) / xarray (2024.11.0) / Zarr-Python (2.18.2) / Zarr (v2) / XFS.
Table 1: List of queries used for the benchmarks.

Query ID  Description
Q1        Get 2m temperature for a specific location and point in time.
Q2-7      Get a [day, week, fortnight, month, quarter, year] worth of 2m temperature data at a specific location.
Q8-13     Get a [day, week, fortnight, month, quarter, year] worth of 2m temperature data for an area on the map the size of Germany's bounding box.
Q14-19    Get a [day, week, fortnight, month, quarter, year] worth of 2m temperature data for an area on the map the size of the USA's bounding box.
Q20-25    Get a [day, week, fortnight, month, quarter, year] worth of 2m temperature data for an area on the map the size of the EU's bounding box.
Q26-28    Get a [day, week, fortnight] worth of 2m temperature data for the whole map.
Q29-34    Get a [day, week, fortnight, month, quarter, year] worth of soil temperature data at a specific location and altitude.
Q35-40    Get a [day, week, fortnight, month, quarter, year] worth of soil temperature data at a specific location across all altitudes.
The workload is structured as follows. The most straightfor-
ward query (Q1) involves retrieving data for a spe-
cific location and a specific point in time, such as
a temperature query. The complexity of the work-
load increases as the temporal scope is broadened to
include various timeframes: day (Q2), week (Q3),
fortnight (Q4), month (Q5), quarter (Q6), and year
(Q7). The next phase introduces variations in the
spatial dimension. Subsequent queries (Q8–Q13) re-
trieve data for increasingly large regions, beginning
with Germany's bounding box, followed by the
USA (Q14–Q19), Europe (Q20–Q25), and, finally,
the entire globe (Q26–Q28) within constrained time
frames. The final set of queries (Q29–Q34) adds al-
titude as an additional spatial variable, focusing on
specific points across various time frames. Addition-
ally, queries (Q35–Q40) cover all altitude informa-
tion. This structured approach highlights the growing
complexity and diversity of data access patterns in the
workload.
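As an example of how such a query maps onto the file-format stacks, a Q10-style request (a fortnight of 2m temperature over Germany's bounding box) can be expressed in xarray as follows; the file path, variable name, and exact bounding box are illustrative, and ERA-5 latitudes are assumed to be stored in descending order.

import xarray as xr

ds = xr.open_dataset("era5_t2m.nc", engine="h5netcdf")
# Fortnight of hourly 2m temperature over (approximately) Germany's
# bounding box; latitude is sliced from north to south because the
# grid is assumed to store descending latitudes.
subset = ds["t2m"].sel(
    time=slice("2020-01-01", "2020-01-14"),
    latitude=slice(55.1, 47.3),
    longitude=slice(5.9, 15.0),
)
print(subset.mean().values)  # spatio-temporal aggregate over the subset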
3.2.3 Technical Setup
The benchmarking tests were conducted on a high-
performance workstation with the following specifi-
cations: an AMD EPYC 7402P with 24 cores,
128 GiB of RAM, dual 8 TB HDDs in a RAID 0
configuration, and Ubuntu 20.04.6 LTS. While
the performance comparisons from these benchmarks
are reliable, it is important to note that absolute
scores can be improved through optimized configu-
rations, such as fine-tuning software settings or up-
grading storage solutions, to further enhance system
efficiency.
4 RESULTS
In this section, we discuss the outcomes of the con-
ducted benchmarks. The average query runtime was
measured for each query set and data configuration.
The complete result set and accompanying figures
are organized in an external repository (Osterthun
and Pohl, 2025b).
For comparative analysis, we include results from
both fortnightly and quarterly time scales, allowing
for a nuanced perspective on performance across dif-
ferent periods. The accompanying figures utilize a
logarithmic scale, which enhances the clarity of the
differences in query runtimes, enabling a more thor-
ough examination of the data. The center point on
the y-axis is set at 1 second, providing a clear ref-
erence for evaluating performance. Bars extending
downwards indicate runtimes that fall below this cen-
ter point, while those extending upwards signify run-
times that exceed it. This dual-direction represen-
tation in a single figure facilitates a comprehensive
understanding of how various data queries compare
against the threshold.
When examining the performance of unified data
interfaces, it becomes evident that utilizing WCS re-
sults in significantly lower runtimes compared to ope-
nEO. This suggests that WCS is more efficient for cer-
tain applications. However, it is important to note that
when accessing Rasdaman directly through RasQL,
bypassing the intermediate translation via the REST
API, performance improves even further, illustrating
a more streamlined and effective method for handling
data queries.
As anticipated, the query runtime increases as the
volume of data to be retrieved grows.
5 CONCLUSION
The performance tests conducted demonstrate a
clear gradation in performance. RasQL consistently
achieves the best performance when compared to
OGC WCS and OpenEO via their respective APIs.
For small to medium data queries, the differences are
moderate, and all three interfaces provide acceptable
response times. For larger data volumes, the per-
formance advantage of RasQL becomes more pro-
nounced. Queries against the array file formats are
processed significantly faster, which is probably due
to the absence of database technology overhead.
OGC WCS is the recommended interface for user-
friendly access. The full integration in Rasdaman, the
integrated user interface, and the slightly better per-
formance compared to OpenEO make OGC WCS the
recommended choice for most applications. RasQL is
the preferred choice for performance-critical applica-
tions or very large amounts of data. However, using
RasQL in practice is cumbersome because the correct
array indices for temporal and spatial parameters must
first be determined for queries, whereas OGC WCS
and OpenEO allow parameterization using real coor-
dinates and timestamps. Nevertheless, direct access
to file-based formats is recommended for large-scale
data analyses or time-critical applications. The hierar-
chical structure of the file formats allows for fast and
structured access.
REFERENCES
Baumann, P. (2010). Beyond rasters: Introducing the
new OGC Web Coverage Service 2.0. In Proceedings of
the 18th SIGSPATIAL International Conference on Ad-
vances in Geographic Information Systems, GIS ’10,
page 320–329, New York, NY, USA. Association for
Computing Machinery.
C3S (2018). ERA5 hourly data on single levels from 1940
to present.
Gomes, V. C., Queiroz, G. R., and Ferreira, K. R. (2020).
An overview of platforms for big earth observation data
management and analysis. Remote Sensing, 12(8):1253.
Merticariu, G., Misev, D., and Baumann, P. (2016). To-
wards a General Array Database Benchmark: Measur-
ing Storage Access. In Rabl, T., Nambiar, R., Baru,
C., Bhandarkar, M., Poess, M., and Pyne, S., editors,
Big Data Benchmarking, volume 10044, pages 40–67.
Springer International Publishing, Cham. Series Title:
Lecture Notes in Computer Science.
Osterthun, A. and Pohl, M. (2025a). Foxbench: Bench-
mark for n-dimensional array file formats in data an-
alytics environments. In Datenbanksysteme für Busi-
ness, Technologie und Web (BTW 2025), pages 545–564.
Gesellschaft für Informatik, Bonn.
Osterthun, A. and Pohl, M. (2025b). Rasdaman data
interface benchmarks. https://doi.org/10.5281/zenodo.15306493.
Polte, M., Lofstead, J., Bent, J., Gibson, G., Klasky, S. A.,
Liu, Q., Parashar, M., Podhorszki, N., Schwan, K.,
Wingate, M., and Wolf, M. (2009). ...and eat it too:
high read performance in write-optimized HPC I/O mid-
dleware file formats. In Proceedings of the 4th Annual
Workshop on Petascale Data Storage, PDSW ’09, pages
21–25, New York, NY, USA. ACM.
Rasdaman (2024). Rasdaman v10.4 documentation.
Schramm, M., Pebesma, E., Milenković, M., Foresta, L.,
Dries, J., Jacob, A., Wagner, W., Mohr, M., Neteler, M.,
Kadunc, M., Miksa, T., Kempeneers, P., Verbesselt, J.,
Gößwein, B., Navacchi, C., Lippens, S., and Reiche, J.
(2021). The openEO API: Harmonising the use of Earth
observation cloud services using virtual data cube func-
tionalities. Remote Sensing, 13:1125.
Stonebraker, M., Frew, J., Gardels, K., and Meredith, J.
(1993). The SEQUOIA 2000 storage benchmark. In Pro-
ceedings of the 1993 ACM SIGMOD international con-
ference on Management of data, pages 2–11, Washington
D.C. USA. ACM.