Elevating Data Science Maturity: Toward a Process Model that
Harnesses MLOps
Christian Haertel 1,a, Daniel Staegemann 1,b, Matthias Pohl 2,c and Klaus Turowski 1,d
1 Magdeburg Research and Competence Cluster VLBA, Otto-von-Guericke-University, Magdeburg, Germany
2 Institute of Data Science, German Aerospace Center (DLR), Jena, Germany
Keywords: Data Science, Project Management, Machine Learning, MLOps.
Abstract:
Data Science (DS) uses advanced analytical methods, such as Machine Learning, to extract value from data to
improve organizational performance. However, numerous DS projects fail due to the complexity and difficulty
of handling various managerial and technical challenges. Because of shortcomings in existing DS methodolo-
gies, new standardized approaches for DS project management are needed that respect both the business and
data perspectives. This paper outlines the concept for a DS process model that addresses common problems in DS,
including a low level of process maturity and a lack of reproducibility. The artifact is developed
using the Design Science Research methodology and relies on MLOps principles to support the development
and operationalization of the analytical artifacts in DS projects.
1 INTRODUCTION
With the prospect of improving firm performance
in various aspects by extracting valuable knowledge
from data (Müller et al., 2018; Wamba et al., 2017;
Chen et al., 2012), Data Science (DS) has attracted
significant interest. DS constitutes an interdisci-
plinary field, involving a complex socio-technical
process (Sharma et al., 2014; Thiess and Müller,
2018), different technologies (Haertel et al., 2023b),
and diverse competencies (Holtkemper and Beecks,
2024). Unfortunately, most DS projects fail (Venture-
Beat, 2019; Hotz, 2024), indicating a low maturity
in DS for many organizations. According to Gökalp
et al. (2021), achieving success in DS initiatives re-
quires managing organizational and technical aspects
and their various challenges, including poor team co-
ordination, reproducibility, and low process maturity
(Martinez et al., 2021).
Hence, effective project management (PM) is con-
sidered fundamental for positive project outcomes,
especially in DS (Martinez et al., 2021; Saltz and
Shamshurin, 2016). The particularities of DS ne-
cessitate dedicated DS methodologies for its unique
a https://orcid.org/0009-0001-4904-5643
b https://orcid.org/0000-0001-9957-1003
c https://orcid.org/0000-0002-6241-7675
d https://orcid.org/0000-0002-4388-8914
tasks and skillsets. While multiple DS process mod-
els (e.g., CRISP-DM) from academic and industry
backgrounds are available (Haertel et al., 2022), the
literature suggests several shortcomings, imposing
the need for new standardized approaches (Saltz and
Krasteva, 2022). Beyond their underutilization in real
DS projects, aspects such as the definition of roles
and responsibilities, dependency on certain service
providers, and reproducibility stand out as worth im-
proving in contemporary DS process models (Mar-
tinez et al., 2021; Schulz et al., 2020).
Furthermore, PM encompasses more than plan-
ning, budgeting, solving conflicts, and managing re-
quirements (Iriarte and Bayona, 2020; Gökay et al.,
2023) and also extends to supporting technical facets
of the undertaking (Haertel et al., 2023b). In DS, this
relates to advanced analytics like Machine Learning
(ML), which are often leveraged to gain value from
data (Rahlmeier and Hopf, 2024). Yet, the develop-
ment, deployment, and maintenance of ML is a diffi-
cult venture, suffering from poor traceability, insufficient
(data) quality assurance (Sculley et al., 2015), and a lack
of automation (Kreuzberger et al., 2023). Nonethe-
less, guidance for analytics is often neglected in cur-
rent DS process models (Martinez et al., 2021). For
these aspects, ML Operations (MLOps) is a promis-
ing paradigm that extends the DevOps principles
(Symeonidis et al., 2022) to DS to support the cre-
ation of mature, efficient, and robust ML systems by
leveraging principles such as data, model, and code
versioning, workflow orchestration, Continuous Inte-
gration (CI), Deployment (CD), and Training (CT)
(Kreuzberger et al., 2023).
Generally, an effective methodology for DS
management should cover the areas of project, team,
data, and information management (Martinez et al.,
2021). Therefore, we argue that the incorporation of
MLOps into a standardized DS lifecycle workflow
that prescribes common tasks and deliverables and
defines team roles and responsibilities can mitigate
common managerial and technical challenges and
thus contribute to increasing maturity in DS. Yet,
despite its potential, to the best of our knowledge, no
process model for DS currently integrates MLOps
principles to guide analytical model building, op-
erationalization, and maintenance. Therefore, the
following research question (RQ) is formulated:
RQ: How can a Data Science process model
be designed that incorporates MLOps principles to
improve end-to-end Data Science maturity?
Such an artifact can provide valuable contribu-
tions for scholars and practitioners. For the former,
the development of a holistic DS process model fea-
turing the integration of MLOps principles addresses
a significant gap in the academic body of knowledge.
Furthermore, DS practitioners can apply this artifact
to manage DS initiatives since it aims to address both
organizational and technical process aspects.
The rest of the manuscript is structured as follows.
After introducing the methodology and the theoreti-
cal foundation for this research, typical DS challenges
and avenues to address them are discussed. This es-
tablishes the basis for the concept of the MLOps-
based DS process model, which is presented in de-
tail in the fifth section. The paper closes with an out-
look on the next steps for progressing this research
endeavor in the future, emphasizing artifact formal-
ization and evaluation.
2 METHODOLOGY
To ensure rigor, the Design Science Research (DSR)
methodology is adopted (Hevner et al., 2004). In par-
ticular, we rely on the DSR approach of Peffers et al.
(2007), encompassing a nominal process of six stages
for the design, development, and evaluation of an ar-
tifact. Taking a problem-centered entry point, the first
stage involves problem identification.
Problem Identification and Motivation. The high
failure rate in conducting DS initiatives (Hotz, 2024)
is attributed to numerous managerial and technical
difficulties, including a low level of process matu-
rity, poor team coordination, a lack of knowledge re-
tention, reproducibility, quality assurance checks, and
low data quality (Martinez et al., 2021). While mul-
tiple DS process models can be found in the litera-
ture (Haertel et al., 2022), these methodologies dis-
play several flaws (Martinez et al., 2021) in providing
sufficient support for addressing common DS chal-
lenges. Hence, the call for new approaches for DS
PM (Saltz and Krasteva, 2022) indicates a low matu-
rity in DS.
Objectives of a Solution. Accordingly, an artifact in
this regard should primarily contribute to standardiz-
ing and consistently implementing DS processes, as
the success of DS projects depends, inter alia, on the
comprehensive management of their managerial and
technical aspects (Gökalp et al., 2021). Consequently,
the frequently encountered issues, which are detailed
in the next section, need to be addressed (Haertel
et al., 2023a; Martinez et al., 2021) by the solution
to support DS project execution.
Design and Development. Therefore, this research
aims to develop a DS process model that tackles the
aforementioned obstacles and contributes to improv-
ing end-to-end DS maturity. This artifact is designed
by integrating a standardized DS project workflow,
including common tasks, team roles and responsibili-
ties, and deliverables with MLOps principles from the
literature. In DSR terms, the artifact is characterized
as a method, since it offers “actionable instructions
that are conceptual” (Peffers et al., 2012). A draft of
this artifact is described in the fifth section.
Demonstration. To showcase the applicability of the
artifact for addressing the problem, it will be demon-
strated through various DS case studies in the future.
Evaluation. To verify the suitability of the arti-
fact to fulfill the formulated objectives, the evalu-
ation of the proposed method is conducted accord-
ing to the Build-Evaluate pattern of Sonnenberg and
vom Brocke (2012), consisting of ex ante and ex post
evaluation activities. In particular, evaluation crite-
ria such as feasibility, clarity, understandability, com-
pleteness, and effectiveness need to be considered to
assess the artifact’s asserted contribution (knowledge
claims) (Larsen et al., 2025). Thus, we will mainly
rely on case studies and expert feedback from practi-
tioners with different roles in the context of DS.
Communication. The intermediate and completed
results of this research endeavor are and will be dis-
seminated to different scientific outlets.
3 THEORETICAL BACKGROUND
First, key terminology needs to be clarified. DS de-
notes the interdisciplinary field that aims to synthe-
size “useful knowledge directly from data through a
process of discovery or of hypothesis formulation and
hypothesis testing” (Chang and Grady, 2019). The
concept of maturity, generally understood as a mea-
sure of the quality of an organization’s operations in
a certain domain, can also be applied to DS. In this
context, DS maturity indicates how well an organi-
zation standardizes and consistently implements DS
processes (Gökalp et al., 2021). To support DS initia-
tives in this regard, DS process models aim to provide
a chronological and logical sequence of stages, tasks,
and best practices. For example, based on the work
of Haertel et al. (2022), a DS project can be roughly
structured into six stages, as shown in Figure 1.
As the entry point, Business Understanding in-
volves a situation assessment, defining project ob-
jectives, forming a project team, and creating a
project plan. Next, the Data Collection, Exploration
and Preparation phase includes data acquisition, ex-
ploratory analysis, and consequent data preparation.
This leads to the Analysis stage, where analytical
models are developed and assessed based on the DS
objectives. Here, the process of extracting knowl-
edge from large datasets typically requires advanced
analytics techniques like ML (Rahlmeier and Hopf,
2024). ML encompasses algorithms that learn from
training data to uncover hidden insights and com-
plex patterns without explicit programming (Janiesch
et al., 2021; Bishop, 2006). This capability enables
reliable, data-driven decisions across various applica-
tions. Evaluation reviews whether the models meet
the initial business goals. Following a positive checkpoint
decision, Deployment entails planning, testing, and
implementing the analytical artifacts into the produc-
tion environment, depending on the type of DS prod-
uct. Utilization includes monitoring and maintenance
tasks (Haertel et al., 2022).
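As a minimal illustration of the Analysis and Evaluation stages, the following Python sketch fits a classifier to training data and assesses it on held-out data; the dataset and model choice are arbitrary assumptions for demonstration, not prescriptions of the lifecycle:

from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Analysis: the model learns patterns from the training data rather than
# being explicitly programmed with rules.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = DecisionTreeClassifier().fit(X_train, y_train)

# Evaluation: assess whether the model meets the analytical objectives.
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))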
Recently, MLOps emerged as a paradigm that in-
cludes best practices, concepts, and cultural aspects
for the end-to-end implementation and scalability of
ML products (Kreuzberger et al., 2023). MLOps is
based on key DevOps principles (Symeonidis et al.,
2022) that are extended to the analytics domain. For
example, CI/CD is in place for build, test, deliv-
ery, and deployment steps for data and ML pipelines,
enabling automation together with a workflow or-
chestration engine. Components such as a source
code repository, feature store, metadata store, and
model registry implement versioning of data, mod-
els, and code in addition to ensuring reproducibility
and traceability of processes (e.g., for ML experi-
ments) (Kreuzberger et al., 2023). While CI/CD han-
dle the deployment of the ML pipelines, in produc-
tion, Continuous Monitoring (CM) periodically eval-
uates ML components (data, model, and infrastruc-
ture) to detect quality issues (Makinen et al., 2021)
like concept drift, which denotes changes in the
relationship between input data and the target variables
over time (Gama et al., 2014), potentially necessitating peri-
odic retraining (Continuous Training (CT)) on new
data (Kreuzberger et al., 2023). MLOps can be im-
plemented at varying maturity and automation lev-
els, with Google defining three degrees of MLOps
(Kazmierczak et al., 2024).
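To illustrate the CM principle, the following Python sketch checks live feature data for distributional drift against a training-time reference; the two-sample Kolmogorov-Smirnov test and the significance level are assumptions chosen for demonstration, as the cited sources do not prescribe a specific test:

import numpy as np
from scipy.stats import ks_2samp

def detect_feature_drift(reference: np.ndarray, live: np.ndarray,
                         alpha: float = 0.01) -> list[int]:
    """Return indices of features whose live distribution deviates
    significantly from the training-time reference distribution."""
    drifted = []
    for i in range(reference.shape[1]):
        # Two-sample KS test between reference and live values of feature i.
        _, p_value = ks_2samp(reference[:, i], live[:, i])
        if p_value < alpha:
            drifted.append(i)
    return drifted

# A CT component could react to detected drift, for example:
# if detect_feature_drift(X_reference, X_live): trigger_retraining()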
4 DATA SCIENCE CHALLENGES
AND THEIR MITIGATION
STRATEGIES
DS projects often fail to reach a successful conclu-
sion (Hotz, 2024), which is, inter alia, attributed to
challenges related to PM, team management, and data
and information management (Martinez et al., 2021).
Therefore, to improve the maturity in DS, frequently
encountered obstacles need to be addressed by a DS
PM approach (Martinez et al., 2021; Haertel et al.,
2023a). In the following, some key issues in DS and
propositions to mitigate them are discussed. The first
subsection focuses on aspects related to the project
and team. Afterward, data- and information-centric
obstacles are addressed.
4.1 Process Maturity and Team
Coordination
While several DS process models can be found in
gray and academic literature, various shortcomings
are evident, undermining the availability of integral
DS methodologies (Kutzias et al., 2023; Martinez
et al., 2021; Schulz et al., 2020). According to an ex-
tensive survey of DS process models, many method-
ologies underrepresent management tasks, and con-
crete deliverables for the individual DS tasks are lack-
ing (Haertel et al., 2022). Consequently, in practice,
many DS projects are managed ad hoc and forgo
the use of DS methodologies to support the execution
(Kutzias et al., 2023). Accordingly, a fundamental
prerequisite of a DS process model to address the low
level of process maturity is a well-defined DS lifecy-
cle workflow that determines which tasks need to be
performed, and how, throughout the initiative (Martinez
et al., 2021).
Figure 1: Data Science Lifecycle, adapted from Haertel et al. (2022).
Due to the variety and complexity of DS tasks,
diverse competencies are required in a DS team
(Holtkemper and Beecks, 2024), and effective coor-
dination is crucial (Martinez et al., 2021). As defin-
ing team roles and responsibilities is considered inte-
gral in PM (Mishra et al., 2022), DS process models
should also cover this aspect. However, most method-
ologies (e.g., CRISP-DM) do not adequately address
roles and responsibilities throughout the DS lifecycle
(Haertel et al., 2022; Saltz and Krasteva, 2022). Tra-
ditionally, the Data Scientist has been the key figure
in DS, requiring a diverse skillset (e.g., Data Anal-
ysis, Data Engineering, software development, PM,
domain knowledge) (Demchenko et al., 2016), which
is rarely consolidated in one individual (Schumann
et al., 2016). Accordingly, a DS methodology needs
to account for team coordination between different,
interdisciplinary actors (Martinez et al., 2021). As
the literature features a plethora of job profiles related
to DS and their specific demand can differ depending
on project scope, scale, or industry, it is sen-
sible to define broader DS role categories that cover
the general responsibilities necessary across the DS
lifecycle. Typical DS team role groups are therefore
Domain Expertise; Strategy & Project Management;
Data Management & Analysis, covering data access,
exploration, and preparation; Data Infrastructure &
Operations, responsible for the technological basis
for data analyses and all modeling activities; and
Analytical Modeling, which leads the development
and evaluation of ML models (Haertel et al.,
2025). To facilitate team coordination with a process
model, the individual DS workflow tasks need to be
assigned to the respective responsible actors.
4.2 Reproducibility and Analytical
Models
In many DS process models, a guiding framework for
analytics (i.e., ML) is only superficially covered, and
vendor dependencies hamper adoption (e.g., Microsoft’s Team DS Process) (Martinez et al., 2021).
Building ML models, which involves data input, fea-
ture extraction, model construction, and assessment
(Janiesch et al., 2021), can be complicated, with nu-
merous pitfalls (Kreuzberger et al., 2023). Develop-
ment is just one aspect, and the deployment and main-
tenance of the models are equally important, specif-
ically in the context of changing data and business
environments (Gerhart et al., 2023). In DS projects,
these activities are often accompanied by issues with
reproducibility, knowledge retention, a lack of qual-
ity assurance checks, and low data quality for ML
(Martinez et al., 2021), which hinder leveraging the
full potential of ML. These challenges stress the need
for including ML guidance, particularly with empha-
sis on traceability and preservation of knowledge and
versions about data, models, and code in DS method-
ologies (Martinez et al., 2021).
This requirement aligns with the objectives of
MLOps, which address the typical challenges in
DS around ML. CI and CD automation foster de-
ployment, quality assurance, and flexibility. Trace-
ability and reproducibility of experiments, artifacts,
and pipelines are achieved through components such
as the feature store, model registry, and metadata
store. Thus, MLOps contributes to enhanced trans-
parency in ML processes, improving collaboration
(Kreuzberger et al., 2023). For operationalization, the
literature suggests that strong capabilities in DevOps
are crucial to ensure and maintain the quality of ML
systems and the consequent realization of business
value (Shollo et al., 2022), which MLOps accounts
for with CM and CT (Kreuzberger et al., 2023).
Thus far, no DS process model integrates MLOps
practices and components. Despite its potential to
standardize ML development, deployment, and moni-
toring, MLOps cannot address all challenges encoun-
tered in DS projects. Existing MLOps frameworks
mostly neglect the aspects of Business Understand-
ing and Evaluation (Haertel et al., 2023c). A missing
link between organizational objectives and the tech-
nical perspective can lead to the DS results not be-
ing used by the business (Martinez et al., 2021), espe-
cially when investigating the wrong or no analytical
question (Leek and Peng, 2015; Thiess and Müller,
2018). Therefore, MLOps practices should be embed-
ded in a DS methodology to combine the data-centric
view with the management perspective to effectively
support DS project execution (Haertel et al., 2023c;
Martinez et al., 2021).
5 A CONCEPT FOR AN
MLOps-BASED DATA SCIENCE
PROCESS MODEL
This section presents the high-level concept of the
MLOps-based DS process model, illustrated in Fig-
ure 2. For this draft, the notation is oriented toward
the Business Process Model and Notation (BPMN)
standard and will be formalized and detailed with ap-
propriate submodels as a next step. The model aims
to address the discussed managerial and technical is-
sues in DS from the literature by incorporating the
proposed solutions to mitigate them. In particular, for
its construction, the MLOps principles and compo-
nents (Kreuzberger et al., 2023) are integrated into the
DS workflow of Haertel et al. (2022) due to its origins
in the best practices of 28 existing DS process mod-
els. Furthermore, the method indicates the responsi-
ble team roles of major DS tasks, identified based on
a survey of DS roles and competencies (Haertel et al.,
2025). Consequently, this artifact combines a struc-
ture for managing organizational and technical pro-
cess aspects, which is crucial for the maturity and suc-
cess of data-driven initiatives (Gökalp et al., 2021).
The process model is technology-agnostic, allowing
the realization of the individual components through
different tool vendors and (MLOps) automation lev-
els.
First, a notable challenge is bridging the gap be-
tween the business aspects and technical realization.
DS projects come in different variations (e.g., data
characteristics, type of ML product, timeliness of an-
alytics, type of analytical problem), imposing differ-
ent implications for the undertaking. Essentially, the
encountered challenges in DS project execution are
influenced by the project characteristics (Saltz et al.,
2017). Therefore, a DS categorization model, devel-
oped based on the analysis of a significant number of
DS case studies, aims to assign the DS project pro-
posal at hand to a cluster of similar undertakings so
that inferences for guiding the project execution can
be drawn from the assigned category. This con-
stitutes valuable input for the Business Understanding
phase and can include aspects such as tools, required
data analysis and preparation activities, suitable ana-
lytical models, and conclusions for deployment and
monitoring.
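As the categorization model is still under development, the following Python sketch merely illustrates the intended assignment mechanism; the project attributes, cluster profiles, and similarity scoring are hypothetical placeholders, not results of the case study analysis:

from dataclasses import dataclass

@dataclass
class ProjectProposal:
    # Hypothetical project characteristics used for categorization.
    data_volume: str    # e.g., "small", "large"
    ml_product: str     # e.g., "dashboard", "embedded model"
    timeliness: str     # e.g., "batch", "real-time"
    problem_type: str   # e.g., "classification", "forecasting"

# Hypothetical cluster profiles; in the envisioned model, these would be
# derived from the analyzed DS case studies.
CLUSTERS = {
    "realtime_prediction": {"timeliness": "real-time",
                            "problem_type": "classification"},
    "batch_forecasting": {"timeliness": "batch",
                          "problem_type": "forecasting"},
}

def categorize(proposal: ProjectProposal) -> str:
    """Assign the proposal to the cluster sharing the most attributes."""
    def overlap(profile: dict) -> int:
        return sum(getattr(proposal, k) == v for k, v in profile.items())
    return max(CLUSTERS, key=lambda name: overlap(CLUSTERS[name]))

# Example: categorize(ProjectProposal("large", "embedded model",
#                                     "real-time", "classification"))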
Depending on the requirements of the use case
and the conclusions drawn from the categorization
model output, the actual DS lifecycle (Haertel et al.,
2022) starts with the Business Understanding activ-
ity to plan and prepare the project. Afterward, in
case of continuation, relevant data from the identified
data sources are acquired and explored to enable de-
veloping the data pipeline to streamline data prepara-
tion steps. Created features and processed datasets
are provided to a feature store to accelerate model
building and predictions (Kreuzberger et al., 2023)
and promote reusability. In the experimentation con-
ducted in the development environment, various (ML)
models are trained and tested based on the feature
data to fulfill the DS objectives. Due to the explo-
rative nature of DS (Das et al., 2015), a separation of
environments is essential. To ensure reproducibility
and traceability, tracking and logging of ML metadata
(e.g., training time, (hyper)parameters, performance
metrics, and model lineage) and models are imple-
mented for all training runs and stored in the model
registry and metadata store (Kreuzberger et al., 2023).
All code artifacts for data and modeling are further
held in a source code repository for improved collab-
oration.
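To illustrate, the following Python sketch shows how such tracking of training runs could look in practice; MLflow is used here only as one exemplary backend, consistent with the technology-agnostic character of the process model, and the run name, parameter, and metric names are illustrative assumptions:

import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def train_and_track(X_train, y_train, X_val, y_val, n_estimators: int = 100):
    with mlflow.start_run(run_name="experiment-1"):
        # Log (hyper)parameters so every training run is reproducible.
        mlflow.log_param("n_estimators", n_estimators)
        model = RandomForestClassifier(n_estimators=n_estimators)
        model.fit(X_train, y_train)
        # Log performance metrics to the metadata store.
        accuracy = accuracy_score(y_val, model.predict(X_val))
        mlflow.log_metric("val_accuracy", accuracy)
        # Version the trained model artifact; with a registry-capable
        # tracking backend, registered_model_name= would additionally
        # register it in the model registry.
        mlflow.sklearn.log_model(model, "model")
    return model, accuracy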
After a positive conclusion regarding the satisfac-
tion of business goals in the Evaluation, the Deploy-
ment phase commences, primarily executed by the
Data Infrastructure & Operations team. Instead of
merely commissioning an analytical model, an entire
orchestrated ML pipeline is deployed to the produc-
tion environment. Therefore, the CI/CD component is
leveraged to quickly execute the build, test, delivery,
and deployment tasks, enhancing productivity and en-
abling fast adaptability of the system (Kreuzberger
et al., 2023).
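The following self-contained Python sketch conveys the notion of such an orchestrated pipeline as an ordered sequence of steps; all stage bodies are placeholder assumptions, since a real deployment would realize each step as a task in a workflow orchestration engine triggered via CI/CD:

from typing import Any, Callable

def extract_features(_: Any) -> list[list[float]]:
    # Placeholder for reading curated feature data from the feature store.
    return [[0.1, 0.2], [0.3, 0.4]]

def train_model(features: list[list[float]]) -> dict:
    # Placeholder for the (re)trainable model-building step.
    return {"weights": [sum(column) for column in zip(*features)]}

def assess_model(model: dict) -> dict:
    # Placeholder quality gate before the model is registered.
    assert model["weights"], "model assessment failed"
    return model

def push_to_registry(model: dict) -> dict:
    # Placeholder for recording the production-ready model in the registry.
    print("registered model:", model)
    return model

PIPELINE: list[Callable] = [extract_features, train_model,
                            assess_model, push_to_registry]

def run_pipeline(payload: Any = None) -> Any:
    """Execute the pipeline stages in order, passing artifacts along."""
    for step in PIPELINE:
        print("running step:", step.__name__)
        payload = step(payload)
    return payload

if __name__ == "__main__":
    run_pipeline()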
Fed with curated feature data from the feature
store, the deployed ML pipeline is executed, resulting
in production-ready model(s) that are, similar to the
training metadata, recorded in the model registry. Via
CD, the generated model is made available to the ap-
plication domain through the serving component, en-
abling it to support business value creation. Further-
more, a monitoring component supervises model and
system performance and health through a broad set
of metrics (e.g., model accuracy, resource utilization).
Anomalies are investigated and handled by the oper-
ations and analytics team. For instance, the retraining
of the model is enabled via the CT component, which
can trigger the ML pipeline automatically or be sub-
ject to human evaluation beforehand. If the mainte-
nance activities fail to mitigate errors or restore model
and system performance to an acceptable level, a
reinitiation of the DS project can be decided when the
objective of the solution is still considered relevant for
the organization. Otherwise, the system is discontinued.
Figure 2: DS process model (concept) based on Haertel et al. (2022) and Kreuzberger et al. (2023).
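The maintenance logic described above can be condensed into the following Python sketch; the accuracy threshold, the human-approval hook, and the retraining stub are illustrative assumptions:

def human_approves_retraining() -> bool:
    # Placeholder for a manual review before CT triggers the ML pipeline.
    return True

def trigger_retraining_pipeline() -> float:
    # Placeholder: rerun the production ML pipeline on fresh data and
    # return the validation accuracy of the retrained model.
    return 0.90

def handle_degraded_performance(current_accuracy: float,
                                threshold: float = 0.85,
                                require_approval: bool = True) -> str:
    """Decide how to react to a monitoring alert on model performance."""
    if current_accuracy >= threshold:
        return "no action"
    if require_approval and not human_approves_retraining():
        return "alert operations and analytics team"
    if trigger_retraining_pipeline() >= threshold:
        return "retrained model redeployed via CD"
    # Maintenance could not restore acceptable performance: escalate the
    # decision on reinitiating the DS project or discontinuing the system.
    return "escalate: reinitiate project or discontinue system"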
In summary, the proposed artifact aims to ad-
dress common DS challenges as follows. The fea-
ture store, model registry, metadata store, and the
source code repository enable versioning, knowledge
retention, transparency, and reproducibility. CI/CD
and the orchestration of data and ML pipelines fos-
ter automation and efficiency. Through the separation
of environments and the components for monitoring
and CT, system quality and robustness are ensured.
These MLOps practices are incorporated into an end-
to-end DS workflow with defined team roles to es-
tablish process maturity and aid coordination. Never-
theless, while the artifact prescribes the relevant ab-
stract technical components, it does not provide con-
crete guidance for setting up the generally necessary
IT infrastructure for DS projects.
For the future formalization of the process model,
the involvement of the actors can be depicted more
clearly (e.g., via swimlanes). Furthermore, the data
and artifact flow between the components is subject
to a detailed specification. The verification of the arti-
fact’s utility to fulfill the intended objectives requires
a comprehensive evaluation. Therefore, we rely on
the Build-Evaluate pattern of Sonnenberg and vom
Brocke (2012). Briefly, the applicability of the model
is demonstrated by its instantiation in multiple DS
case studies. This will be supplemented by detailed
expert feedback. Here, beyond the artifact’s effective-
ness, its understandability and clarity are major con-
cerns to ensure that practitioners are able to use this
approach for managing DS projects. Accordingly, ex-
perts with different roles and skillsets in the context
of DS should be considered.
6 CONCLUSION AND FUTURE
WORK
The need for new approaches (e.g., process models)
for DS (Saltz and Krasteva, 2022) is motivated by the
high failure rates of DS initiatives (Hotz, 2024) due to
various managerial and technical challenges. Many
DS projects do not follow an established methodol-
ogy, and gaps in current DS process models are high-
lighted in the literature (Kutzias et al., 2023; Martinez
et al., 2021; Schulz et al., 2020). Consequently, this
paper reports on an ongoing research endeavor that
aims to develop an MLOps-based process model for
DS through the adoption of the DSR methodology of
Peffers et al. (2007). A high-level concept of this
artifact is presented, including a discussion on how
it aims to address managerial and data-centric chal-
lenges in DS. The focus of future work will be placed
on continuing the development of the artifact and its
components, including formalization with established
modeling notation. Furthermore, the evaluation will
play a major role in assessing its feasibility for the
different flavors of DS projects. Moreover, the poten-
tial of incorporating Generative AI capabilities into
DS process models for increasing automation and ef-
ficiency in DS project tasks can be explored.
REFERENCES
Bishop, C. M. (2006). Pattern Recognition and Machine
Learning.
Chang, W. L. and Grady, N. (2019). NIST Big Data Inter-
operability Framework: Volume 1, Definitions.
Chen, H., Chiang, R. H. L., and Storey, V. C. (2012). Busi-
ness Intelligence and Analytics: From Big Data to Big
Impact. MIS Quarterly, 36(4):1165–1188.
Das, M., Cui, R., Campbell, D. R., Agrawal, G., and Ram-
nath, R. (2015). Towards Methods for Systematic Re-
search on Big Data. 2015 IEEE International Confer-
ence on Big Data, pages 2072–2081.
Demchenko, Y., Belloum, A., Los, W., Wiktorski, T.,
Manieri, A., Brocks, H., Becker, J., Heutelbeck, D.,
Hemmje, M., and Brewer, S. (2016). EDISON Data
Science Framework: A Foundation for Building Data
Science Profession for Research and Industry. In 2016
IEEE International Conference on Cloud Computing
Technology and Science (CloudCom), pages 620–626.
IEEE.
Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., and
Bouchachia, A. (2014). A survey on concept drift
adaptation. ACM Computing Surveys, 46(4):1–37.
Gerhart, N., Torres, R., and Giddens, L. (2023). Challenges
in the Model Development Process: Discussions with
Data Scientists. Communications of the Association
for Information Systems, 53(1):591–611.
Gökalp, M., Gökalp, E., Kayabay, K., Kocyigit, A., and
Eren, P. (2021). Data-driven manufacturing: An as-
sessment model for data science maturity. Journal of
Manufacturing Systems, 60:123–132.
Gökay, G. T., Nazlıel, K., Şener, U., Gökalp, E., Gökalp,
M. O., Gençal, N., Dağdaş, G., and Eren, P. E. (2023).
What Drives Success in Data Science Projects: A Tax-
onomy of Antecedents. In García Márquez, F. P.,
Jamil, A., Eken, S., and Hameed, A. A., editors, Com-
putational Intelligence, Data Analytics and Applica-
tions, volume 643 of Lecture Notes in Networks and
Systems, pages 448–462. Springer International Pub-
lishing, Cham.
Haertel, C., Daase, C., Staegemann, D., Nahhas, A., Pohl,
M., and Turowski, K. (2023a). Toward Standard-
ization and Automation of Data Science Projects:
MLOps and Cloud Computing as Facilitators. In
Proceedings of the 15th International Joint Confer-
ence on Knowledge Discovery, Knowledge Engineer-
ing and Knowledge Management, pages 294–302.
SCITEPRESS - Science and Technology Publications.
Haertel, C., Holtkemper, M., Staegemann, D., Beecks, C.,
and Turowski, K. (2025). Unveiling Data Science
Team Roles and Competencies: A Literature-Based
Analysis. AMCIS 2025 Proceedings.
Haertel, C., Pohl, M., Nahhas, A., Staegemann, D., and Tur-
owski, K. (2022). Toward A Lifecycle for Data Sci-
ence: A Literature Review of Data Science Process
Models. PACIS 2022 Proceedings.
Haertel, C., Pohl, M., Nahhas, A., Staegemann, D., and Tur-
owski, K. (2023b). A Survey of Technology Selection
Approaches in Data Science Projects. AMCIS 2023
Proceedings.
Haertel, C., Staegemann, D., Daase, C., Pohl, M., Nahhas,
A., and Turowski, K. (2023c). MLOps in Data Science
Projects: A Review. 2023 IEEE International Confer-
ence on Big Data (BigData), pages 2396–2404.
Hevner, A. R., March, S. T., and Park, J. (2004). Design
Science in Information Systems Research. MIS Quar-
terly, Vol. 28(No. 1):75–106.
Holtkemper, M. and Beecks, C. (2024). Empowering Data
Science Teams: How Automation Frameworks Ad-
dress Competency Gaps Across Project Lifecycles.
In 2024 IEEE International Conference on Big Data
(BigData), pages 3134–3142. IEEE.
Hotz, N. (2024). Why Big Data Science & Data Analytics
Projects Fail.
Iriarte, C. and Bayona, S. (2020). IT projects success fac-
tors: a literature review. International Journal of In-
formation Systems and Project Management, 8(2):49–
78.
Janiesch, C., Zschech, P., and Heinrich, K. (2021). Ma-
chine learning and deep learning. Electronic Markets,
31(3):685–695.
Kazmierczak, J., Salama, K., and Huerta, V. (2024).
MLOps: Continuous delivery and automation
pipelines in machine learning.
Kreuzberger, D., Kühl, N., and Hirschl, S. (2023). Machine
Learning Operations (MLOps): Overview, Definition,
and Architecture. IEEE Access, 11:31866–31879.
Kutzias, D., Dukino, C., Kötter, F., and Kett, H. (2023).
Comparative Analysis of Process Models for Data
Science Projects. Proceedings of the 15th Interna-
tional Conference on Agents and Artificial Intelli-
gence (ICAART 2023), pages 1052–1062.
Larsen, K. R., Lukyanenko, R., Mueller, R. M., Storey,
V. C., Parsons, J., VanderMeer, D., and Hovorka, D. S.
(2025). Validity in Design Science. MIS Quarterly.
Leek, J. T. and Peng, R. D. (2015). What is the ques-
tion? Mistaking the type of question being considered
is the most common error in data analysis. Science,
347(6228):1314–1315.
Makinen, S., Skogstrom, H., Laaksonen, E., and Mikkonen,
T. (2021). Who Needs MLOps: What Data Scientists
Seek to Accomplish and How Can MLOps Help? In
2021 IEEE/ACM 1st Workshop on AI Engineering -
Software Engineering for AI (WAIN), pages 109–112.
IEEE.
Martinez, I., Viles, E., and Olaizola, I. G. (2021). Data Sci-
ence Methodologies: Current Challenges and Future
Approaches. Big Data Research, 24.
Mishra, A., Tripathi, A., and Khazanchi, D. (2022). A Pro-
posal for Research on the Application of AI/ML in
ITPM. International Journal of Information Technol-
ogy Project Management, 14(1):1–9.
Müller, O., Fay, M., and vom Brocke, J. (2018). The Ef-
fect of Big Data and Analytics on Firm Performance:
An Econometric Analysis Considering Industry Char-
acteristics. Journal of Management Information Sys-
tems, 35(2):488–509.
Peffers, K., Rothenberger, M., Tuunanen, T., and Vaezi, R.
(2012). Design Science Research Evaluation. Design
Science Research in Information Systems. Advances in
Theory and Practice. DESRIST 2012., Vol. 7286.
Peffers, K., Tuunanen, T., Rothenberger, M. A., and Chat-
terjee, S. (2007). A Design Science Research Method-
ology for Information Systems Research. Journal of
Management Information Systems, 24(3):45–77.
Rahlmeier, N. and Hopf, K. (2024). Bridging Fields of Prac-
tice: How Boundary Objects Enable Collaboration in
Data Science Initiatives. Wirtschaftsinformatik 2024
Proceedings, 55.
Saltz, J., Shamshurin, I., and Connors, C. (2017). Predict-
ing data science sociotechnical execution challenges
by categorizing data science projects. Journal of the
Association for Information Science and Technology,
68(12):2720–2728.
Saltz, J. S. and Krasteva, I. (2022). Current approaches for
executing big data science projects - a systematic lit-
erature review. PeerJ Computer Science, 8(e862).
Saltz, J. S. and Shamshurin, I. (2016). Big data team pro-
cess methodologies: A literature review and the iden-
tification of key factors for a project’s success. In
2016 IEEE International Conference on Big Data (Big
Data), pages 2872–2879. IEEE.
Schulz, M., Neuhaus, U., Kaufmann, J., Badura, D.,
Kuehnel, S., Badewitz, W., Dann, D., Kloker, S.,
Alekozai, E. M., and Lanquillon, C. (2020). Introduc-
ing DASC-PM: A Data Science Process Model. Aus-
tralasian Conference on Information Systems 2020.
Schumann, C., Zschech, P., and Hilbert, A. (2016). Das auf-
strebende Berufsbild des Data Scientist. HMD Praxis
der Wirtschaftsinformatik, 53(4):453–466.
Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips,
T., Ebner, D., Chaudhary, V., Young, M., Crespo, J.-
F., and Dennison, D. (2015). Hidden Technical Debt
in Machine Learning Systems. Advances in Neural
Information Processing Systems, 28.
Sharma, R., Mithas, S., and Kankanhalli, A. (2014).
Transforming decision-making processes: a research
agenda for understanding the impact of business ana-
lytics on organisations. European Journal of Informa-
tion Systems, 23(4):433–441.
Shollo, A., Hopf, K., Thiess, T., and Müller, O. (2022).
Shifting ML value creation mechanisms: A process
model of ML value creation. The Journal of Strategic
Information Systems, 31(3):101734.
Sonnenberg, C. and vom Brocke, J. (2012). Evaluations
in the Science of the Artificial – Reconsidering the
Build-Evaluate Pattern in Design Science Research.
Design Science Research in Information Systems. Ad-
vances in Theory and Practice. DESRIST 2012., Vol.
7286:381–397.
Symeonidis, G., Nerantzis, E., Kazakis, A., and Papakostas,
G. A. (2022). MLOps - Definitions, Tools and Chal-
lenges. In 2022 IEEE 12th Annual Computing and
Communication Workshop and Conference (CCWC),
pages 0453–0460. IEEE.
Thiess, T. and Müller, O. (2018). Towards Design Prin-
ciples for Data-Driven Decision Making - An Ac-
tion Design Research Project in the Maritime Industry.
ECIS 2018 Proceedings.
VentureBeat (2019). Why do 87% of data science projects
never make it into production?
Wamba, S. F., Gunasekaran, A., Akter, S., Ren, S. J.-f.,
Dubey, R., and Childe, S. J. (2017). Big data analytics
and firm performance: Effects of dynamic capabili-
ties. Journal of Business Research, 70.