The remainder of this paper is organized as follows:
the next section briefly reviews the literature on
similar approaches; the third section details the
methodology; in the fourth, the proposed
local-model-based methodology is applied to real
industrial data and compared with a direct global
model; the last section concludes the paper.
2 STATE OF THE ART
In Industry 4.0, three areas of research may be
distinguished: the hyper-connectivity of the units, the
digital twin and the cognitive plant. Thus far, most of
the work related to Industry 4.0 has dealt with the
first two topics, and far less with the third,
particularly the automatic modeling of industrial systems.
On that topic, the available works are either general,
not oriented toward Industry 4.0, or limited to
discussing the future benefits of such approaches. For
instance, (Cohen et al., 2017) explain how
hyper-connectivity coupled with Artificial Intelligence
should help improve efficiency, with a real use-case
provided as an example, but only theoretically.
Similarly, (Chukalov, 2017) discusses the benefits of
using a CPS for centralized control and how to
integrate it within existing industry.
Perhaps the closest work to the proposed one is
(Baduel et al., 2018), which discusses the notion of
regular states and modes (behaviors) and proposes
unified definitions of these concepts in the scope of
testing, simulation and validation, with a real use-case
provided as an example; however, no modeling of any
sort is actually implemented.
Finally, it is worth mentioning (Thiaw, 2008), who
expands the notion of multi-model to nonlinear dy-
namic systems. The proposed concepts are applied to
model and predict the evolution of a real river flow,
by building a multi-model and training it over histor-
ical values. The proposed multi-model based predic-
tor splits the feature space (composed of the system’s
sensors) into several intervals (sub-spaces) using ba-
sic and simple clustering (such as grid partitioning).
The obtained clusters are used to construct the set
of models and associated membership functions com-
posing the target multi-model. The final prediction is
achieved by using polynomial interpolation of outputs
of constituent models. The accuracy of the results
obtained makes the proposed approach appealing for
modeling complex plants within the context of Industry 4.0.
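To make the multi-model idea concrete, the following is a minimal sketch, not the implementation of (Thiaw, 2008). It assumes a single input feature, a regular grid partition, one least-squares line per cell, and triangular membership functions whose weighted average replaces the polynomial interpolation mentioned above. The names `fit_line`, `build_multi_model` and `predict` are hypothetical.

```python
from typing import List, Sequence, Tuple

Model = Tuple[float, float, float]  # (cell center, slope, intercept)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit y = a*x + b; constant model if all xs are equal."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x
    return a, mean_y - a * mean_x


def build_multi_model(xs: Sequence[float], ys: Sequence[float],
                      lo: float, hi: float,
                      n_cells: int) -> Tuple[List[Model], float]:
    """Grid-partition [lo, hi] into n_cells intervals, one local line each."""
    width = (hi - lo) / n_cells
    models: List[Model] = []
    for k in range(n_cells):
        left = lo + k * width
        right = left + width
        last = k == n_cells - 1  # the last cell also owns the point x == hi
        cell = [(x, y) for x, y in zip(xs, ys)
                if left <= x < right or (last and x == hi)]
        if not cell:
            raise ValueError(f"no training data in cell {k}")
        a, b = fit_line([p[0] for p in cell], [p[1] for p in cell])
        models.append((left + width / 2, a, b))
    return models, width


def predict(models: List[Model], width: float, x: float) -> float:
    """Blend the local outputs using triangular membership functions."""
    weights = [max(0.0, 1.0 - abs(x - c) / width) for c, _, _ in models]
    total = sum(weights)
    if total == 0.0:
        # Assumption: far outside the grid, fall back to the nearest model.
        _, a, b = min(models, key=lambda m: abs(x - m[0]))
        return a * x + b
    return sum(w * (a * x + b)
               for w, (_, a, b) in zip(weights, models)) / total
```

With this choice of memberships, a point at a cell center is predicted by that cell's model alone, while a point on a cell border receives the average of the two neighboring models, which avoids jumps between regions.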
Additionally, a few European projects dealing with the
future cognitive plant are worth mentioning. The
COGNIPLANT project proposes to use the concept of the
digital twin to create a virtualization of real plants,
i.e. a fully virtual model, in which control,
management and modeling can be performed to help the
end-users (Ellinger et al., 2023).
Another project is INEVITABLE, which aims to improve
control over the production chain by extracting as
much information as possible from unlabeled sensor
data, for instance by using Bayesian optimization
(Tomažič et al., 2022).
Finally, another project which should be discussed
is HyperCOG, whose purpose is to investigate the fea-
sibility of the cognitive plant. To that end, fourteen
partners are gathered in the development of intelli-
gent, Machine Learning-based solutions, integrated
into a Cyber-Physical System (Huertos et al., 2021).
This paper and the proposed methodology belong to
that project, and propose a new way to model real
dynamic processes in an Industry 4.0 context.
3 METHODOLOGY
This paper proposes to evaluate the benefits of local
modeling in an industrial context. To that end, the
feature space is split into regions, and every region is
then modeled separately. The final model is obtained
by linking the local models; an input then triggers the
corresponding local models, which provide the output.
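The split-model-dispatch pipeline described above can be sketched as follows. This is an illustration under strong assumptions, not the paper's implementation: the regions are taken to be given as centroids (for instance produced by a clustering step), an input belongs to the region of its nearest centroid, and each local model is simply the mean output of its region, standing in for any real local model. The names `nearest_region` and `LocalModelSet` are hypothetical.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def nearest_region(centroids: Sequence[Point], x: Point) -> int:
    """Index of the region whose centroid is closest to x (Euclidean)."""
    return min(range(len(centroids)), key=lambda k: dist(centroids[k], x))


class LocalModelSet:
    """One local model per region; inputs are dispatched by region."""

    def __init__(self, centroids: Sequence[Point]) -> None:
        self.centroids: List[Point] = [tuple(c) for c in centroids]
        self.local_means: Dict[int, float] = {}

    def fit(self, inputs: Sequence[Point],
            outputs: Sequence[float]) -> "LocalModelSet":
        # Group the training outputs by the region of their input.
        buckets: Dict[int, List[float]] = defaultdict(list)
        for x, y in zip(inputs, outputs):
            buckets[nearest_region(self.centroids, x)].append(y)
        # Placeholder local model: the mean output of each region.
        self.local_means = {k: sum(v) / len(v) for k, v in buckets.items()}
        return self

    def predict(self, x: Point) -> float:
        # The input "triggers" the local model of the region it falls into.
        k = nearest_region(self.centroids, x)
        if k not in self.local_means:
            raise KeyError(f"region {k} received no training data")
        return self.local_means[k]
```

A usage example with two hypothetical regions:

```python
model = LocalModelSet([(0.0, 0.0), (10.0, 10.0)]).fit(
    [(0.0, 1.0), (1.0, 0.0), (9.0, 10.0), (10.0, 9.0)],
    [1.0, 3.0, 10.0, 20.0],
)
print(model.predict((0.5, 0.5)))  # mean output of the first region
```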
Note that this paper aims to assess whether
multi-modeling is suited to the prediction of industrial
processes in the context of Industry 4.0, not whether
clustering can isolate anomalies (for the latter, see
(Molinié et al., 2022)).
3.1 Split of the Feature Space
Even though industrial processes evolve through time,
they do not depend directly on time itself, since a
process should evolve the same way whenever it is fed
with the same material. Therefore, the region
partitioning is performed in the N-dimensional
feature space spanned by the N sensors of the system.
Clustering consists in gathering data sharing simi-
lar features, while also isolating such groups from one
another: the goal is to find the best borders between
the groups so as to minimize an error function.
In order to ensure maximal generalization capability,
the proposed methodology operates in a blind context,
assuming no prior information on the dataset;
therefore, only unsupervised clustering can be
considered, such as K-Means (Lloyd, 1982) or
Self-Organizing Maps (SOMs) (Kohonen, 1982). Following
the observations drawn in (Molinié and Madani, 2022),
this study uses Bi-Level Self-Organizing Maps
(BSOMs), for they proved to be more accurate in
identifying the behaviors of an unknown system than
both SOMs and K-Means, and are resilient to outliers
and sporadic events.
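As an illustration of unsupervised clustering minimizing an error function, the sketch below implements plain K-Means in the spirit of Lloyd's algorithm. It is not the BSOM used in this study. The initial centroids are passed explicitly to keep the run deterministic, and the error is the within-cluster sum of squared distances. The names `assign`, `k_means` and `within_cluster_error` are hypothetical.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def assign(points: Sequence[Point], centroids: Sequence[Point]) -> List[int]:
    """Label every point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points]


def k_means(points: Sequence[Point], initial_centroids: Sequence[Point],
            max_iter: int = 100) -> Tuple[List[Point], List[int]]:
    """Alternate assignment and centroid update until nothing moves."""
    centroids: List[Point] = [tuple(c) for c in initial_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated: List[Point] = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                # New centroid: coordinate-wise mean of the cluster members.
                updated.append(tuple(sum(c) / len(members)
                                     for c in zip(*members)))
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    # Final labels are always consistent with the returned centroids.
    return centroids, assign(points, centroids)


def within_cluster_error(points: Sequence[Point],
                         centroids: Sequence[Point],
                         labels: Sequence[int]) -> float:
    """Sum of squared distances between each point and its centroid."""
    return sum(dist(p, centroids[lab]) ** 2
               for p, lab in zip(points, labels))
```

The returned labels define the regions of the feature space; each region could then be handed to its own local model.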
Behavioral Modeling of Real Dynamic Processes in an Industry 4.0-Oriented Context