Towards the Automation of Industrial Data Science:

A Meta-learning based Approach

Moncef Garouani

1,2,3

, Adeel Ahmad

1 a

, Mourad Bouneffa

, Arnaud Lewandowski

Gregory Bourguin

and Mohamed Hamlich

Univ. Littoral C

ote d’Opale, UR 4491, LISIC, Laboratoire d’Informatique Signal et Image de la C

ote d’Opale,

F-62100 Calais, France

ISSIEE Laboratory, ENSAM, University of Hassan II, Casablanca, Morocco

Study and Research Center for Engineering and Management(CERIM), HESTIM, Casablanca, Morocco

Keywords:

Automated Machine Learning, Manufacturing Big Data, Industry 4.0, Industrial Data Science, Meta-learning.

Abstract:

In context of the fourth industrial revolution (industry 4.0), the industrial big data is subject to grow rapidly to

respond the agile industrial computing and manufacturing technologies. This data evolution can be captured

using ubiquitous integrated sensors and multiple smart machines. We believe the use of data science method-

ologies, for the selection of models and conﬁguration of hyper-parameters, may help to better control such

data evolution. But, at the same time, the industrial practitioners and researchers often lack machine-learning

expertise to directly retrieve the beneﬁt from valuable manufacturing big data. Such a lack poses the major

obstacle to yield value from even-though familiar data. In this case, a collaboration with data scientists may

become an exigence along with the extensive machine learning knowledge which presumably may result to

pursue further delays and effort. Multiple approaches for automating machine learning (AutoML) have been

proposed for the past recent years in order to alleviate this deﬁciency. These approaches are expected to per-

form better along with accomplishment of computing resources which are mostly not readily accessible. To

address this research challenge, in this paper, we propose a meta-learning based approach that may serve an

effective decision support system for the AutoML process.

1 INTRODUCTION

Advanced analytics offers new opportunities to im-

prove and innovate manufacturing processes (Wolf

et al., 2019). The recent advances, in terms of storage

capacity, computing power as well as the rapid devel-

opment of advanced analytics solutions, have offered

manufacturing industries the unprecedented possibil-

ities to extract knowledge and business value from

large datasets (Wang et al., 2018). It may provide

means to achieve the highest predictive performances

instead of traditional predictive modeling approaches.

The use of advanced machine learning (ML) methods

can play a signiﬁcant role in production design, qual-

ity management, scheduling, etc.

The current competitive environment, with im-

proved availability, sustainability, and quality of man-

ufacturing services in smart factories, has already

https://orcid.org/0000-0003-0132-3808

triggered the requirement of using Artiﬁcial Intelli-

gence (AI) solutions to streamline complex operations

while improving quality and reducing costs (Thoben

et al., 2017; Wuest et al., 2016). The manufac-

turing sector can beneﬁt greatly from the use of

advanced analytics since data is abundantly avail-

able (Wolf et al., 2019). Recent manufacturing strate-

gies, such as Industry 4.0 in Germany, Industrial In-

ternet in the United States, and the Made in China

2025 initiative, recognize the crucial importance of

utilizing data in order to enhance manufacturing com-

petitiveness (Thoben et al., 2017; Tao et al., 2018).

However, the manufacturing industry is not exploit-

ing the full potential of data analytics. We observe

the following reasons for such a lack:

• Complexity of unifying the data analytics and mi-

cro services,

• Lack of reliable data ingestion chains,

• Lack of collaboration tools between business

Garouani, M., Ahmad, A., Bouneffa, M., Lewandowski, A., Bourguin, G. and Hamlich, M.

Towards the Automation of Industrial Data Science: A Meta-learning based Approach.

DOI: 10.5220/0010457107090716

In Proceedings of the 23rd International Conference on Enterprise Information Systems (ICEIS 2021) - Volume 1, pages 709-716

ISBN: 978-989-758-509-8

709

managers, data engineers and data-scientists.

The domain experts (business managers and data en-

gineers) are not necessarily competent to perform data

analysis using some data mining techniques. These

users often resort to the default implementations or

commit to collaborations with machine-learning ex-

perts which are mostly complex and time consum-

ing. The process itself, also known as knowledge dis-

covery, consists of several steps such as data prepro-

cessing, features engineering, model selection, hyper-

parameters tuning and ﬁnally model validation or in-

terpretation. The algorithm selection phase is among

the most important steps along with the conﬁguration

of its related hyper-parameters, hence, the conduction

of related tasks can make them less feasible for the

non-experts.

The lack of decision support tools is also a ma-

jor barrier that may prevent the manufacturing actors,

as well as researchers, from harnessing the maximum

potential of data analytics. The decision support tools

can help to rendre the most suitable data analytic tech-

nique from a wide range of possible choices and con-

ﬁgurations (Zacarias et al., 2018). While, the avail-

able literature has brought forward various techniques

capable of solving the manufacturing complex prob-

lems, whereas, the necessary decision support tools

to implement these techniques on an operative level

are yet not sufﬁcient to equip practitioners, decision-

makers, and researchers for better data-analytics.

The work, in this paper, hence attempts an ap-

proach towards overwhelming this obstacle. The

main objective of the proposed approach remains

however to assist the industrial practitioners and re-

searchers in data-science with the help of a recom-

mendation system. This recommendation encompass

the most convenient machine-learning algorithm and

its related hyper-parameters conﬁguration that shall

ultimately give the best result of the analysis for the

domain-speciﬁc problems. In order to achieve that,

we make use of the concept of meta-learning (Brazdil

et al., 2008), which consists of two main phases;

which are learning phase and recommendation phase.

For a given dataset and a predictive metric, we suggest

the ML algorithms and their related hyper-parameters

conﬁguration that once applied yield the best classiﬁ-

cation performance (e.g., predictive accuracy, Recall,

F1 score).

The rest of the paper is organized as follows: Sec-

tion 2 presents a brief review of the related works

in respect of theoretical background about meta-

learning as an AutoML solution. Section 3 provides

an overview of the proposed Meta-Learning based

framework and methodology for supporting the op-

timization decisions of the AutoML process. We

discuss the results and feasibility of the proposed

methodology in Section 4. We evaluate the proposed

approach with the help of an empirical study in sec-

tion 5. Finally, we conclude the contents of the paper

in Section 6 .

2 RELATED WORKS

A careful literature review reveals that the machine-

learning methods have been proposed as a key tech-

nology for the industrial data analytics. Many re-

searchers have focused on the instantiation of Data

Science methods to retrieve beneﬁts in industry 4.0

automation. However, their main concern has been

regarding the ingestion of new data in data lakes to

establish compliance with the governance rules for

the business managers with or without the assistance

of data engineers. Similarly, the data scientists have

been concerned with the capabilities to collaborate

and deploy, independently, their models in produc-

tion.

Similarly, a lot of work has been done in re-

cent years to add explanations to Artiﬁcial Intel-

ligence (AI) systems (Samek and M

uller, 2019; De

et al., 2020; Bohanec et al., 2017), in particular those

based on deep neural networks, which are mostly very

efﬁcient, but also in principle designed and imple-

mented like black boxes. The explanation of reason-

ing has always been felt since the emergence of de-

cision support systems (De et al., 2020; Shin, ) but it

is now more desirable to re-assure legitimate conﬁ-

dence on machine learning applications, particularly

in real-time systems (such as the industry 4.0 applica-

tions). The need for explainability in machine learn-

ing requires the development of interrogeable infor-

mation systems that must allow the transparency of

involved concepts according to the level of abstrac-

tion of the concerned actors. The innovation of AI

has been mostly characterized by the algorithmic ad-

vancements with respect to efﬁcient data analysis.

The aspects of meta-learning (to ﬁnd out the factors

that play a more critical role to better use limited

data) are usually given less focus while the attention

is given to the more computational performance and

more training data.

We constrain the focus of current work on meta-

learning technique due to available space limitation

for the contents. Meta learning is an aspect of

monitoring the progress of machine learning pro-

cesses. Meta-learning techniques provide the method-

ologies to observe the performance of different ma-

chine learning models according to the target out-

come and its correlation to the meta-data (learning

ICEIS 2021 - 23rd International Conference on Enterprise Information Systems

710

experience). This notion is often referred to the un-

derstanding of the reasoning process and involved

parameters while the deduction of target outcome.

But in its very nature meta-learning differs from

meta-reasoning. The fundamental objective of meta-

learning in current machine learning literature is to

enhance the performance of machine learning mod-

els through learning experience. Learning experience

in this regard, concerne the exploitation of the neu-

ral architectures (machine learning pipelines) to elim-

inate the less-worthy intermediate decision points.

The meta-learning in this aspect helps to evaluate

the machine learning model with respect to the ac-

curacy and learning time. It requires transparent

model executions which involve the awareness of

meta-features (algorithm conﬁgurations, network ar-

chitecture, pipeline compositions, etc.) used to train

the model.

Meta-learning or learning to learn, is a commonly

used process that supports automation in data min-

ing tools selection and conﬁguration (Brazdil et al.,

2008). It is a method that consist of observing

relationships between dataset characteristics (Meta-

features) and data mining algorithms performances.

Later for a given unseen dataset the system should be

able to select and rank a pool of learning algorithms

that could yield the best predictive performance ac-

cording to the expected performance metric. The

main idea of meta-learning for advanced analytics se-

lection and conﬁguration is based on the simple as-

sumption Algorithms show similar performance for

the same conﬁguration for similar problems. Particu-

larly, meta-learning paradigm is the process of under-

standing and adapting learning itself on a higher level

instead of starting from scratch, we leverage previ-

ously gained insights (Lemke et al., 2015).

A number of studies explore the application of

meta-learning in various levels. These approaches

range from automatic data pre-processing (Nargesian

et al., 2017), automatic features extraction (Bilalli

et al., 2016; Bilalli et al., 2018) to automatic model

selection and hyper-parameters tuning (Laadan et al.,

2019; Dyrmishi et al., 2019).

In (Bilalli et al., 2016), the authors propose a

meta-learning based approach for automated data pre-

processing. The authors used 28 features which are

extracted from the datasets to train a meta-model.

The meta-model is able to predict the impact of a list

of 7 data transformations strategies on the ﬁnal per-

formance of 5 classiﬁcation algorithms (Logistic Re-

gression, Naive Bayes, IBk, PART, J48). For each

dataset-algorithm pair, possible transformations are

classiﬁed as either good, bad, or neutral by the meta-

model, which corresponds to whether the transforma-

tion increases the prediction accuracy, decreases it or

doesn’t have a signiﬁcant contribution.

Recently, in (Laadan et al., 2019), the authors pro-

pose RankML a meta-learning based approach for

ML pipeline performance prediction. The RankML

produces a ranked list of all pipelines based on their

predicted performance for the given dataset, evalua-

tion metric and the set of candidate pipelines. How-

ever, this approach may not be practical in all situ-

ations, since the system asks to provide the list of

pipelines to rank, that a non-ML expert cannot pro-

duce.

Moreover, some studies (Cohen-Shapira et al.,

2019; Feurer et al., 2019) propose the use of meta-

features and learning to improve the AutoML process.

However, these frameworks do not achieve the goal

of identifying the promising analytics tools as well as

conﬁgurations as a prompt and powerful support for

the manufacturing application areas in the ﬁrst place.

Therefore, they are not suitable for decision-making

at the managerial level (Zacarias et al., 2018).

3 FRAMEWORK OF THE

PROTOTYPE OF VALIDATION

The framework in its actual form includes two inde-

pendent main phases; which are the Learning phase

and the Recommendation phase. The high-level ar-

chitectural description of prototype of the proposed

solution is illustrated in Fig. 1. We discuss in detail

the different components of the framework’s life cy-

cle, in the following sub-sections.

3.1 Learning Phase

The learning phase is performed ofﬂine and consists

of two main steps. In the ﬁrst step a meta-dataset

is established. We extracted 42 dataset characteris-

tics (meta-features)–detailed in section 4.3 from each

dataset. Furthermore, on each dataset, we executed 8

classiﬁcation algorithms (meta-learners), by generat-

ing different predictive performance metrics (predic-

tive accuracy, Recall, precision, F1-score) values. We

primarily evaluate that with a 5-fold stratiﬁed cross

validation strategy. For each data mining algorithm,

we obtained a meta-dataset that is fed to the meta-

knowledge base.

Dataset characteristics and performance measures

altogether are referred to as metadata. In the second

step, meta-learning is performed on top of the meta-

knowledge base. As a result, a predictive meta-model

is generated that can be used to predict a ranked list

Towards the Automation of Industrial Data Science: A Meta-learning based Approach

711

Figure 1: The workﬂow of the proposed framework.

of a classiﬁcation algorithms and their related hyper-

parameters conﬁguration on any new dataset.

3.2 Recommendation Phase

In the recommendation phase, when a user wants to

analyze a new dataset, she selects a predictive metric

to be used for the analysis and then the system auto-

matically recommends a machine-learning algorithm

and its related hyper-parameters conﬁguration to be

applied such that the predictive performance is the

ﬁrst-rate. In order to do that, the system ﬁrst extracts

the dataset meta-features through the meta-features

executor module. Then, the extracted meta-features

are fed to the meta-model to provide the candidate

pipelines. Finally, the suggestion engine, according

to the meta-knowledge base, rank the pipelines in re-

spect to the provided metric. The recommendation

phase algorithm is listed as follows:

4 IMPLEMENTATION

ARCHITECTURE

In this section, we discuss the implementation of the

proposed approach into a prototype solution. As we

already know, the principal phases of the proposed

frame work are Learning phase and Recommendation

phase; these are implemented independently of each

other. In the following, we give the detailed descrip-

tion for each of these phases.

4.1 Datasets

We collected 200 real-world manufacturing classiﬁ-

cation datasets. These have been collected from the

Algorithm 1: The recommendation phase algorithm.

I n p u t :

D = The D a t a s e t

M = P r e d i c t i v e m e t r i c

O u t pu t :

P<P

, P

, . . . , P

> / / The s u g g e s t e d p i p e l i n e s

Method :

Be gi n

/ / c h a r a c t e r i z e t h e new d a t a s e t D

(Metafeatures extraction)

←−−−−−−−−−−−−− M e t a f e a t u r e s (D)

/ / s e l e c t n e a r e s t n e i g h b o r fro m t h e

/ / meta−d a t a s e t

N ei g h b or s ←− min(distance(Y

, Y

)

i=1

/ / Y

( a c t u a l l y 42 ) i s t h e v e c t o r o f t h e

/ / m a s t h e s i z e ( a c t u a l l y 20 0)

/ / o f t r a i n i n g d a t a s e t m e t a f e a t u r e s

/ / c h a r a c t e r i z e t h e c a n d i d a t e n e i g h b o r s

FT ←− N e i g hb o rs M e ta da t a

/ / S u g ge s t e d p i p e l i n e s r a n k i n g o f

/ / t h e new d a t a s e t D

/ / b a se d on t h e p ro v i d e d p e r f o r ma n c e c r i t e r i a

Γ = S u g g e s t i o n E n g i n e ( P

, P

, . . . , P

)

End

popular UCI

, OpenML

, Kaggle

repositories along

with some other real-world scenarios which were

used in the learning phase. These datasets represent a

mix of binary (54%) and multi-class (46%) classiﬁca-

tion tasks.

Although not limited to the problems in machines

level, the data set includes many manufacturing clas-

siﬁcation problems, including tasks such as predictive

maintenance, anomaly detection, shop ﬂoor applica-

tions, among others.

https://archive.ics.uci.edu

https://www.openml.org

https://www.kaggle.com

ICEIS 2021 - 23rd International Conference on Enterprise Information Systems

712

4.2 Meta-learners

In the context of this work, we used 8 popular ML al-

gorithms from scikit-learn, a widely used ML library

implemented in Python. Each algorithm and its re-

lated hyper-parameters are described in Table 1.

4.3 Meta-features

The used meta-features should well describe the

datasets to create an effective meta-model able to rec-

ommend the most suitable pipeline with a high pre-

cision. The features should be good predictors of

the relative performance of algorithms. Several cate-

gories of meta-features have been developed. These

range from simple features such as the number of

instances in a dataset to more complex ones. Most

meta-features belong to one of the following cate-

gories (Vilalta et al., 2004):

4.3.1 Simple, Statistical and

Information-theoretic

Simple features can be rapidly extracted such as the

number of instances, attributes and classes in the

dataset. To some extent, they are designed to measure

the complexity of the underlying problem. Statistical

and Information-theoretic features are designed to de-

scribe the numerical properties of data distribution in

a dataset sample and informations about the numeric

features such as the class entropy, mean skewness of

attributes.

4.3.2 Landmarking

It characterizes the extent of datasets when basic ma-

chine learning algorithms are performed on them.

Some of the examples include the performance of a

Decision Trees (DT), Gaussian Naive Bayes (GNB) or

Linear Discriminant Analysis landmarker (LDA).

4.4 Meta-model

We used the k-Nearest Neighbor (k-NN) algorithm

to induce meta-model able to predict top performer

pipeline. It is often used in recommendation sys-

tems based on meta-learning (Laadan et al., 2019;

Dyrmishi et al., 2019). After identifying the clos-

est neighbors of the dataset using a distance metric

such as “Euclidean Distance”, a weighted average of

each individual neighbor’s actual ranking is used for

computing the candidate dataset’s predicted ranking

of modeling algorithms based on the relevant metric.

Thus, when the meta-learning system is applied to a

new dataset, the suggestion engine returns a list of

the most suitable pipelines, based on the meta-feature

values extracted from the dataset and the evaluation

metric.

5 EVALUATION

We performed an experimental study to evaluate the

performance that can be achieved by using the pro-

posed approach on various manufacturing related

problems. After specifying the experimental environ-

ment, we evaluate the systems ability to predict the

ML algorithms with its hyper-parameters conﬁgura-

tion that shall provide the best result of the analysis.

We benchmark on a highly varied selection of 20

more curated datasets to ensure meaningful evalua-

tion. It covers binary and multi-class classiﬁcation

problems from different industry 4.0 levels. These

data are gathered from state of the art papers deal-

ing with industry 4.0 related problems using machine-

learning solutions as described in the Table 2. It is im-

portant to note that, the selected data sets were never

exploited by any learning method on the ofﬂine phase.

The proposed system exploits the meta-model to

predict all the pipelines in the meta-knowledge base

with respect to the analyzed dataset and then returns

its top-ranked pipelines according to the provided per-

formance criteria. These pipelines are then ﬁtted on

the datasets train set and evaluated on the test set us-

ing the 70% / 30% splitting ratio.

The performance of the proposed system is com-

parable to the results of the datasets treated by the

related papers. Lets us explain this with the help of

data in Table 3. The ﬁrst column of the table cites the

original papers from which we borrow the datasets for

testing and comparison purpose. The second column

gives the recommended ML conﬁgurations (ML algo-

rithm and its related hyper-parameters conﬁgurations)

results generated by the proposed model on same

datasets. The third column indicates the achieved ac-

curacy of each dataset on the related original paper.

The fourth column shows the evaluation results of ap-

plying the default conﬁguration on the recommended

ML algorithm. Evidently, as shown in Table 3, the ob-

tained results are more accurate than the results from

the related papers. It can be observed that some ma-

chine learning oriented manufacturing works could be

improved simply through the use of a better ML algo-

rithm conﬁguration.

The illustrated results reveal the effectiveness of

the AutoML solution in manufacturing data mining

processes. The suggested model conﬁgurations ex-

hibit better performance than the classic supervised

learning techniques with default hyper-parameters

Towards the Automation of Industrial Data Science: A Meta-learning based Approach

713

Table 1: ML algorithms and related hyperparameters tuned in the experiments.

ML algorithm Hyper-parameters

Support Vector Classiﬁer (SVC) C: range (1e-10, 500)

gamma: range (0.001, 1.01)

kernel: [’poly’, ’rbf’]

degree: [2, 3]

coef0: range (0., 10.)

AdaBoost (AB) max depth: range(1, 11)

algorithm: [’SAMME’, ’SAMME.R’]

n estimators: range(50, 501)

learning rate: [0.01, 2]

Gradient Boosting (GB) learning rate: [0.01, 1]

criterion’: [’friedman mse’, ’mse’]

n estimators: range (50, 501)

max depth: range (1, 11)

min samples split: range (2, 21)

min samples leaf: range (1, 21)

max features: [0.1, 0.9]

Extra Trees (ET) & Random Forest (RF) n estimators:[100]

bootstrap: [True, False]

max features: range (0.1, 0.9)

min samples leaf: range (1, 21)

min samples split: range (2, 21)

criterion: [’entropy’, ’gini’]

Decision Tree (DT) max features: range (0.1, 0.9)

min samples leaf: range (1, 21)

min samples split: range (2, 21)

criterion: [’entropy’, ’gini’]

Logistic Regression (LR) C: range (1e-10, 10.)

penalty: [’l2’, ’l1’]

ﬁt intercept: [True, False]

Stochastic Gradient Descent (SGD) loss: [’hinge’, ’log’, ’modiﬁed huber’,

’squared hinge’, ’perceptron’]

penalty:[’l2’, ’l1’, ’elasticnet’]

learning rate:[’constant’, ’optimal’, ’invscaling’]

ﬁt intercept: [True, False]

l1 ratio: range (0., 1.)

eta0: range (0., 5.)

power t: range (0., 5.)

settings and the conﬁgurations by non-ML experts, as

shown in Figure 2.

6 CONCLUSIONS

In this paper, we studied the effectiveness of auto-

mated machine-learning techniques for the selection

and parametrization of ML for the problems more of-

ten related to manufacturing industry. The main ob-

jective of the current work has been focused towards

the design of a decision support system in order to en-

able the non-expert practitioners and data engineers,

prospectively in the domain of industry 4.0 to take

maximum beneﬁt of ML models. The proposed ap-

proach validates the automated selection of ML mod-

els and suggest the optimized hyper-parameters for

their conﬁgurations. The contents of the paper brieﬂy

describe the potential use of automatic machine learn-

ing methods in the 4th industrial revolution ﬁeld. The

proposed approach eventually aims to improve the

conﬁdence level of industrial practitioners to specify

the appropriate conﬁguration in the ML tools as well

as to improve the reliability generalizing the high-risk

and dynamic manufacturing environment.

In the current work, we mainly focus on the clas-

ICEIS 2021 - 23rd International Conference on Enterprise Information Systems

714

Table 2: The sample list of datasets used in the evaluation.

Dataset Number Number Task

of Classes of Instances

(Mazumder et al., ) 4 959 Failure risk analysis of pipeline

networks

(Benkedjouh et al., 2015) 2 61000 RUL prediction

(Saravanamurugan et al., 2017) 3 2000 Chatter prediction

(Costa and Nascimento, 2016) 2 60000 APS system failure prediction

(Baldi et al., 2014) 2 98050 high-energy physics data analyses

(Tian et al., 2015) 7 1941 Faults detection

Table 3: The comparative performance analysis of the proposed framework.

Dataset Recommended Original paper ML pipeline with

conﬁguration result result default conﬁguration

(Mazumder et al., ) 93.74 85 80.24

(Benkedjouh et al., 2015) 99.41 98.95 93.88

(Saravanamurugan et al., 2017) 97.06 95 86.12

(Costa and Nascimento, 2016) 99.10 92.56 92.34

(Baldi et al., 2014) 85.59 88 69.45

(Tian et al., 2015) 99.54 80.74 76.23

Figure 2: Comparative results of the effectiveness of AutoML over default classic ML conﬁgurations and domain ex-

pert (industrial researchers) conﬁgurations.

siﬁcation problems for the sake of clarity of the ex-

perimental evaluation. The results obtained from

the proposed framework evidently exhibits improved

performance of the automated selection of ML al-

gorithms and optimization of hyper-parameters in-

stead of the default values in manufacturing applica-

tions. The comparative analysis reveals in the ma-

jority of the cases that the recommended conﬁgura-

tions yield better performances to enhance the utility

of ML methods for the domain experts, notably the

non-ML experts. It, thence, can be observed that Au-

toML paradigms and tools may help manufacturing

practitioners—both neophyte and experts in the ﬁeld

of data analysis.

Moreover, we believe that such tools should not

lack transparency while using the AI models; the per-

formance of which, most of the times comes from

the black box algorithms. It is essential to incorpo-

rate features that enable the interpretability and ex-

plainability of the produced results to a certain extent.

We, hence, endeavor in the future works, the better

understanding of the recommended conﬁguration to

gain more conﬁdence on the rational of the obtained

results. It shall assure the adoption of the proposed

solution for the real-time systems even in critical sit-

uations.

Towards the Automation of Industrial Data Science: A Meta-learning based Approach

715

ACKNOWLEDGEMENTS

This work has been supported, in part, by Hestim,

CNRST Morocco, and the Universit

e du Littoral C

ote

d’Opale, Calais France.

REFERENCES

Baldi, P., Sadowski, P., and Whiteson, D. (2014). Searching

for exotic particles in high-energy physics with deep

learning. Nature communications, 5(1):1–9.

Benkedjouh, T., Medjaher, K., Zerhouni, N., and Rechak, S.

(2015). Health assessment and life prediction of cut-

ting tools based on support vector regression. Journal

of Intelligent Manufacturing, 26(2):213–223.

Bilalli, B., Abell

o, A., Aluja-Banet, T., Munir, R. F.,

and Wrembel, R. (2018). Presistant: data pre-

processing assistant. In International Conference on

Advanced Information Systems Engineering, pages

57–65. Springer.

Bilalli, B., Abell

o, A., Aluja-Banet, T., and Wrembel, R.

(2016). Automated data pre-processing via meta-

learning. In International Conference on Model and

Data Engineering, pages 194–208. Springer.

Bohanec, M., Bor

stnar, M. K., and Robnik-

Sikonja, M.

(2017). Explaining machine learning models in

sales predictions. Expert Systems with Applications,

71:416–428.

Brazdil, P., Carrier, C. G., Soares, C., and Vilalta, R. (2008).

Metalearning: Applications to data mining. Springer

Science & Business Media.

Cohen-Shapira, N., Rokach, L., Shapira, B., Katz, G., and

Vainshtein, R. (2019). Autogrd: Model recommenda-

tion through graphical dataset representation. In Pro-

ceedings of the 28th ACM International Conference

on Information and Knowledge Management, pages

821–830.

Costa, C. F. and Nascimento, M. A. (2016). Ida 2016 indus-

trial challenge: Using machine learning for predicting

failures. In International Symposium on Intelligent

Data Analysis, pages 381–386. Springer.

De, T., Giri, P., Mevawala, A., Nemani, R., and Deo, A.

(2020). Explainable ai: A hybrid approach to gener-

ate human-interpretable explanation for deep learning

prediction. Procedia Computer Science, 168:40–48.

Dyrmishi, S., Elshawi, R., and Sakr, S. (2019). A deci-

sion support framework for automl systems: A meta-

learning approach. In 2019 International Conference

on Data Mining Workshops (ICDMW), pages 97–106.

IEEE.

Feurer, M., Klein, A., Eggensperger, K., Springenberg,

J. T., Blum, M., and Hutter, F. (2019). Auto-

sklearn: efﬁcient and robust automated machine learn-

ing. In Automated Machine Learning, pages 113–134.

Springer, Cham.

Laadan, D., Vainshtein, R., Curiel, Y., Katz, G., and

Rokach, L. (2019). Rankml: a meta learning-based

approach for pre-ranking machine learning pipelines.

arXiv preprint arXiv:1911.00108.

Lemke, C., Budka, M., and Gabrys, B. (2015). Metalearn-

ing: a survey of trends and technologies. Artiﬁcial

intelligence review, 44(1):117–130.

Mazumder, R. K., Salman, A. M., and Li, Y. Failure risk

analysis of pipelines using data-driven machine learn-

ing algorithms. Structural Safety, 89:102047.

Nargesian, F., Samulowitz, H., Khurana, U., Khalil, E. B.,

and Turaga, D. S. (2017). Learning feature engineer-

ing for classiﬁcation. In IJCAI, pages 2529–2535.

Samek, W. and M

uller, K.-R. (2019). Towards explainable

artiﬁcial intelligence. In Explainable AI: interpreting,

explaining and visualizing deep learning, pages 5–22.

Springer.

Saravanamurugan, S., Thiyagu, S., Sakthivel, N., and Nair,

B. B. (2017). Chatter prediction in boring process us-

ing machine learning technique. International Journal

of Manufacturing Research, 12(4):405–422.

Shin, D. The effects of explainability and causability

on perception, trust, and acceptance: Implications

for explainable ai. International Journal of Human-

Computer Studies, 146:102551.

Tao, F., Qi, Q., Liu, A., and Kusiak, A. (2018). Data-driven

smart manufacturing. Journal of Manufacturing Sys-

tems, 48:157–169.

Thoben, K.-D., Wiesner, S., and Wuest, T. (2017). “indus-

trie 4.0” and smart manufacturing-a review of research

issues and application examples. International jour-

nal of automation technology, 11(1):4–16.

Tian, Y., Fu, M., and Wu, F. (2015). Steel plates fault diag-

nosis on the basis of support vector machines. Neuro-

computing, 151:296–303.

Vilalta, R., Giraud-Carrier, C. G., Brazdil, P., and Soares, C.

(2004). Using meta-learning to support data mining.

IJCSA, 1(1):31–45.

Wang, J., Ma, Y., Zhang, L., Gao, R. X., and Wu, D. (2018).

Deep learning for smart manufacturing: Methods

and applications. Journal of Manufacturing Systems,

48:144–156.

Wolf, H., Lorenz, R., Kraus, M., Feuerriegel, S., and Net-

land, T. H. (2019). Bringing advanced analytics to

manufacturing: A systematic mapping. In IFIP Inter-

national Conference on Advances in Production Man-

agement Systems, pages 333–340. Springer.

Wuest, T., Weimer, D., Irgens, C., and Thoben, K.-D.

(2016). Machine learning in manufacturing: advan-

tages, challenges, and applications. Production and

Manufacturing Research, 4(1):23–45.

Zacarias, A. G. V., Reimann, P., and Mitschang, B. (2018).

A framework to guide the selection and conﬁguration

of machine-learning-based data analytics solutions in

manufacturing. Procedia CIRP, 72:153–158.

ICEIS 2021 - 23rd International Conference on Enterprise Information Systems

716