Software Requirements Prioritisation Using Machine Learning
Arooj Fatima (https://orcid.org/0000-0001-6129-9032), Anthony Fernandes, David Egan (https://orcid.org/0000-0003-3456-2522) and Cristina Luca (https://orcid.org/0000-0002-4706-324X)
School of Computing and Information Science, Anglia Ruskin University, Cambridge, U.K.
Keywords: Software Requirement Prioritisation, Machine Learning, Classification, Requirements Analysis.
Abstract: Prioritisation of requirements for a software release can be a difficult and time-consuming task, especially when the number of requested features far outweighs the capacity of the software development team and difficult decisions have to be made. The task becomes more difficult when there are multiple software product lines supported by a software release, and yet more challenging when there are multiple business lines orthogonal to the product lines, creating a complex set of stakeholders for the release, including product line managers and business line managers. This research focuses on software release planning and aims to use Machine Learning models to understand the dynamics of the various parameters which affect whether software requirements are included in a software release plan. Five Machine Learning models were implemented and their performance evaluated in terms of accuracy, F1 score and K-Fold Cross Validation (Mean).
1 INTRODUCTION
Software can be found in very diverse applications
(Sommerville, 2016) and embedded software now
makes up a large proportion of that software (Pohl
et al., 2005, p14). With the recent development of
applications for the Internet of Things (IoT) (Ashton,
2009), software is used to unlock the ability of ba-
sic hardware to service multiple applications. There
are different sets of challenges associated with soft-
ware depending on the business and technical envi-
ronment. This paper explores one of these challenges:
the prioritisation of software requirements for soft-
ware releases, with particular focus on a complex use
case of a company that produces wireless microchips
for the IoT. When conflicting demands for additional
features arise and there are insufficient resources for
development, prioritisation of software requirements
becomes very important.
This study reviews a prioritisation strategy that takes into account the requirements of a particular class of software system and setting: software product lines (SPL) (Devroey et al., 2017) together with multiple business lines (MBL) (Pronk, 2002), and applies machine learning to that strategy. The study examines requirements data related to a software release cycle in an IoT
semiconductor company. This data was chosen as a good example of the scale and complexity of SPL/MBL.
We seek to determine how the various inputs to the
requirements prioritisation (RP) and planning process
impact the results of the process: a set of requirements
chosen to be implemented in the release.
The rest of this paper is organized as follows: Sec-
tion 2 investigates the literature and state-of-the-art
studies on the topic; Section 3 describes the proposed
method; Section 4 outlines the results; Section 5 pro-
vides discussion about the results and experiments; fi-
nally the conclusions are drawn in Section 6.
2 LITERATURE REVIEW
Prioritisation of software requirements becomes nec-
essary when there are competing requests for new
functionality with limited development resources
(Wiegers and Beatty, 2013). In this paper we anal-
yse requirements data from a specific type of software
system and context: software product lines (Metzger
and Pohl, 2014) with multiple business lines (Pronk,
2002)(SPL/MBL). SPL engineering enables a fam-
ily of products to be developed by re-using shared
assets (Metzger and Pohl, 2014), (Devroey et al.,
2017), (Montalvillo and Diaz, 2016), which in the
case of IoT may include common utilities, libraries
and pieces of source code that are re-used in multiple
software products, ensuring an efficient and effective
use of engineering time. SPL engineering is primar-
ily an engineering solution to enable tailored software
variants and to manage software product variability,
customisation and complexity (Grüner et al., 2020),
(Abbas et al., 2020). From an engineering point of
view, product line requirements are handled in the
domain engineering process, while business line re-
quirements are managed in the application engineer-
ing process.
The problem of prioritisation must be looked at
from the point of view of business owners and product
managers. Thus the focus is on making business de-
cisions rather than optimising operational efficiency,
to establish business priorities in a complex software
product environment. When planning a software re-
lease to address SPL/MBL, the challenges come from
the absolute number of requirements to be prioritised
as well as the complexity of the software release in
terms of number of product lines and number of stake-
holders. When multiple product lines are included in
a single software release, one inevitable challenge is
scale: the number of requirements increases as more
products are included in the release. A second chal-
lenge is complexity, as product lines become depen-
dent on shared assets and therefore shared require-
ments for those assets. There is also potential for de-
pendencies between requirements for different prod-
uct lines, which adds further to complexity. Addi-
tional challenges arise when multiple business lines
are involved in the process (Pronk, 2002). Building a
robust product line platform while also creating cus-
tomer or target market specific applications (Metzger
and Pohl, 2014) means satisfying a matrix of stake-
holders with inconsistent or even opposing views on
priority based on their specific product line or mar-
ket segment interest. These three challenges of scale,
complexity and inconsistency of stakeholders must be
considered by any prioritisation method that is to be
used with SPL/MBL.
Simple prioritisation methods work best when
there are small numbers of requirements to prioritise.
For instance, a simple pair-wise comparison (Sadiq
et al., 2021) which requires that each requirement is
assessed against all other requirements takes about
12 hours to execute with just 40 requirements (Carl-
shamre et al., 2001). More advanced prioritisation
and decision-making methods employ simple priori-
tisation methods as a foundation, for example the An-
alytic Hierarchy Process (Saaty, 1977) uses pair-wise
comparison.
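As a concrete illustration of this cost, exhaustive pair-wise comparison of n requirements needs n(n-1)/2 judgements: the 40 requirements in the survey above already require 40 x 39 / 2 = 780 comparisons, and 200 requirements would require 19,900.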
The literature on requirements prioritisation covers the role software release planning plays in software development processes, proposals for various RP strategies, and an expanding body of empirical research comparing their benefits and disadvantages. (Perini
et al., 2013) differentiate the RP techniques into ba-
sic ranking techniques, which typically permit priori-
tisation along a single evaluation criterion, and RP
methods, which incorporate ranking techniques inside
a requirement engineering process. Relevant project
stakeholders, such as customers, users, and system ar-
chitects conduct rank elicitation, which can be done
in a variety of ways. A fundamental strategy is ranking each requirement in a group of candidates in accordance with a predetermined criterion (e.g., development cost, value for the customer). A requirement's
rank can be stated either as an absolute measure of the assessment criterion for that requirement, as in Cumulative Voting (Avesani et al., 2015), or as a relative position with regard to the other requirements in the collection, as in bubble sort or binary
search methods. A prioritisation technique's usefulness depends on the kind of rank elicitation. For example, pair-wise evaluation reduces cognitive effort when there are just a few dozen requirements to be assessed, but with a high number of requirements it becomes expensive (or even impracticable) due to the quadratic growth in the number of pairings that must be elicited. The rankings produced by the various methods place requirements either on an ordinal scale (Bubble Sort, Binary Search), on a ratio scale (Analytic Hierarchy Process (AHP), 100 Points), or on an ordinal scale of groups or classes, as in Numerical Assignment (Perini et al., 2013).
The scalability of these strategies is directly linked to the proportional increase in human effort. The computational complexity also depends on the number of requirements (n) to be prioritised, ranging from a linear function of n for Numerical Assignment or Cumulative Voting to a quadratic function for AHP. In
order to handle numerous priority criteria, more orga-
nized software requirements prioritisation approaches
employ ranking mechanisms (Perini et al., 2013).
The systematic review in (Svahnberg et al., 2010) investigated 28 papers that dealt with strategic RP models. Of these, 24 propose models of strategic release planning, whereas the remaining investigations are concerned with validating some of the proposed models. The EVOLVE family of release planning models makes up sixteen of these. Most techniques place a heavy emphasis on strict limitations and a small number of requirement selection variables, although in around 58% of the models soft variables have also been included. The study lacks validation on large-scale industrial projects.
Machine Learning (ML) based data analysis, esti-
mation and prediction techniques have grown in pop-
ularity in recent years as a result of improvements in
algorithms, computer power and availability of data.
Traditional methods of requirements prioritisation are cumbersome since there can be too many patterns to understand and encode by hand. Machine Learn-
ing has been used in many areas to analyse large
datasets and identify patterns. Once it is trained to
identify patterns in the data, it can construct an esti-
mation or a classification model. The trained model
can detect, predict or recognise similar patterns or
probabilities.
Duan et al. (Duan et al., 2009) propose partial automation of software requirements prioritisation using data mining and machine learning techniques. They clustered feature sets using unsupervised learning and prioritised requirements mainly based on business goals and stakeholders' concerns.
Perini et al. (Perini et al., 2013) compared the Case-Based Ranking (CBRank) requirements prioritisation method (combined with machine learning techniques) with the Analytic Hierarchy Process (AHP) and concluded that their approach provided better results than AHP in terms of accuracy.
Tonella et al. (Tonella et al., 2013) proposed an Interactive Genetic Algorithm (IGA) for requirements prioritisation and compared it with the Incomplete Analytic Hierarchy Process (IAHP). They used IAHP to avoid the scalability issues of AHP and concluded that IGA outperforms IAHP in terms of effectiveness, efficiency, and robustness to user errors.
A number of other researchers have also explored clustering techniques combined with existing prioritisation methods such as case-based ranking (Avesani et al., 2015), (Qayyum and Qureshi, 2018), (Ali et al., 2021).
Most of the machine learning based techniques reviewed in this study build on existing prioritisation techniques and partially automate the process using different clustering methods. A technique that fully automates the requirements prioritisation process for large-scale systems with sufficient accuracy is still lacking.
3 PROPOSED APPROACH
We have followed a simple methodology introduced by (Kuhn and Johnson, 2013) in their work on predictive modelling. The methodology is a standard process for most machine learning projects. It includes data analysis; pre-processing of the data, including feature selection; model selection, including the train/test split, fitting various models and tuning parameters; and evaluation to find the model which generalises better than the others.
The performance of the algorithms has been evaluated using accuracy (the percentage of correctly classified data), speed (the amount of time needed for computation) and comprehensibility (how difficult an algorithm is to understand).
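As a minimal sketch of this workflow, assuming the requirements data is available as a CSV file and using scikit-learn (the paper does not name its tooling), the steps map to code roughly as follows; the file name, column name, model choice and settings here are illustrative assumptions:

```python
# Minimal sketch of the modelling workflow, under the assumptions above.
import time
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

df = pd.read_csv("requirements.csv")          # hypothetical input file
X = df.drop(columns=["Release Commitment"])   # features
y = df["Release Commitment"]                  # target: in release or not

# 80/20 train/test split, as used in Section 4.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

start = time.time()
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

print("accuracy:", model.score(X_test, y_test))       # accuracy criterion
print("fit time (s):", time.time() - start)           # speed criterion
print("CV mean:", cross_val_score(model, X, y, cv=5).mean())
```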
3.1 Dataset
This project uses real data produced by a company
in the semiconductor business producing IoT wireless
microchips. The data relates to the software require-
ments for bi-annual software release cycles for cal-
endar year 2020 (20Q2 and 20Q4). The data has 283
samples, each representing a software requirement re-
quested to be included in the software release. Each
sample has various feature values, some of which
were inputs to the original software release planning
cycle, some were outputs of that cycle and others were
calculated or derived during the release planning pro-
cess. During the original release planning cycle, these
values were considered and discussed with stakehold-
ers before the actual software release was finalised.
A key element of the original planning process
was the use of themes to abstract and collate require-
ments into cohesive business initiatives. This served
two purposes: a) reduce the number of items to be
discussed by business stakeholders; and b) provide
business stakeholders with something that they could
comprehend.
Out of three available subsets of requirements, the most recent and most focused subset was selected in an attempt to obtain the best results.
3.2 Exploratory Data Analysis
In the exploratory analysis, detailed information
about the main characteristics of the dataset is pro-
vided. The dataset has 40+ features that were care-
fully analysed. Table 1 presents a description of the
key features.
Various statistical analyses were carried out to evaluate feature quality and predictability in relation to the target value. These provided us with a more thorough understanding of the data.
The raw dataset had some inconsistencies, such as redundant features, zero values and missing values. Most of the features have multiple values for each sample, which requires further processing. With respect to both zeros and missing values, the data is inevitably incomplete for a number of reasons: the process does not insist on complete data before the planning cycle starts, and secondary versions of a field may not be used for many requirements.
Table 1: Exploratory Data Analysis.

Feature | Description
Issue Key | Unique identifier for each requirement in the Jira database.
Release Commitment | Output of the prioritisation process; it has three categories: Q2 (requirement was included), Complete (included and completed) and any other value indicating not included.
Estimate (wks) | The total estimated time in weeks to complete the task. This feature was added to the data after the original prioritisation process.
(New) [MoSCoW] | Stakeholder assessment of the dependency of the theme on this requirement: Must, Should, Could or Won't.
(New) MoSCoW multiplier | Multiplier associated with the MoSCoW value.
Theme Category Divisor | Themes are categorised to indicate the type of strategic or tactical initiative. The highest ranked categories have a divisor of 1, whereas the lower ranked categories have higher divisors.
AOP/LTR Theme Rank | A ranking for themes based on the lifetime revenue (LTR) linked to that theme.
Cost | Cost of the requirement.
3.3 Data Pre-Processing
A number of steps were taken to transform the sample features and make the data machine-processable.
Data Transformation: The numerical features were extracted from the main dataset, special characters were removed from numerical data, and categorical values (such as Release Commitment) were mapped to numerical values.
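A sketch of this transformation step, assuming pandas, is shown below; the column names follow Table 1, while the regular expression and the mapping rule are illustrative assumptions rather than the exact rules used:

```python
# Illustrative transformation step, assuming a pandas DataFrame `df`.
import pandas as pd

df = pd.read_csv("requirements.csv")  # hypothetical input file

# Remove special characters from a numeric column stored as text.
df["Cost"] = pd.to_numeric(
    df["Cost"].astype(str).str.replace(r"[^0-9.]", "", regex=True),
    errors="coerce")

# Map the categorical target to numeric values, following Table 1:
# Q2 or Complete mean the requirement made it into the release.
df["Release Commitment"] = df["Release Commitment"].map(
    lambda v: 1 if v in ("Q2", "Complete") else 0)
```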
Missing Values: After the initial transformation of the data, the next step was to handle missing and null values. All rows with missing or null data were reviewed carefully. Rows were removed where it was not appropriate to fill in the missing values through feature engineering. Other missing values (where the data formed a numerical spread suitable for feature engineering) were filled in with the mean value of the given feature.
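The two strategies described above might look as follows in pandas; which rows are dropped versus mean-imputed is a per-feature judgement, so the column names here are purely illustrative:

```python
# Sketch of the missing-value handling, continuing from the DataFrame
# `df` of the previous sketch; column names are illustrative.

# Drop rows where imputation would not be appropriate.
df = df.dropna(subset=["Release Commitment"])

# Mean-impute numeric features whose values form a reasonable spread.
for col in ["Cost", "Estimate (wks)"]:
    df[col] = df[col].fillna(df[col].mean())
```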
Calculating Feature Importance: We used a Decision Tree classifier to learn the feature importance in our dataset. To calculate a feature's importance, the Decision Tree model computes the impurity of the corresponding node and subtracts the impurity of its child nodes; the mean decrease in impurity attributed to a feature gives a score of how important that feature is (Scornet, 2020). Table 2 presents the importance ranking for the features produced by the model.
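A sketch of this computation with scikit-learn's DecisionTreeClassifier, whose feature_importances_ attribute exposes exactly this mean-decrease-in-impurity score, is given below; the random seed is an assumption:

```python
# Sketch of deriving Table 2 style importance scores; `df` continues
# from the pre-processing sketches above.
from sklearn.tree import DecisionTreeClassifier

X = df.drop(columns=["Release Commitment"])
y = df["Release Commitment"]

tree = DecisionTreeClassifier(random_state=42).fit(X, y)
ranked = sorted(zip(X.columns, tree.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:.6f}")
```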
Table 2: Feature Importance Score by Decision Tree.

Feature | Value
Theme Category Divisor | 0.483655
AOP LTR$ Theme Rank | 0.191668
Cost | 0.121553
Theme Value | 0.054635
(New) MoSCoW Multiplier | 0.046509
Reqs per Theme | 0.034841
Estimate (wks) | 0.034282
Dependent on | 0.021122
3. Category Theme Rank | 0.011735
(New) MoSCoW 2 Multiplier | 0.000000
Based on the feature importance results, the dataset was tuned. We tested our models on the full dataset as well as on the tuned dataset.
3.4 Visual Analysis
Various statistical and visual analysis methods were
used to learn patterns in data and understand the rela-
tion of features to other features and the target value.
The target variable Release Commitment has categorical values, which were converted to numeric data to form two classes: 1 (requirement included in the release) and 0 (requirement not included).
An analysis of the class distribution (see Figure 1) showed that the dataset has a moderate degree of imbalance. Since the degree of imbalance was not too high and our aim was to learn patterns for both classes, we chose to train our models on the true distribution.
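A quick check of the class balance, assuming the binary target constructed above, could be done as follows:

```python
# Sketch of the class-distribution check behind Figure 1; `df` continues
# from the pre-processing sketches above.
counts = df["Release Commitment"].value_counts()
print(counts)              # absolute counts per class (1 vs 0)
print(counts / len(df))    # relative frequencies, to gauge imbalance
```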
Figure 1: Class Distribution.
The correlation matrix was built to identify how features are correlated with each other. It can be seen from Figure 2 that Cost and Estimate are highly correlated; Theme Category Divisor is heavily linked with Category Theme Rank; and Applicable to, (New) MoSCoW 2 multiplier and Theme Value2 are heavily correlated with Applicable to2. Theme Value2 is also heavily correlated with Applicable to and (New) MoSCoW 2 multiplier. Theme Value appears to be inversely correlated with AOP/LTR$ Theme Rank, LTR$ Theme Rank and Category Theme Rank. Based on these observations, the features Issue Key, Release Commitment, First Requested Version, (New) MoSCoW 2 Multiplier, Dependent on2, Applicable to2, Tactical Value, Applicable to and Category Theme Rank were dropped when running experiments on the tuned dataset.
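The correlation analysis and the subsequent pruning might be sketched as follows; seaborn is an assumption for the heatmap, and the dropped columns follow the list above (Release Commitment is separated out as the target rather than dropped outright):

```python
# Sketch of the correlation heatmap (Figure 2) and feature pruning.
import matplotlib.pyplot as plt
import seaborn as sns

corr = df.corr(numeric_only=True)   # pairwise feature correlations
sns.heatmap(corr, cmap="coolwarm", center=0)
plt.show()

to_drop = ["Issue Key", "First Requested Version",
           "(New) MoSCoW 2 Multiplier", "Dependent on2",
           "Applicable to2", "Tactical Value", "Applicable to",
           "Category Theme Rank"]
df_tuned = df.drop(columns=to_drop, errors="ignore")
```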
4 EXPERIMENTS AND RESULTS
The goal of this research was to experiment with the application of Machine Learning models to the problem of software requirements prioritisation, to understand the dynamics of the various parameters included in a software release plan and to evaluate the results obtained. The models considered for the experiment were put through rigorous testing using the baseline dataset produced by the pre-processing techniques.
The dataset was split into 80% training and 20% testing data. Experiments were done in a series of iterations, aiming to tune the dataset and improve the results.
Five different ML models have been used for this
research - Decision Tree Classifier, K-Nearest Neigh-
bours (KNN), Random Forest, Logistic Regression and
Support Vector Machine. Five metrics have been used
to evaluate the ML models implemented: accuracy,
F1 score, Precision, Recall and K-Fold Cross Vali-
dation (Mean). For an overall comparison of the re-
sults, we only considered accuracy, F1 score and k-
fold cross validation mean. All the models have been
trained on the full as well as the tuned datasets.
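A sketch of this comparison loop is shown below; the model hyper-parameters, the weighted metric averaging and cv=5 are assumptions, since the paper does not report them:

```python
# Sketch of training and scoring the five models on one dataset variant;
# X and y continue from the pre-processing sketches above.
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "KNN": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
}

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

for name, clf in models.items():
    pred = clf.fit(X_train, y_train).predict(X_test)
    print(name,
          f"acc={accuracy_score(y_test, pred):.2f}",
          f"f1={f1_score(y_test, pred, average='weighted'):.2f}",
          f"prec={precision_score(y_test, pred, average='weighted'):.2f}",
          f"rec={recall_score(y_test, pred, average='weighted'):.2f}",
          f"cv={cross_val_score(clf, X, y, cv=5).mean():.2f}")
```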
In this section we present the results for each im-
plemented model.
4.1 Decision Tree Classifier
Table 3 presents the results of the Decision Tree model on the full and tuned datasets. The accuracy and F1 score dropped after tuning the dataset; however, the K-fold cross validation score improved on the tuned dataset.
Table 3: Decision Tree - full and tuned datasets.

Performance Metric | Full dataset | Tuned dataset
Accuracy | 0.96 | 0.94
F1 score | 0.96 | 0.94
Precision | 0.97 | 0.95
Recall | 0.96 | 0.94
K-fold cross validation mean | 0.89 | 0.92
Cross validation is an important metric since it can flag problems such as selection bias and over-fitting. Despite the drop in accuracy, tuning the dataset had a visible impact on the cross validation score.
4.2 K-Nearest Neighbours (KNN)
Table 4 presents the results of KNN for the full and tuned datasets. The accuracy, precision, recall and F1 score dropped after tuning the dataset; however, the k-fold cross validation (mean) increased.
Table 4: k Nearest Neighbours - full and tuned datasets.

Performance Metric | Full dataset | Tuned dataset
Accuracy | 0.94 | 0.92
F1 score | 0.94 | 0.92
Precision | 0.95 | 0.92
Recall | 0.94 | 0.92
K-fold cross validation mean | 0.80 | 0.82
4.3 Random Forest
Table 5 presents the results of Random Forest on the full and tuned datasets. The accuracy and F1 score were unchanged after tuning the dataset; however, the k-fold cross validation (mean) increased. The precision and recall scores also remained the same, indicating that the removal of features has limited impact on the scores of Random Forest.
The Random Forest model generalised very well to the data. We carried out some further experiments with this model, which are detailed in Section 5.
Figure 2: Heatmap of correlation matrix.
Table 5: Random Forest - full and tuned datasets.

Performance Metric | Full dataset | Tuned dataset
Accuracy | 0.94 | 0.94
F1 score | 0.94 | 0.94
Precision | 0.95 | 0.95
Recall | 0.94 | 0.94
K-fold cross validation mean | 0.89 | 0.90
4.4 Logistic Regression
Table 6 presents the results of Logistic Regression for the full and tuned datasets. The accuracy, precision, recall and F1 scores improved after tuning the dataset; however, the k-fold cross validation (mean) remained the same.
Table 6: Logistic Regression - full and tuned datasets.

Performance Metric | Full dataset | Tuned dataset
Accuracy | 0.86 | 0.88
F1 score | 0.86 | 0.87
Precision | 0.87 | 0.90
Recall | 0.87 | 0.88
K-fold cross validation mean | 0.76 | 0.76
4.5 Support Vector Machine
Table 7 presents the results of Support Vector Ma-
chine (SVM) for full and tuned datasets. It can be
seen that there was improvement in accuracy, F1
score, precision, recall and k-fold cross validation
mean after tuning the dataset.
Table 7: SVM - full and tuned datasets.

Performance Metric | Full dataset | Tuned dataset
Accuracy | 0.87 | 0.88
F1 score | 0.86 | 0.88
Precision | 0.86 | 0.88
Recall | 0.87 | 0.88
K-fold cross validation mean | 0.87 | 0.89
5 DISCUSSIONS
Accuracy, F1 score and K-Fold Cross Validation (Mean) have been used to evaluate the ML models implemented, with the results shown in Table 9. All models performed well, with an accuracy score above 80% for both the tuned and full datasets. The F1 score, which is a better indicator of a model's performance, shows that Logistic Regression and the Support Vector Machine performed slightly worse than the other models. K-Nearest Neighbours, the Decision Tree classifier and Random Forest have consistently high results across all three evaluation metrics. The F1 score for the Decision Tree classifier is the highest; however, this model is prone to overfitting, as is evident from the decrease in its F1 score on the tuned dataset.
As Random Forest produced promising results, we carried out further experiments with hyper-parameter tuning. After implementing and testing different imputers, such as the simple and iterative imputers, we concluded that the simple imputer with the mean and median strategies provided the best results. After tuning the hyper-parameters for Random Forest, the results were substantially higher (see Table 8). The only drawback is the execution time (203.0259862 seconds) for Random Forest while its hyper-parameters are tuned.
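A sketch of such a search, combining the imputer choice and the Random Forest hyper-parameters in a single scikit-learn pipeline, is given below; the grid values are illustrative assumptions rather than the grid actually searched:

```python
# Sketch of hyper-parameter tuning with imputation inside the pipeline.
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

pipe = Pipeline([
    ("impute", SimpleImputer()),
    ("rf", RandomForestClassifier(random_state=42)),
])

param_grid = {
    "impute__strategy": ["mean", "median"],   # the strategies compared
    "rf__n_estimators": [100, 300, 500],      # illustrative grid values
    "rf__max_depth": [None, 5, 10],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="f1_weighted")
search.fit(X, y)   # X, y from the pre-processing sketches above
print(search.best_params_, search.best_score_)
```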
Table 8: Random Forest - results after tuning hyper-parameters.

Performance Metric | Score
Accuracy | 1.0
F1 score | 1.0
Precision | 1.0
Recall | 1.0
K-fold cross validation mean | 0.91
To meet the project goal of understanding the im-
pact of certain parameters on the inclusion of a soft-
ware requirement in a release, there are several no-
table outcomes:
Overall the level of accuracy in predicting require-
ments priority by using various machine learning
models is positive and indicates that there may be
value in extending this research to develop this
concept further;
Cost and Estimate (wks) were identified as parameters that were essential for the data modelling, ranking 3rd and 7th respectively in terms of importance (Table 2). However, the original software requirement prioritisation process was completed before either the estimate or the cost was derived/calculated; they were added later in the cycle. This could indicate that even when estimate or cost information was not available, stakeholders had an intuitive understanding of the size of a requirement when providing their inputs to the prioritisation process;
Theme Category Divisor was found to be the
most important parameter. This parameter is an
indicator of the type of theme that a requirement is
associated with, identifying the strategic/tactical
nature of the theme. This could indicate that: a)
the use of themes had a large impact on prioriti-
sation; and b) strategic themes and requirements
were more likely to get included in the release.
6 CONCLUSION
The literature review on the prioritisation of requirements in software releases for Software Product Lines with Multiple Business Lines (SPL/MBL) highlighted many strategies and methods; however, the existing strategies do not fit the present use case well. Investigation into Machine Learning models led to the implementation of five of them and a successful comparison of their performance. The dataset was tuned and features were carefully selected. All selected models were trained and tested to obtain predictions. Most models achieved over 80% accuracy, and further investigation and testing yielded better results. The best results were achieved with the Decision Tree classifier, Random Forest and K-Nearest Neighbours. The Decision Tree classifier is known to be prone to overfitting, whereas Random Forest can overcome the overfitting problem; hence, hyper-parameter tuning was performed for Random Forest, which gave 100% on many performance metrics and 91% on k-fold cross validation. However, the computational effort was considerably higher after hyper-parameter tuning. In future, hyper-parameter tuning may be performed for the other models to explore and evaluate the results and derive further conclusions.
Table 9: Results for the Full and Tuned Datasets (full / tuned).

Model | Accuracy | F1 Score | K-Fold Cross Validation (Mean) | Execution time (s)
Decision Tree | 0.96 / 0.94 | 0.96 / 0.94 | 0.89 / 0.90 | 0.0860982 / 0.085749
Random Forest | 0.94 / 0.94 | 0.94 / 0.86 | 0.89 / 0.92 | 2.0259862 / 2.2580502
Logistic Regression | 0.86 / 0.88 | 0.86 / 0.87 | 0.76 / 0.76 | 2.6350644 / 3.1730906
K-Nearest Neighbours | 0.94 / 0.92 | 0.94 / 0.92 | 0.80 / 0.82 | 4.106245 / 3.1730906
SVM | 0.87 / 0.88 | 0.86 / 0.88 | 0.87 / 0.89 | 3.8039046 / 3.282107
REFERENCES
Abbas, M., Jongeling, R., Lindskog, C., Enoiu, E. P., Saa-
datmand, M., and Sundmark, D. (2020). Product line
adoption in industry: An experience report from the
railway domain. In Proceedings of the 24th ACM Con-
ference on Systems and Software Product Line. Asso-
ciation for Computing Machinery.
Ali, S., Hafeez, Y., Hussain, S., Yang, S., and Jamal, M.
(2021). Requirement prioritization framework using
case-based reasoning: A mining-based approach. Ex-
pert Systems, 38(8):e12770.
Ashton, K. (2009). That 'internet of things' thing. RFID Journal, 22(7):97-114.
Avesani, P., Perini, A., Siena, A., and Susi, A. (2015). Goals
at risk? machine learning at support of early assess-
ment. In 2015 IEEE 23rd International Requirements
Engineering Conference (RE), pages 252–255.
Carlshamre, P., Sandahl, K., Lindvall, M., Regnell, B., and
Natt och Dag, J. (2001). An industrial survey of re-
quirements interdependencies in software product re-
lease planning. In Proceedings 5th IEEE International
Symposium on Requirements Engineering, pages 84–
91.
Devroey, X., Perrouin, G., Cordy, M., Samih, H., Legay,
A., Schobbens, P.-Y., and Heymans, P. (2017). Sta-
tistical prioritization for software product line testing:
an experience report. Software & Systems Modeling,
16(1):153–171.
Duan, C., Laurent, P., Cleland-Huang, J., and Kwiatkowski,
C. (2009). Towards automated requirements pri-
oritization and triage. Requirements engineering,
14(2):73–89.
Grüner, S., Burger, A., Kantonen, T., and Rückert, J. (2020).
Incremental migration to software product line engi-
neering. In Proceedings of the 24th ACM Conference
on Systems and Software Product Line, pages 1–11.
Kuhn, M. and Johnson, K. (2013). Applied predictive mod-
eling. Springer, London.
Metzger, A. and Pohl, K. (2014). Software product line en-
gineering and variability management: Achievements
and challenges. In Future of Software Engineering
Proceedings.
Montalvillo, L. and Diaz, O. (2016). Requirement-driven
evolution in software product lines: A systematic
mapping study. Journal of Systems and Software,
122:110 – 143.
Perini, A., Susi, A., and Avesani, P. (2013). A machine
learning approach to software requirements prioriti-
zation. IEEE Transactions on Software Engineering,
39(4):445–461.
Pohl, K., Böckle, G., and Van Der Linden, F. (2005). Soft-
ware product line engineering: foundations, princi-
ples, and techniques, volume 1. Springer.
Pronk, B. J. (2002). Product line introduction in a multi-
business line context. International Workshop on
product Line Engineering: The Early Steps: Plan-
ning, Modelling and Managing.
Qayyum, S. and Qureshi, A. (2018). A survey on ma-
chine learning based requirement prioritization tech-
niques. In Proceedings of the 2018 International Con-
ference on Computational Intelligence and Intelligent
Systems, pages 51–55.
Saaty, T. L. (1977). A scaling method for priorities in hier-
archical structures. Journal of Mathematical Psychol-
ogy, 15(3):234 – 281.
Sadiq, M., Sadim, M., and Parveen, A. (2021). Applying
statistical approach to check the consistency of pair-
wise comparison matrices during software require-
ments prioritization process. International Journal
of System Assurance Engineering and Management,
pages 1–10.
Scornet, E. (2020). Trees, forests, and impurity-based vari-
able importance. arXiv preprint arXiv:2001.04295.
Sommerville, I. (2016). Software engineering. Boston :
Pearson Education Ltd, 10th edition.
Svahnberg, M., Gorschek, T., Feldt, R., Torkar, R., Saleem,
S. B., and Shafique, M. U. (2010). A systematic re-
view on strategic release planning models. Informa-
tion and Software Technology, 52(3):237–248.
Tonella, P., Susi, A., and Palma, F. (2013). Interactive re-
quirements prioritization using a genetic algorithm.
Information and software technology, 55(1):173–187.
Wiegers, K. and Beatty, J. (2013). Software Requirements.
Microsoft Press, Redmond, Washington, 3rd edition.