Towards Interpretable Monitoring and Assignment of Jira Issues

Dimitrios-Nikitas Nastos, Themistoklis Diamantopoulos and Andreas Symeonidis

Electrical and Computer Engineering Dept., Aristotle University of Thessaloniki, Thessaloniki, Greece

Keywords:

Project Management, Task Management, Jira Issues, Topic Modeling.

Abstract:

Lately, online issue tracking systems like Jira are used extensively for monitoring open-source software projects. Using these systems, different contributors can collaborate towards planning features and resolving issues that may arise during the software development process. In this context, several approaches have been proposed to extract knowledge from these systems in order to automate issue assignment. Though effective under certain scenarios, these approaches also have limitations; most of them are based mainly on textual features and they may use techniques that do not extract the underlying semantics and/or the expertise of the different contributors. Furthermore, they typically provide black-box recommendations, thus not helping the developers to interpret the issue assignments. In this work, we present an issue mining system that extracts semantic topics from issues and provides interpretable recommendations for issue assignments. Our system employs a dataset of Jira issues and extracts information not only from the textual features of issues but also from their components and their labels. These features, along with the extracted semantic topics, produce an aggregated model that outputs interpretable recommendations and useful statistics to support issue assignment. The results of our evaluation indicate that our system can be effective, leaving room for future research.

1 INTRODUCTION

Nowadays, open-source projects are developed and

maintained online using code hosting facilities like

GitHub and monitored with issue tracking systems

like Jira. This collaborative paradigm dictates that a

project may have multiple contributors with different

levels of expertise and experience, who must all work

together in coordination to design and develop fea-

tures, resolve issues/bugs, plan and craft releases, and

generally monitor the development of the project.

As a result, contemporary issue tracking systems

function as a hub of useful knowledge for project

monitoring and decision making. The problem, how-

ever, is that, as the project grows, knowledge is harder

to extract and transfer among existing and new con-

tributors. And this may lead to several challenges. For

instance, when a new bug arises, one must have a clear

view of the project in order to be able to assess its im-

pact, determine the relevant components that may be

affected and assign it to the most relevant contributor.

In this context, several approaches have been developed for extracting information from issue tracking systems with the aim of helping developers better

manage the software project under analysis. These

approaches aspire to confront various challenges, in-

cluding e.g. determining the most suitable developer

for resolving a new issue (Murphy and Cubranic,

2004; Anvik et al., 2006; Matsoukas et al., 2020; Alk-

hazi et al., 2020), assessing the severity and/or the pri-

ority of a bug (Sharma et al., 2012; Tian et al., 2015;

Kanwal and Maqbool, 2012; Diamantopoulos et al.,

2021; Lamkanﬁ et al., 2010; Yang et al., 2012), or

even extracting the roles of the different developers

(Li et al., 2016; Onoue et al., 2013; Gousios et al.,

2008; Lima et al., 2015; Papamichail et al., 2019).

Although these approaches are effective in certain

scenarios, they also have important limitations. First

of all, several approaches employ only the textual fea-

tures of issues (i.e. titles and descriptions), thus disre-

garding features like the component hierarchy or the

labels of the project, which may point to the expertise

of the different contributors. Moreover, they do not

always employ semantics-enabled methods, thus they

may miss signiﬁcant correlations between the differ-

ent issues (and areas) of the project. Finally, and most

importantly, they are usually built as black boxes and

provide, at best, a probability. As a result, they do not

help the contributors understand the reasoning behind recommendations so that they can make the optimal decisions for the project.

Nastos, D., Diamantopoulos, T. and Symeonidis, A.
Towards Interpretable Monitoring and Assignment of Jira Issues.
DOI: 10.5220/0012146400003538
In Proceedings of the 18th International Conference on Software Technologies (ICSOFT 2023), pages 696-703
ISBN: 978-989-758-665-1; ISSN: 2184-2833
© 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)

In this work, we present an issue monitoring and

assignment methodology that helps contributors bet-

ter understand the project and provides interpretable

issue assignment recommendations. Our methodol-

ogy uses a dataset of Jira issues and employs informa-

tion both from the textual and from the grouping fea-

tures (components, labels) of issues. Furthermore, we

extract semantic knowledge from issues in the form of

topics using the BERTopic topic modeling technique

(Grootendorst, 2022) and build an aggregated model

that assigns issues to contributors, while also provid-

ing useful statistics to support decision making.

2 RELATED WORK

As already mentioned, contemporary approaches that

employ issue tracking systems confront various chal-

lenges, including automated issue assignment (Mur-

phy and Cubranic, 2004; Anvik et al., 2006; Mat-

soukas et al., 2020; Alkhazi et al., 2020), bug severity

or priority prediction (Sharma et al., 2012; Tian et al.,

2015; Kanwal and Maqbool, 2012; Diamantopoulos

et al., 2021; Lamkanﬁ et al., 2010; Yang et al., 2012),

and even developer role extraction (Li et al., 2016;

Onoue et al., 2013; Gousios et al., 2008; Lima et al.,

2015; Papamichail et al., 2019). Our work lies in the

scope of project monitoring and speciﬁcally in auto-

mated issue assignment, i.e. the problem of ﬁnding

the most suitable developer for the task at hand.

Most approaches in automated issue assignment

(or issue/bug triaging as it is also known) extract

features from data lying in Jira or Bugzilla installa-

tions, use some type of textual model and employ

classiﬁcation algorithms to determine the most suit-

able developer based on any past issues he/she has

resolved. One such approach is proposed by Mur-

phy and Cubranic (2004), who employ a vector space

model to map the bug reports of the Eclipse project

and use Naïve Bayes to perform the classification,

which is thus solely based on textual features. A sim-

ilar approach is proposed by Anvik et al. (2006); the

authors further incorporate the current workload and

the vacation schedule of the different contributors,

while they also use Support Vector Machines (SVM),

managing to improve the accuracy of the assignments.

Though interesting, the aforementioned ap-

proaches rely on vector spaces based only on term

frequencies, without incorporating the underlying se-

mantics of issue text. To improve on this aspect, lately

several researchers employ semantics-enabled meth-

ods, either using word embeddings (Guo et al., 2020;

He and Yang, 2021) or using topic modeling tech-

niques (Ahsan et al., 2009; Yang et al., 2014; Naguib

et al., 2013). For instance, Guo et al. (2020) employ

word embeddings to model issue titles and descrip-

tions and use a Convolutional Neural Network (CNN)

to produce a list of recommended developers along

with the extracted probabilities. He and Yang (2021)

further extend this line of thought by considering both

word2vec and GloVe as representations and employ-

ing attention networks to make the ﬁnal assignment.

Topic modeling techniques typically employ La-

tent Dirichlet Allocation (LDA) on the textual fea-

tures of issues to extract semantic topics that are sub-

sequently used to improve the classiﬁcation. Yang

et al. (2014) ﬁrst use the extracted topics in order to

detect similar bug reports, and then use the textual

features of these reports in order to make the assign-

ment. An interesting extension is proposed by Naguib

et al. (2013), who further consider the issues resolved

and the issues reviewed for every developer (instead

of only the issues assigned to him/her). Ahsan et al.

(2009) also extract semantic topics, however they em-

ploy Latent Semantic Indexing (LSI) and follow a

slightly different approach for classiﬁcation. Speciﬁ-

cally, the authors ﬁrst create a term-to-document ma-

trix using a vector space model and then use the result

of the LSI in order to reduce the dimensionality of the matrix. Upon assessing different classifiers, they conclude that SVM provides the best performance.

Finally, there are also approaches that use code

data. For instance, Alkhazi et al. (2020) process both

issues and commits from the Eclipse project, while

Matsoukas et al. (2020) use commits and issues ex-

tracted from GitHub (Diamantopoulos et al., 2020)

to build models based on issue text, issue comments,

labels, and commit comments. Though interesting,

these approaches deviate from the scope of this work.

Although the aforementioned solutions are effec-

tive for automated issue assignment, they typically

present assignment recommendations without con-

sidering interpretability. Most of them provide as-

signments and probabilities based on textual features,

while certain systems do not capture the underlying

semantics of the project, which can help understand

the issues and even the expertise of contributors. Our

methodology confronts the above limitations, by em-

ploying both textual and non-textual features, so that

the recommendations incorporate the expertise of the

developers in terms of the relevant components and

semantic topics that each developer is familiar with.

Furthermore, the output of our system is a ranked list,

along with useful information for each developer in

order to easily make an informed decision about the

assignment of new issues.


Figure 1: Overview of our Issue Mining Methodology. (Diagram stages: Data Preprocessing; Model Building, comprising a Textual Index on Title & Description, Extracted Topics, a Component Mapping, and a Labels Index; Model Aggregation; output: Aggregated Assignments & Statistics.)

3 METHODOLOGY

The overview of our issue mining methodology is de-

picted in Figure 1. The input of our system is in the

form of Jira issue tracking data, which are the issues

of 656 projects of the Apache Software Foundation

(Diamantopoulos et al., 2023). Each issue provides

different types of information, including textual and

non-textual features. In our case, we extract the ti-

tle and description of the issue as well as the relevant

labels and components that provide meta-data about

it with respect to the project. Moreover, we extract

the name of the assignee that solved the issue. As

this information is necessary for the task of issue as-

signment, we ﬁlter out any issues that do not have

data in those ﬁelds. Upon extraction, we preprocess

the data and build different models: a textual index

for the issue titles and descriptions, a semantic topic

model, and two mappings, one for components and

one for labels. The semantic topic model is extracted

using BERTopic, a topic modeling technique that uses

the BERT deep learning language model so that we

enable a deeper contextual understanding of the rela-

tionships between the different issues. Finally, upon

extracting the information, we export useful diagrams

that can be communicated to the users of the project

(and especially the triager) and build an aggregated

model that combines all extracted knowledge to pro-

duce interpretable issue assignments.

3.1 Data Preprocessing

Our analysis is performed on a per-project basis. To

effectively confront the issue assignment challenge,

the issue data for each project must contain assignee

information. Therefore, any issues lacking this information are excluded from the final dataset. Furthermore, to ensure meaningful results, we filter our dataset to include only developers who have resolved at least 100 issues, along with the corresponding issues. This cutoff ensures that each assignee has a sufficient history of issue resolution and thus the information can be used to extract better analytics.

For the textual features of our dataset, we combined two fields: the title (named summary in Jira) and the description. Before proceeding to our indexing and topic modeling methods, we first preprocess

the textual data in two steps to produce two different

versions of the text features, each suited for different

tasks, the application of BERTopic and the application

of Tf-Idf. The ﬁrst step involves using regular expres-

sions to remove HTML tags, special characters, and

links, which are considered noise in our data. After

this step, the data can be forwarded to the BERTopic

topic modeling technique, which requires the full con-

text of the text features. For the Tf-Idf model, we also

proceed to the second step, which further reﬁnes the

data to produce accurate representations of each text

feature. It includes the conversion of all text to lower-

case, the removal of stopwords, punctuation, and dig-

its, and the lemmatization of the remaining words.
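The two preprocessing steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the regular expressions, the tiny stopword list, and the `lemmatize` placeholder (which only strips a plural "s") are assumptions introduced here, and a real pipeline would rely on a proper stopword list and lemmatizer.

```python
import re

# Hypothetical, deliberately tiny stopword list (assumption for illustration).
STOPWORDS = {"a", "an", "the", "is", "are", "in", "on", "of", "to", "and", "at", "for"}

def clean_for_topics(text: str) -> str:
    """Step 1: strip HTML tags, links and special characters, keeping sentence context."""
    text = re.sub(r"<[^>]+>", " ", text)                  # HTML tags
    text = re.sub(r"https?://\S+", " ", text)             # links
    text = re.sub(r"[^A-Za-z0-9.,;:!?'\s-]", " ", text)   # special characters
    return re.sub(r"\s+", " ", text).strip()

def lemmatize(token: str) -> str:
    """Placeholder lemmatizer (assumption): only strips a trailing plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token

def clean_for_tfidf(text: str) -> str:
    """Step 2: lowercase, drop punctuation, digits and stopwords, then lemmatize."""
    text = clean_for_topics(text).lower()
    text = re.sub(r"[^a-z\s]", " ", text)                 # punctuation and digits
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(lemmatize(t) for t in tokens)

raw = "<p>Queue fills up at 100% load, see http://example.org/x</p>"
topic_text = clean_for_topics(raw)   # version forwarded to topic modeling
tfidf_text = clean_for_tfidf(raw)    # version forwarded to Tf-Idf
```

Keeping the first version close to the original wording preserves the context that an embedding-based topic model benefits from, whereas the second version is reduced to normalized terms suitable for a bag-of-words representation.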

Following preprocessing, we split the issues for

each project into training and test sets to train and

evaluate the models we employ. The training set com-

prises 80% of the project issues, with the remaining

20% reserved for testing purposes.
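As a small illustration of the 80%/20% split, the sketch below shuffles the issues of a project with a fixed seed before splitting. Whether the split is random or chronological is not stated in the text, so the shuffling, the seed, and the function name `split_issues` are assumptions.

```python
import random

def split_issues(issues, train_fraction=0.8, seed=42):
    """Shuffle a project's issues and split them into training and test sets."""
    shuffled = list(issues)
    random.Random(seed).shuffle(shuffled)   # fixed seed for reproducibility (assumption)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical project with 100 issue identifiers.
train, test = split_issues(range(100))
```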

3.2 Extracting Models for Issue Texts

To take into account the lexical overlap between is-

sues and to improve issue assignment, we ﬁrst use tex-

tual features. This involves transforming the text into

vector representations and training a classiﬁer to re-

ceive as input the text of an issue and provide a proba-

bility distribution of each potential assignee being the

most suitable one for resolving that issue. We employ the Tf-Idf vectorizer to vectorize the text. This vectorizer creates a vocabulary of all the words in the issue texts and then calculates the frequency of each word inside a document (Tf) and the inverse document frequency (Idf) of each word in the entire collection of documents. The Idf term is used to minimize the influence of very common words. Each document is represented by a vector of products of term frequencies and inverse document frequencies of each word. Specifically, the value of each term t in a document d belonging to a collection of documents D is given by the equation:

$$\text{Tf-Idf}(t, d, D) = \text{Tf}(t, d) \cdot \text{Idf}(t, D) \tag{1}$$

After implementing the Tf-Idf vectorizer and converting all text strings in the training set of the project to their corresponding vector representation, we train an SVM classifier. This classifier is designed to output a probability distribution that denotes the relevance of each contributor to the given issue.

(a) Issue-Topic Distribution. (b) Topic-Developer Distribution.
Figure 2: Distributions of the top 20 topics extracted.
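To make equation (1) of Section 3.2 concrete, the following sketch computes Tf-Idf vectors with the standard library only. The exact Tf and Idf variants are not specified in the text; raw term counts for Tf and log(N / df) for Idf are assumptions made here (libraries such as scikit-learn use a smoothed Idf by default), and the example documents are hypothetical. The SVM training step is not shown.

```python
import math
from collections import Counter

def tf_idf_vectors(documents):
    """Return one {term: Tf-Idf weight} dict per document, following equation (1)."""
    tokenized = [doc.split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumed Idf variant: log(N / df); a term present in every document gets weight 0.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts used as Tf (assumption)
        vectors.append({term: tf[term] * idf[term] for term in tf})
    return vectors

# Hypothetical preprocessed issue texts.
docs = ["index query lucene index", "query xpath", "query mongodb index"]
vectors = tf_idf_vectors(docs)
```

In this toy collection, the term "query" appears in every document and therefore receives a weight of zero, which illustrates how the Idf term suppresses very common words.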

3.3 Modeling Topics

Our methodology extracts topics from the issue texts

to gain a better understanding of the semantics of

the project under analysis and to facilitate the cat-

egorization of the issues. To achieve this, we have

chosen to use BERTopic (Grootendorst, 2022), which

is a topic modeling technique that outperforms con-

ventional models such as LDA. BERTopic utilizes

transformer-based language models like BERT, which

enables it to identify semantic relationships between

texts more effectively than bag-of-words models used

by conventional models. BERTopic operates in three

main stages: text embedding generation, dimension-

ality reduction, and cluster creation based on the

new embeddings. To identify the most representative

terms for each cluster, it employs class-based Tf-Idf

(c-Tf-Idf) to generate topic representations.
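The last stage, the class-based Tf-Idf, can be illustrated with a short sketch. It assumes that the embedding, dimensionality reduction, and clustering stages have already produced the clusters, and it uses the weighting W(t, c) = tf(t, c) · log(1 + A / tf(t)) as reported by Grootendorst (2022), where A is the average number of words per cluster. The function names and the example clusters are hypothetical, and this is not the BERTopic library code.

```python
import math
from collections import Counter

def c_tf_idf(clusters):
    """clusters: {cluster_id: [document, ...]} -> {cluster_id: {term: weight}}."""
    # Treat all documents of a cluster as one concatenated bag of words.
    class_tf = {cid: Counter(" ".join(docs).split()) for cid, docs in clusters.items()}
    total_tf = Counter()
    for counts in class_tf.values():
        total_tf.update(counts)
    avg_words = sum(total_tf.values()) / len(class_tf)  # A: average words per cluster
    return {
        cid: {t: f * math.log(1 + avg_words / total_tf[t]) for t, f in counts.items()}
        for cid, counts in class_tf.items()
    }

def top_terms(weights, k=3):
    """Return the k highest-weighted terms of one cluster (ties broken alphabetically)."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Hypothetical clusters of preprocessed issue texts.
clusters = {
    0: ["lucene index query", "index lucene update"],
    1: ["xpath query parser", "xpath query syntax"],
}
weights = c_tf_idf(clusters)
```

Terms that are frequent inside one cluster but rare overall (e.g. "lucene" or "xpath" above) receive the highest weights and thus serve as the topic representation.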

We have applied BERTopic to the training set of

projects, which has resulted in the extraction of topics

for each project. By utilizing the trained BERTopic

model, we can extract the number of issues assigned

to each topic and the contribution of each assignee to

each topic, indicating the number of issues assigned

to each assignee for each topic. This information is

then used to generate a distribution of each assignee

across all topics. Each assignee gets a value between

0 and 1 for each topic, representing the percentage of

the topic’s issues that he/she has been assigned.
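A minimal sketch of how such a topic-assignee distribution could be computed is given below. It assumes that every training issue has already been mapped to a single topic; the function name `group_distribution` and the issue counts are hypothetical (only the user names are borrowed from the OAK example).

```python
from collections import Counter, defaultdict

def group_distribution(pairs):
    """pairs: iterable of (group, assignee) -> {group: {assignee: share in [0, 1]}}."""
    counts = defaultdict(Counter)
    for group, assignee in pairs:
        counts[group][assignee] += 1
    return {
        group: {a: n / sum(c.values()) for a, n in c.items()}
        for group, c in counts.items()
    }

# Hypothetical (topic id, assignee) pairs for resolved training issues.
issues = [(8, "tmueller"), (8, "tmueller"), (8, "reschke"), (13, "reschke")]
dist = group_distribution(issues)
```

For every topic the shares of all assignees sum to 1, so each value can be read directly as the percentage of the topic's issues handled by that assignee.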

Figure 2 depicts the top 20 topics for the OAK

project. Apache OAK is a hierarchical content repos-

itory for the Java platform, which is used as an ex-

ample throughout our paper. It includes multiple fea-

tures for storing and indexing structured and unstruc-

tured content and allows different querying methods

(e.g. SQL, XPath). Indeed, using our visualizations,

one can immediately see that the project has well-

separated topics (Figure 2a) relevant to Lucene index-

ing (topic 19), to databases like MongoDB (topic 16),

to XPath querying (topic 8), etc. Furthermore, given

the topic-developer distribution (Figure 2b), triagers

can easily identify the most relevant topics for each

contributor at a glance. For instance, user tmueller

seems to have extended expertise in XPath (topic 8),

while user reschke is better acquainted with connect-

ing databases (Postgres and JDBC in topic 13). Thus

using this information, it is possible to classify issues

to contributors according to their expertise, and, most

importantly, to facilitate the ﬁnal decision about issue

assignment based on the relationship of the developers with the topic to which the issue belongs.


Figure 3: Heatmap of the Distribution between Components and Contributors.

Figure 4: Heatmap of the Distribution between Labels and Contributors.

3.4 Component Mapping

Another feature that can assist in the assignment of

the most appropriate developer for an issue is the

component to which the issue belongs. A component

represents a subcategory of the project the issue is part

of. For example, a database library could have com-

ponents such as disk input/output, network commu-

nication, etc. When this information is available, we

can use it to create a developer-component distribu-

tion, which corresponds to the percentage of issues for

each component that have been assigned to each de-

veloper. In a similar manner to the assignee-topic dis-

tribution, this information can be utilized to make in-

formed decisions about assigning issues to developers

who are best suited to handle them based on their level

of experience with the speciﬁc component. Figure 3

shows this distribution for project OAK. As before,

one can also ﬁnd useful connections; for instance,

user teoﬁli has handled most issues relevant to the

SOLR indexing service, thus we may assume that the

user has the relevant expertise. Interestingly, there are

also cases where the expertise is shared. For example,

concerning caching issues, these are mostly handled

by two developers, tmueller and tomek.rekawek.
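The developer-component distribution, and the row layout behind a heatmap such as the one in Figure 3, can be sketched as follows. The sketch assumes that an issue may list more than one component, which Jira allows but the text does not discuss; the function names and the example issues are hypothetical.

```python
from collections import Counter, defaultdict

def multi_value_distribution(issues):
    """issues: iterable of (values, assignee), where values is a list of components."""
    counts = defaultdict(Counter)
    for values, assignee in issues:
        for value in set(values):       # count each issue once per component
            counts[value][assignee] += 1
    return {
        value: {a: n / sum(c.values()) for a, n in c.items()}
        for value, c in counts.items()
    }

def heatmap_rows(distribution, assignees):
    """Arrange the distribution as one row per component, one column per assignee."""
    return {v: [shares.get(a, 0.0) for a in assignees] for v, shares in distribution.items()}

# Hypothetical issues with their components and assignees.
issues = [
    (["solr"], "teofili"),
    (["solr", "core"], "teofili"),
    (["core"], "mreutegg"),
    (["core"], "mreutegg"),
]
dist = multi_value_distribution(issues)
rows = heatmap_rows(dist, ["mreutegg", "teofili"])
```

The same structure would apply to any multi-valued grouping field of an issue.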

3.5 Indexing Labels

Labels are tags or categories that provide additional

information that helps categorize issues. Similarly to topics and components, they can provide useful insight when trying to make the best choice of developer for every issue in a project. In projects and

issues where labels are available, they can be used for

the creation of a label-contributor distribution. An ex-

ample distribution for project OAK is shown in Fig-

ure 4. As depicted, labels can have several scopes; for

instance, there are generic labels relevant to mainte-

nance (e.g. ‘maintenance’ or ‘technical debt’) or test-

ing (e.g. ‘test’ or ‘test failure’), and of course there

are also labels corresponding to different areas of the

project (e.g. ‘datastore’ or ‘osgi’). These can be used to identify the expertise of each developer and provide useful hints about his/her role(s) in the project (e.g. a developer that resolves issues with testing labels may be responsible for testing certain project modules).

Table 1: Example of Issue Assignment Monitoring.

Issue Title: Prefetch external changes

Issue Description: In a cluster with listeners that are registered to receive external changes, pulling in external changes can become a bottleneck. While processing those changes, local changes are put into the observation queue leading to a system where the queue eventually fills up. Instead of processing external changes one after another, the implementation could prefetch them as they come in and if needed pull them in parallel.

Contributor mreutegg (score 37.67% / issue text similarity 42.75%)
    Has resolved 31.82% of the issues with topic 21 (observation, events, listeners, listener)
    Has resolved 43.68% of the issues in component core
    Has resolved 32.43% of the issues with label observation

Contributor mduerig (score 30.27% / issue text similarity 26.84%)
    Has resolved 36.36% of the issues with topic 21 (observation, events, listeners, listener)
    Has resolved 6.52% of the issues in component core
    Has resolved 51.35% of the issues with label observation

Contributor chetanm (score 8.32% / issue text similarity 7.16%)
    Has resolved 18.18% of the issues with topic 21 (observation, events, listeners, listener)
    Has resolved 7.93% of the issues in component core
    Has resolved 0% of the issues with label observation

3.6 Recommending Issue Assignments

We now have 4 distributions to support issue assign-

ment. When a new issue occurs, the triager can use

the models and the distributions for the corresponding

project, combine their results and select the most suit-

able developer for the task according to these calcu-

lations. Our system does this aggregation by comput-

ing the mean value of every potential assignee across

the four distributions and recommends the top 3 most

relevant assignees for the issue under triaging. An-

other important aspect of our system is that it does not

only propose the most suitable assignees but it also

indicates the reasons for their suitability based on the

information that can be extracted from the distribu-

tions. This information can help the triager to select

the most suitable developer, considering the charac-

teristics of the issue. For example, he/she may choose

to take into account only the topics-assignee distribu-

tion and thus assign the issue to the developer that

has more experience in the topic of the issue. Or

he/she may choose to base his/her decision on who

has worked the most in the relevant component.

Table 1 depicts an example issue, along with the

ranked list of potential assignees. The issue originates

from the OAK project and is relevant to improve-

ments in an event processing scenario. It is part of the

core component of the project, while it is relevant to

observation (label observation). Moreover, applying

our topic modeling technique showed that it is related

to topic 21 with top terms observation, events, listeners, etc. Upon producing the distributions for texts, topics, components, and labels, we also compute their aggregation by taking the average of the four values for every assignee.
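The aggregation can be checked with a short worked example that uses the percentages reported in Table 1. The function name `rank_assignees` and the dictionary layout are assumptions made for this sketch; the contributor names follow the discussion of Table 1 in the text.

```python
def rank_assignees(distributions, top_k=3):
    """Average each assignee's value across all distributions and return the top_k."""
    assignees = set().union(*(d.keys() for d in distributions.values()))
    scores = {
        a: sum(d.get(a, 0.0) for d in distributions.values()) / len(distributions)
        for a in assignees
    }
    # Sort by descending score; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

# Values (in %) from Table 1 for the issue "Prefetch external changes".
distributions = {
    "text":      {"mreutegg": 42.75, "mduerig": 26.84, "chetanm": 7.16},
    "topic":     {"mreutegg": 31.82, "mduerig": 36.36, "chetanm": 18.18},
    "component": {"mreutegg": 43.68, "mduerig": 6.52,  "chetanm": 7.93},
    "label":     {"mreutegg": 32.43, "mduerig": 51.35, "chetanm": 0.0},
}
ranking = rank_assignees(distributions)
for name, score in ranking:
    print(f"{name}: {score:.2f}%")
```

The averages reproduce the scores of Table 1 (37.67%, 30.27% and 8.32%). Because the per-distribution values are kept alongside the mean, the triager can see which of the four signals drives each recommendation.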

The system returns the 3 assignees that have the

highest average values, which in this example are

mreutegg, mduerig and chetanm. Given the ranking

of Table 1, one could immediately choose to assign

the issue to mreutegg, who has the best aggregated

score (37.67%). Indeed, mreutegg has also resolved

issues that are textually similar, while also resolving

a large fraction (43.68%) of issues related to the core

component. However, concerning the topic of observation, mduerig seems to exhibit higher expertise, given that the contributor has resolved a larger share (36.36%) of the issues relevant to this topic, while also having resolved more than half (51.35%) of the issues with label observation. The actual ground truth assignment in

this case is mreutegg, although mduerig would seem

to be an acceptable choice. What is interesting is

that, using our methodology, one can truly understand

which developer to choose and why. Thus, in this

case, the triager could select the most active developer

(component-wise) or the one with the most expertise

(topic and label-wise). And of course, he/she could

also use the information provided in this list to make

a more complex assignment, e.g. assigning the task to

mreutegg and setting mduerig as its reviewer.


4 EVALUATION

4.1 Evaluation Framework

In this section, we evaluate the performance of our

system on 10 projects, shown in Table 2, along with

their number of issues and number of contributors that

meet the requirements set in the previous section. For

our evaluation, we utilize accuracy, which expresses

the percentage of issues for which our system’s ﬁrst

assignee choice is correct. Moreover, we employ the

Mean Reciprocal Rank (MRR) for all issues of each

project, computed as the average of the reciprocal

rank of each issue assignment, where the reciprocal

rank is the inverse of the rank of the correct assignee

(e.g. if the assignee is in the second position, then the

reciprocal rank for the issue is 1/2 = 0.5).
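Both metrics can be sketched in a few lines. The example below is illustrative only: the developer names and rankings are hypothetical, and assigning a reciprocal rank of 0 when the correct assignee does not appear in the ranked list is an assumption, since the text does not describe that case.

```python
def accuracy(rankings, truths):
    """Fraction of issues whose top-ranked assignee is the actual one."""
    hits = sum(1 for ranked, truth in zip(rankings, truths) if ranked and ranked[0] == truth)
    return hits / len(truths)

def mean_reciprocal_rank(rankings, truths):
    """Average of 1/rank of the actual assignee; 0 if absent from the list (assumption)."""
    total = 0.0
    for ranked, truth in zip(rankings, truths):
        if truth in ranked:
            total += 1.0 / (ranked.index(truth) + 1)
    return total / len(truths)

# Hypothetical ranked recommendations for four test issues and their actual assignees.
rankings = [
    ["dev_a", "dev_b", "dev_c"],
    ["dev_b", "dev_a", "dev_c"],
    ["dev_c", "dev_b", "dev_a"],
    ["dev_a", "dev_c", "dev_b"],
]
truths = ["dev_a", "dev_a", "dev_a", "dev_b"]

acc = accuracy(rankings, truths)              # only the first issue is a top-1 hit
mrr = mean_reciprocal_rank(rankings, truths)  # (1 + 1/2 + 1/3 + 1/3) / 4
```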

Table 2: Projects of the Evaluation Dataset.

Project    #Issues    #Contributors
ARROW      7122       21
CXF        5987       12
FELIX      3941       10
GROOVY     6294       9
HDDS       2882       13
KARAF      5781       7
OAK        6879       15
OFBIZ      8477       20
SLING      7945       17
UIMA       5366       12

4.2 Evaluation Results

In Figure 5 we see that the accuracy for all projects is higher than 50% and for some of them reaches almost 75%, which is encouraging, especially if we take into account the number of contributors per project (Table 2). These results show that the proposed system can be effective even when returning only a single recommendation. The results for MRR (Figure 6) are also quite encouraging; in all projects the MRR is larger than 0.6, which suggests that the correct assignee is typically found within the top 2 positions.

Figure 5: Accuracy of Issue Assignment per Project.

Figure 6: MRR of Issue Assignment per Project.

5 CONCLUSION

Nowadays, effective collaboration in issue tracking systems can have a significant influence on the software development process. In this paper, we focused on the challenge of automated issue assignment, extracting semantic topics from Jira issues with the aim of recommending the most suitable developer for resolving an issue. Unlike other approaches, our methodology employs information about the components, the labels, and the generated topics to produce a set of interpretable recommendations, thus truly supporting the decision making process. Upon assessing our system, we conclude that it can be effective for recommending assignees, while maintaining its interpretability.

Future work lies in several directions. First of all,

we plan to build a graphical user interface in order to

better illustrate the potential of our approach and better assess its impact. Moreover, one could test different combinations of features (e.g. issue type or severity) or even different types of models, including e.g. word

embeddings, and further assess the effectiveness of

our methodology. Finally, an interesting idea would

be to extend our system in order to cover other chal-

lenges, such as issue priority or severity prediction.

ACKNOWLEDGEMENTS

Parts of this work have been supported by the Horizon

Europe project ECO-READY (Grant Agreement No

101084201), funded by the European Union.


REFERENCES

Ahsan, S. N., Ferzund, J., and Wotawa, F. (2009). Auto-

matic software bug triage system (bts) based on la-

tent semantic indexing and support vector machine. In

Proceedings of the 2009 Fourth International Confer-

ence on Software Engineering Advances, ICSEA ’09,

page 216–221, USA. IEEE Computer Society.

Alkhazi, B., DiStasi, A., Aljedaani, W., Alrubaye, H., Ye,

X., and Mkaouer, M. W. (2020). Learning to rank

developers for bug report assignment. Applied Soft

Computing, 95:106667.

Anvik, J., Hiew, L., and Murphy, G. C. (2006). Who should

ﬁx this bug? In Proceedings of the 28th International

Conference on Software Engineering, ICSE ’06, pages

361–370, New York, NY, USA. ACM.

Diamantopoulos, T., Galegalidou, C., and Symeonidis,

A. L. (2021). Software task importance prediction

based on project management data. In 16th Interna-

tional Conference on Software Technologies, ICSOFT

2021, pages 269–276, Held Online. SciTePress.

Diamantopoulos, T., Nastos, D.-N., and Symeonidis, A.

(2023). Semantically-enriched jira issue tracking data.

In Proceedings of the 20th International Conference

on Mining Software Repositories, MSR ’23, pages

218–222, Melbourne, Australia. IEEE.

Diamantopoulos, T., Papamichail, M., Karanikiotis, T.,

Chatzidimitriou, K., and Symeonidis, A. (2020). Em-

ploying contribution and quality metrics for quantify-

ing the software development process. In Proceedings

of the IEEE/ACM 17th International Conference on

Mining Software Repositories, MSR ’20, pages 558–

562, Seoul, South Korea. ACM.

Gousios, G., Kalliamvakou, E., and Spinellis, D. (2008).

Measuring developer contribution from software

repository data. In Proceedings of the 2008 In-

ternational Working Conference on Mining Software

Repositories, pages 129–132, NY, USA. ACM.

Grootendorst, M. (2022). Bertopic: Neural topic model-

ing with a class-based tf-idf procedure. arXiv preprint

arXiv:2203.05794.

Guo, S., Zhang, X., Yang, X., Chen, R., Guo, C., Li, H.,

and Li, T. (2020). Developer activity motivated bug

triaging: Via convolutional neural network. Neural

Process. Lett., 51(3):2589–2606.

He, H. and Yang, S. (2021). Automatic bug triage using

hierarchical attention networks. In 2021 IEEE 21st

International Conference on Software Quality, Relia-

bility and Security Companion, pages 1043–1049.

Kanwal, J. and Maqbool, O. (2012). Bug Prioritization to

Facilitate Bug Report Triage. Journal of Computer

Science and Technology, 27(2):397–412.

Lamkanﬁ, A., Demeyer, S., Giger, E., and Goethals, B.

(2010). Predicting the Severity of a Reported Bug. In

2010 7th IEEE Working Conference on Mining Soft-

ware Repositories, MSR ’10, pages 1–10. IEEE Press.

Li, S., Tsukiji, H., and Takano, K. (2016). Analysis of

Software Developer Activity on a Distributed Version

Control System. In Proceedings of the 30th Inter-

national Conference on Advanced Information Net-

working and Applications Workshops, pages 701–707.

IEEE.

Lima, J., Treude, C., Filho, F. F., and Kulesza, U.

(2015). Assessing developer contribution with reposi-

tory mining-based metrics. In Proceedings of the 2015

IEEE International Conference on Software Mainte-

nance and Evolution, pages 536–540, USA. IEEE.

Matsoukas, V., Diamantopoulos, T., Papamichail, M., and

Symeonidis, A. (2020). Towards analyzing contribu-

tions from software repositories to optimize issue as-

signment. In 2020 IEEE International Conference on

Software Quality, Reliability and Security, QRS 2020,

pages 243–253, Vilnius, Lithuania. IEEE Press.

Murphy, G. and Cubranic, D. (2004). Automatic Bug

Triage using Text Categorization. In Proceedings of

the 16th International Conference on Software Engi-

neering & Knowledge Engineering, SEKE ’04, pages

92–97, USA. Knowledge Systems Institute.

Naguib, H., Narayan, N., Brügge, B., and Helal, D. (2013).

Bug report assignee recommendation using activity

proﬁles. In Proceedings of the 10th Working Confer-

ence on Mining Software Repositories, MSR ’13, page

22–30. IEEE Press.

Onoue, S., Hata, H., and Matsumoto, K.-i. (2013). A

Study of the Characteristics of Developers’ Activities

in GitHub. In Proceedings of the 20th Asia-Paciﬁc

Software Engineering Conference, pages 7–12, USA.

IEEE.

Papamichail, M. D., Diamantopoulos, T., Matsoukas, V.,

Athanasiadis, C., and Symeonidis, A. L. (2019). To-

wards extracting the role and behavior of contributors

in open-source projects. In 14th International Confer-

ence on Software Technologies (ICSOFT), pages 536–

543, Prague, Czech Republic. SciTePress.

Sharma, M., Bedi, P., Chaturvedi, K. K., and Singh, V. B.

(2012). Predicting the Priority of a Reported Bug us-

ing Machine Learning Techniques and Cross Project

Validation. In 2012 12th International Conference

on Intelligent Systems Design and Applications, ISDA

2012, pages 539–545. IEEE Press.

Tian, Y., Lo, D., Xia, X., and Sun, C. (2015). Automated

Prediction of Bug Report Priority Using Multi-Factor

Analysis. Empirical Softw. Engg., 20(5):1354–1383.

Yang, C.-Z., Hou, C.-C., Kao, W.-C., and Chen, I.-X.

(2012). An Empirical Study on Improving Severity

Prediction of Defect Reports Using Feature Selection.

In Proceedings of the 2012 19th Asia-Paciﬁc Software

Engineering Conference - Volume 01, APSEC ’12,

pages 240–249, USA. IEEE Computer Society.

Yang, G., Zhang, T., and Lee, B. (2014). Towards semi-

automatic bug triage and severity prediction based on

topic model and multi-feature of bug reports. In Pro-

ceedings of the 2014 IEEE 38th Annual Computer

Software and Applications Conference, COMPSAC

’14, page 97–106, USA. IEEE Computer Society.
