Towards Human-Interpretable Prototypes for Visual Assessment of Image Classification Models
Poulami Sinhamahapatra, Lena Heidemann, Maureen Monnet and Karsten Roscher
Fraunhofer IKS, Germany
Keywords: Interpretability, Global Explainability, Classification, Prototype-Based Learning.
Abstract:
Explaining black-box Artificial Intelligence (AI) models is a cornerstone for trustworthy AI and a prerequisite for its use in safety-critical applications, so that AI models can reliably assist humans in critical decisions. However, instead of trying to explain our models post-hoc, we need models which are interpretable-by-design, built on a reasoning process similar to humans that exploits meaningful high-level concepts such as shapes, texture or object parts. Learning such concepts is often hindered by the need for explicit specification and annotation up front. Instead, prototype-based learning approaches such as ProtoPNet claim to discover visually meaningful prototypes in an unsupervised way. In this work, we propose a set of properties that those prototypes have to fulfill to enable human analysis, e.g. as part of a reliable model assessment case, and analyse such existing methods in the light of these properties. Using a ‘Guess who?’ game, we find that these prototypes still have a long way to go towards providing definite explanations. We quantitatively validate our findings by conducting a user study, which indicates that many of the learnt prototypes are not considered useful for human understanding. We discuss the missing links in the existing methods and present a potential real-world application motivating the need to progress towards truly human-interpretable prototypes.
1 INTRODUCTION
In recent years, Deep Neural Networks (DNNs) have
been shown to be increasingly proficient in solving
more and more complex tasks. The increasing complexity of tasks leads DNNs to churn through billions of parameters, vast pools of unstructured data and internal representations that are not human-understandable to arrive at these spectacular results. However, this only adds to the complexity and opacity of the already black-box DNNs. This calls for an increasing need to promote eXplainable Artificial Intelligence (XAI) methods for improving the interpretability, transparency, and trustworthiness of AI (Adadi and Berrada, 2018). It is even more critical to reason about and explain the decision-making process of DNNs when such decisions are applied in safety-critical use-cases like self-driving cars or medical diagnosis (Tjoa and Guan, 2020). These cases not only demand higher accountability to figure out why and how things go wrong, but also a provision to assess, debug and audit in cases of failure.
In general, XAI approaches try to map the in-
ternal learnt representations of DNNs into human-
interpretable formats. However, what constitutes a sufficiently human-understandable interpretation still depends largely on the XAI approach itself, e.g. whether it is post-hoc or inherently interpretable, or whether it seeks local or global explanations. One novel direction is associated with learning representations which can be explicitly tied to human-understandable higher-level concepts, e.g. predicting that an image shows a red-billed hornbill could depend on the presence of concepts like a red bill. However, this requires explicit prior knowledge of relevant concepts and relies on correspondingly annotated datasets. Instead, unsupervised discovery of the relevant parts or prototypes would enable the use of existing large-scale datasets and open such approaches to more diverse use-cases. Such prototypes can represent distinct human-understandable concepts or sub-parts, e.g. beaks, wings and tails, which together could indicate a bird. While learning such representations in an unsupervised scenario (absence of concept-level annotation) is in itself a challenge, the other difficulty lies in visualising such implicitly learnt representations in a form understandable to the human eye.
A very prominent line of work based on learning
interpretable prototypes has emerged where the fo-
cus is to learn representative parts for the downstream task by designating the image part closest to a given learnt representation as a prototype. These methods address the challenge of finding representative prototypes without explicit concept-level supervision and thus enhance interpretability to some extent. Nonetheless, how useful are these interpretations with respect to human assessment of the model’s inner workings and potential insufficiencies?
In this work, we closely investigate the perfor-
mance of selected interpretable prototype-based ap-
proaches in terms of qualitative interpretation using
a network called Prototypical Parts Network (Pro-
toPNet) (Chen et al., 2019) and subsequent variants
(Nauta et al., 2021; Gautam et al., 2021). They are
designed to learn representations of certain parts of
the training image class (prototypes) and then find the
(parts of) test samples similar to the prototypes (“this-
looks-like-that”) based on similarity scores. To this
end, our key contributions are:
1. We design a common setup of experiments and accordingly propose requisite desiderata (Section 3.1) towards learning interpretable prototypes beneficial for the human assessment of a model
2. We analyse existing methods with respect to these
properties for real-world and synthetic datasets
(Section 4.1, 4.3)
3. We provide quantitative results based on a user
study to validate our findings (Section 4.2)
4. Finally, we motivate the application of interpretable prototypes using a real-world out-of-distribution (OOD) detection task (Section 4.4)
and conclude with imminent challenges and po-
tential directions (Section 5).
2 INTERPRETABILITY SO FAR
According to (Páez, 2019), interpretability means that an AI system’s decision can be explained globally or locally, and that the system’s purpose as well as its decisions can be understood by a human actor. There exists a
vast pool of XAI literature pertaining to visual tasks
(Nguyen et al., 2019).
The following are some major distinguishing factors for choosing the type of explanation of a given XAI approach:
Local vs. Global: Several vision approaches using DNNs focus on local explanations (Zhang and Zhu, 2018), limited to a few specific samples or to highlighting specific parts of the image that the DNN attended to for a given decision (i.e. regions with the highest attribution), say by generating heatmaps. These
localised analysis methods often involve generating
saliency-based activation maps (Zhou et al., 2016),
local sensitivity based on gradients (Sundararajan
et al., 2017), or relevance back-propagation (Shriku-
mar et al., 2017) such as LRP (Layer-wise Rele-
vance Propagation) (Bach et al., 2015). In contrast,
global explanations provide analysis of the model as a whole, independent of individual samples, e.g. by mapping certain concepts to internal latent representations. This provides a wider scope for general appli-
cability. For tasks like OOD detection, global expla-
nations are even necessary to detect new OOD sam-
ples.
Post-Hoc vs. Interpretable-by-Design: Most local
explanation methods are also post-hoc interpretations,
which involve taking a pre-trained model and then
identifying relevant features via attribution or trying
to understand the inner workings a posteriori. Since
these explanations are not tied to the inner workings
of the model, they can be unreliable. Thus, we need
inherently interpretable models, i.e. interpretable-by-
design (IBD), such that DNNs are designed in a way
to make internal representations interpretable. IBD
methods have gained much momentum because if
we want our models to be interpretable, we need to
consciously design them so (Rudin, 2019). One re-
cent direction towards IBD models is to map human-
understandable concepts or prototypes into internal
representations, e.g. by embedding an interpretable
layer into the network like in concept bottleneck mod-
els (CBM) (Koh et al., 2020), ProtoPNet models etc.,
or by enforcing single concepts into a model by in-
cluding their outputs in the loss function (Zhang et al.,
2018).
Explicitly Specified vs. Implicitly Derived: When
we want our learnt representations to be human-
understandable, we can tie them to either an explic-
itly specified ‘concept’ from natural human language
or we can learn ‘prototypes’ which are implicitly de-
rived. Prototypes are semantically relevant visual
explanations often represented by the closest train-
ing image, parts of an image, or via decoding ap-
proaches (Li et al., 2018). In concept learning, one
tries to associate known semantic concepts to latent
spaces (Fang et al., 2020). However, the availability of datasets with annotated concepts, or even the prior knowledge of the expected concepts, is quite limited.
In such cases, prototype-based learning using an IBD
method provides an alternative to learn global expla-
nations without the need for concept-specific annota-
tions.
3 HUMAN-INTERPRETABLE PROTOTYPES
3.1 Desiderata
When we assume concepts as something explicitly
specified, we basically refer to particular examples
that we can recollect from memory (e.g. bird with
red bill). In the case of prototypes, they are often an average representation over several observed examples (Stammer et al., 2022). While the task is to learn indepen-
dent underlying representations as prototype vectors,
a precise visualisation of the prototype vectors in a
human-understandable format is still challenging. In
the literature, prototypes have been interchangeably re-
ferred to as representations for a full image or seman-
tically relevant sub-parts of it. In this work, we con-
sider the latter usage for fine-grained interpretability
and subsequently chalk out desired properties towards
learning interpretable prototypes that are beneficial
for human assessment of a model:
1. Human-understandable / Interpretable: The vi-
sualisation of the prototype vectors should corre-
spond to a distinct human-relevant entity. Often,
due to dataset biases, vague interpretations creep
in, like contours of objects or background colours,
which often lack a definite explanation.
2. Semantically Disentangled: Each prototype
should represent distinctly different semantic
units that humans can associate with a common interpretation.
3. Semantically Transformation Invariant: All pro-
totypes representing one semantic idea should be
uniquely represented irrespective of their variabil-
ity in scale, translation, or rotation angle across
different samples.
4. Relevant to the Learnt Task: The prototypes learnt
should add relevant information towards the task
learnt by the given ML model. It can either be
the whole semantic entity or distinct sub-parts of
it. E.g. for a classification problem for cars, the
prototypes should be parts which are semantically
relevant for identifying a car, like wheels, doors
etc., whereas a pedestrian is not related to a car
directly.
Along with all of the above properties for inter-
pretable prototypes, it is also important to focus on
learning the prototypes with minimum concept-level
supervision. We learn prototypes under the assump-
tion that concept-level supervision is difficult and ex-
pensive to get. In the rest of this work, we focus our
investigation on recent advances in prototype-based
learning methods presented in Section 3.2.
3.2 Learning and Visualising Prototypes
ProtoPNet (Chen et al., 2019) is an image classifier
network that learns representations for relevant sub-
parts of an image as prototypes.
(i) Learning Prototypes: They are learnt by ap-
pending a prototype layer or a latent space to the
feature extractor. The prototypes are class-specific
(number of prototypes per class is pre-defined) and
learnt by employing a cluster and separation loss on
top of the cross-entropy loss, which encourages se-
mantically similar samples to cluster together. Since
the height and width of each prototype are smaller than those of the feature map z extracted from an input image x, each prototype represents a patch of the feature map in latent space and in turn corresponds to some prototypical part of the whole image x. In
this prototype layer, squared L2 distances between the prototype p_j and all patches of z having the same size as p_j are calculated and inverted to obtain similarity scores S.
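For illustration, the distance-to-similarity computation of the prototype layer can be sketched as follows in PyTorch. This is a minimal sketch based on the description above; the helper name, tensor layout and the log-based inversion of the distances are our assumptions, not the authors' exact implementation.

```python
import torch

def prototype_similarities(z, prototypes, eps=1e-4):
    """Similarity maps between a feature map and a set of prototypes.

    z:          feature map of one image, shape (C, H, W)
    prototypes: prototype tensors, shape (m, C, h, w) with h <= H, w <= W
    Returns similarity maps of shape (m, H-h+1, W-w+1).
    """
    m, C, h, w = prototypes.shape
    # Extract all (C, h, w) patches of z and flatten them.
    patches = z.unsqueeze(0).unfold(2, h, 1).unfold(3, w, 1)    # (1, C, H-h+1, W-w+1, h, w)
    patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(-1, C * h * w)
    protos = prototypes.reshape(m, -1)
    # Squared L2 distances between every patch and every prototype.
    d2 = torch.cdist(patches, protos, p=2) ** 2                  # (num_patches, m)
    # Invert distances into similarity scores (log-based inversion in the spirit of ProtoPNet).
    sim = torch.log((d2 + 1.0) / (d2 + eps))
    h_out, w_out = z.shape[1] - h + 1, z.shape[2] - w + 1
    return sim.t().reshape(m, h_out, w_out)
```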
(ii) Visualising Prototypes: The similarity scores
together constitute m activation heatmaps g_{p_j} of the same
spatial size as z. They indicate where and how
strongly a given prototype is present in z and are re-
duced to a single similarity score using global max
pooling. Based on maximum similarity scores after
comparison with all inputs from a prototype-specific
class, each prototype is projected onto the nearest
z. Since the spatial arrangement is preserved in the
heatmaps, they can be easily upsampled and over-
laid on the full image. The patch corresponding to
the maximum similarity score from the heatmap pro-
jected upon the input image is thus visualised as a pro-
totype.
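Correspondingly, the visualisation step can be sketched as below: global max pooling gives the prototype's similarity score, and the upsampled activation map is thresholded to obtain the visualised patch. The bilinear upsampling and the 95th-percentile threshold are illustrative assumptions in the spirit of the described procedure, not the exact original code.

```python
import torch
import torch.nn.functional as F

def prototype_patch(image, sim_map, percentile=95):
    """Upsample a prototype's similarity map to the image resolution and return the
    global similarity score plus a bounding box around the most activated region.

    image:   input image tensor, shape (3, H_img, W_img)
    sim_map: similarity map of one prototype, shape (H_out, W_out)
    """
    score = sim_map.max().item()                                  # global max pooling
    # Model-agnostic bilinear upsampling to the input resolution.
    up = F.interpolate(sim_map[None, None], size=image.shape[1:],
                       mode="bilinear", align_corners=False)[0, 0]
    # Bounding box of the region above the chosen activation percentile.
    mask = up >= torch.quantile(up, percentile / 100.0)
    ys, xs = torch.nonzero(mask, as_tuple=True)
    box = (ys.min().item(), ys.max().item(), xs.min().item(), xs.max().item())
    return score, up, box
```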
In this work, we also analyse two successive meth-
ods proposing solutions for the following shortcomings concerning the learning and visualisation of prototypes:
(a) Optimising prototype visualisation by addressing the problem of coarse and spatially imprecise upsampling: By upsampling the low-resolution heatmap of the most relevant regions of interest for both the prototype training image and the test image, ProtoPNet tries to bring forth the decision “this relevant prototype from this training image looks like that feature of that test image”.
However, the effective receptive field in the original
image is much larger. Due to model-agnostic upsam-
pling, the region of interest in the final input image
tends to imprecisely cover a lot more than the rele-
vant pixels. To address this problem, (Gautam et al.,
2021) proposed a method called Prototypical Rele-
vance Propagation (PRP) which builds upon the prin-
ciples of LRP. It aims to attain more accurate fine-
grained model-aware explanations by backpropagat-
ing the relevances of the prototypes in ProtoPNet.
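For context, the basic LRP redistribution step that PRP builds on can be sketched for a single linear layer as below. This is the generic LRP-ε rule only; the actual PRP rules of (Gautam et al., 2021) are defined for the convolutional backbone and the prototype layer and differ in detail.

```python
import torch

def lrp_epsilon_linear(layer, a, relevance, eps=1e-6):
    """One LRP-epsilon backward step through a torch.nn.Linear layer.

    a:         input activations to the layer, shape (batch, in_features)
    relevance: relevance attributed to the layer output, shape (batch, out_features)
    Returns the relevance redistributed to the layer inputs, shape (batch, in_features).
    """
    w = layer.weight                              # (out_features, in_features)
    z = a @ w.t() + layer.bias                    # forward pre-activations
    s = relevance / (z + eps * torch.sign(z))     # stabilised relevance ratio
    c = s @ w                                     # redistribute along the weights
    return a * c                                  # input relevance (approximately conserved)
```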
(b) Improved learning of prototypes without a fixed number of prototypes per class: The authors of ProtoPNet advocated equal representation via a fixed number of prototypes per class, which leads to a large number of prototypes to analyse. (Nauta et al., 2021) propose to incorporate a soft decision tree, called ProtoTree, as a hierarchical model that routes each test sample through a sequence of decisions, one per node prototype. ProtoTree, being inherently interpretable, allows retraceable decisions mimicking human reasoning while reducing the number of prototypes to only 10% of ProtoPNet's.
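To make the tree-based reasoning concrete, soft routing over prototype similarities can be sketched as below. The node structure and names are our simplified assumptions; the actual ProtoTree derives each edge probability from the maximum patch similarity to the node prototype and learns the leaf class distributions jointly with the prototypes.

```python
from dataclasses import dataclass
from typing import Callable, Optional
import torch

@dataclass
class TreeNode:
    # Internal node: prototype_presence maps features to a presence probability in [0, 1].
    prototype_presence: Optional[Callable[[torch.Tensor], float]] = None
    left: Optional["TreeNode"] = None      # branch taken when the prototype is absent
    right: Optional["TreeNode"] = None     # branch taken when the prototype is present
    # Leaf node: class distribution over all classes.
    leaf_distribution: Optional[torch.Tensor] = None

def soft_tree_predict(node, features, path_prob=1.0):
    """Soft routing: every leaf contributes its class distribution weighted by the
    probability of the path leading to it."""
    if node.leaf_distribution is not None:
        return path_prob * node.leaf_distribution
    p = node.prototype_presence(features)  # similarity-based presence probability
    return (soft_tree_predict(node.right, features, path_prob * p)
            + soft_tree_predict(node.left, features, path_prob * (1.0 - p)))
```

At test time, a greedy variant that always follows the more probable child yields a single, retraceable decision path of the kind shown in Figure 3.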
4 EXPERIMENTAL DISCUSSION
In this section, we use different experiments to analyse the interpretability of the results from the prototype-based learning methods presented in Section 3.2. We first look at image classification tasks, where we consider real-world datasets and a much simplified synthetic dataset in Sections 4.1 and 4.3, respectively, in light of the desiderata in Section 3.1. In Section 4.2, we provide quantitative statistics of our findings in Section 4.1 based on a user study. Finally, in Section 4.4, we
present a preliminary application of interpretable pro-
totypes in a real-world OOD detection task.
4.1 This Looks like that? - Analysis
Here, we provide insights on the interpretability of the prototypes learnt by ProtoPNet, ProtoTree and PRP. Experimental setups have been kept identical to the respective original implementations. For PRP, we have reproduced the code as closely as possible to the algorithms described in (Gautam et al., 2021). Datasets used
for fine-grained and generalised image classification
are respectively Caltech UCSD Birds-200-2011 (200
bird classes) (Wah et al., 2011) and ImageNet-30 (30
distinct classes) (Hendrycks et al., 2019). ProtoP-
Net uses ImageNet pre-trained VGG19 models. Pro-
toTree uses ResNet-50 models pre-trained on iNaturalist for CUB-200 and on ImageNet for ImageNet-30. The following insights are drawn from the entire test data.
Let’s take a closer look at each method based on the
following properties:
4.1.1 Human-Understandable / Interpretable
ProtoPNet: In Figure 1, we present samples from
models trained on CUB-200 and ImageNet-30 with
75.9% and 97.0% test accuracy. In sub-figures (a), we
show the 3 closest train and test images for a given
prototype from each class. In a broader sense, the
prototypes, given the context where they are located,
bring forth the understanding that this patch in the
test image probably looks like that prototype. Most
prototypes can be successfully matched to somewhat
similar patches in test images. But the ‘standalone’
prototypes themselves are not so human-interpretable
such that they can be distinctly identified as a rele-
vant entity. E.g. for rusty blackbird, the prototype
shows the neck of the bird, however, from the similar-
ity activation maps for closest test images, the highest
similarity varies from eyes, beak to neck region. This
shows that the imprecise upsampling of the similarity
activation maps often leads to spurious identification
of non-relevant parts. Similarly, for red-breasted mer-
ganser, the probable prototype showing the backside
of its head is confused with its beak, head or eyes. For
the snowmobile class of ImageNet-30, the skis pro-
totype is matched inaccurately to wheels, tracks and
even the whole vehicle in test images, thus leading to
inconclusive interpretations.
PRP: We compare similar classes in Figures 1(a)
and 2 for CUB-200 to find potential improvement
using PRP for the imprecise upsampling mentioned
above. For western grebe, PRP reassuringly high-
lights the edges of the upper body as compared to
the imprecise body or tail of the bird shown in Pro-
toPNet. For red-breasted merganser however, PRP
highlights the beaks, while the prototype looks at the
backside of the head, leading to incoherent interpre-
tations. In most samples, PRP tends to focus on the
closest edges that might be salient when matched to
a prototype. Although these explanations are seem-
ingly more precise compared to ProtoPNet, this does
not always enhance the certainty or conviction of the
relevant parts for human interpretation.
ProtoTree: Figure 3 shows samples from models
trained using ProtoTree on CUB-200 and ImageNet-
30 with 82.1% and 91.8% test accuracies. In order to
look at the most relevant prototypes used for decid-
ing on a class, we chose classes from the rightmost
branch of the tree to ensure the traversed node proto-
types were ‘present’ in most cases in the decision path
for these classes, namely western grebe and snowmo-
bile from CUB-200 and ImageNet-30. As pointed out
by the authors, since the prototypes themselves are
not mapped to any particular class, the first prototypes
in the given path are barely relevant for a given class
or the indicated matching parts. Prototype 1 for western grebe is hardly understandable; similarly, the third prototype looks at the black body of a bird, but individually the prototype is difficult to comprehend.
Thus, the prototypes themselves are not entities easily understandable to the human eye, even more so for fine-grained image datasets like CUB, which require expert knowledge.
Figure 1: Results for ProtoPNet using CUB-200 (top) and ImageNet-30 (bottom): (a) shows, for a prototype from a given class, the nearest training and test patches (yellow box) with similarity-score-based activation maps; (b) shows all 10 prototypes (yellow box) learnt for the respective classes in (a).
4.1.2 Semantically Disentangled
Since ProtoPNet and ProtoTree learn the prototypes
differently, here we analyse whether they successfully
learn distinct prototypes corresponding to distinct se-
mantically relevant parts.
ProtoPNet: In line with the ProtoPNet implemen-
tation, 10 prototypes are learnt per class as shown in
Figure 1(b). We observe that the learnt prototypes
are often redundant, i.e. similar prototypes or pro-
totypes looking at similar parts. E.g. for the western grebe class, prototypes 4, 5, 6, 9, and 10 show neck parts; similarly, red-breasted merganser shows repeated neck (1, 9) and background prototypes (2, 3), even from the same training image. Most repeated
prototypes do not add a different perspective in terms
of looking at different details of a semantic part, e.g.
there are multiple ski parts (prototypes 3, 7, and 10)
in snowmobile. Overall, prototypes need to be more
distinct and diverse to ensure a complete, mutually exclusive representation of the entire class. The redun-
dancy could be due to too many pre-determined pro-
totypes for each class. Thus, we need methods better
suited to fine-tune the optimal number of prototypes
to the given dataset and respective classes.
ProtoTree: In Figure 3, although a much smaller number of prototypes is learnt (10% of ProtoPNet), avoiding redundancy and with fewer background prototypes, most of them are not class-specific and are thus quite semantically disentangled over the whole dataset. The prototypes are so diverse that it is difficult to semantically correlate them to the matching parts of class-specific test images, thus providing only limited interpretations towards learning the semantics of any particular class.
Figure 2: Results using PRP to enhance the interpretations of ProtoPNet for corresponding CUB-200 classes. The highlighted regions in red correspond to the maximum positive activations for each prototype.
Figure 3: Results from ProtoTree on CUB-200 (top) and ImageNet-30 (bottom). For each test image, the corresponding path taken in the decision tree towards the final prediction is shown. The node prototypes at each decision-making stage are shown in yellow boxes in their respective source images, along with similarity scores to the respective matching parts in the test images. Absent node prototypes are marked in red.
4.1.3 Semantically Transformation Invariant
Given a prototype, say the head of a bird, it is
essential that our methods learn representations for
these prototypes irrespective of variations that appear
for this particular entity across the entire dataset, i.e.
prototypes should be transformation invariant. Since
these methods use L2 similarity in the feature space
for matching relevant test image patches, it is impor-
tant to inspect this property using transformed (ro-
tated and cropped) test samples.
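A minimal sketch of such a transformation check is given below, using torchvision transforms and the hypothetical prototype_similarities helper sketched in Section 3.2; the feature_extractor attribute and the chosen crop/resize sizes are assumptions about the model interface rather than the original experimental code.

```python
import torch
import torchvision.transforms.functional as TF

def nearest_prototypes(model, image, prototypes, k=3):
    """Indices of the k prototypes with the highest similarity to any patch of the image."""
    with torch.no_grad():
        z = model.feature_extractor(image.unsqueeze(0))[0]   # (C, H, W); assumed model attribute
        sims = prototype_similarities(z, prototypes)          # (m, H_out, W_out); see Section 3.2 sketch
    scores = sims.amax(dim=(1, 2))                            # global max pooling per prototype
    return torch.topk(scores, k).indices

def transformation_check(model, image, prototypes, k=3):
    """Compare the nearest prototypes of the original, cropped and rotated test image."""
    variants = {
        "original": image,
        "crop": TF.center_crop(image, [112, 112]),
        "rotate_25": TF.rotate(image, angle=25.0),
    }
    return {name: nearest_prototypes(model, TF.resize(img, [224, 224]), prototypes, k).tolist()
            for name, img in variants.items()}
```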
ProtoPNet: In Figure 4, we show how well pro-
totypes are recognised given a transformed version of
the test image. We show the top 3 prototypes given
a test image and respective patches they activate in
terms of maximum similarity score. We see that given
a cropped head of western grebe, 2 out of the top 3
closest prototypes belong to different classes. Simi-
larly, given a rotated version of this same test image
by 25°, one of the closest prototypes is a background
prototype from a different class.
ProtoTree: Repeating the above experiments with ProtoTree, we note that the node prototypes and the
path in the tree for the transformed test images did
not change, indicating ProtoTree to be more robust to
image transformations than ProtoPNet. Thus, we do
not show these results to save space and avoid redun-
dancy. Since ProtoTree and ProtoPNet use a different
set of augmentations during training, we also trained
a ProtoPNet with the augmentations applied in Pro-
toTree (different crops etc.), but this did not improve
the performance of ProtoPNet against transforma-
tions. Possibly, because ProtoTree learns far fewer prototypes, the semantic distances between the prototypes are larger, making it robust to smaller semantic transformations. However, this needs further investigation.
Figure 4: ProtoPNet results corresponding to transformations (crop, rotate 25°) for a sample test image (western grebe), showing the nearest 3 prototypes and their respective regions of activation in the given test image. Yellow boxes show the prototypes; images in red boxes show prototypes taken from a different class than western grebe.
4.1.4 Relevant to the Learnt Task
Consider the classification task as a ‘Guess who?’ game: by looking at the learnt prototypes, can we guess the class they collectively belong to?
ProtoPNet: As observed earlier, the 10 prototypes for each class shown in Figure 1(b) are often redundant and do not always represent all the distinct parts of their respective classes. Nonetheless, some prototypes do provide hints towards the respective classes to make an informed guess: the white-neck prototype for western grebe hints at a bird with a white neck, and the prototypes showing a black neck, head or wings for rusty blackbird indicate at least a black bird. However, whether they sufficiently represent their respective class remains doubtful, particularly for fine-grained datasets. For more generalised datasets like
ImageNet-30, the prototypes themselves are quite di-
verse and easier to recognise, often including the en-
tire object in question, e.g. snowmobile prototypes
(1, 8) showing the whole vehicle. There are several
instances of background prototypes (2, 4, 5 in snow-
mobile) which might provide some context to recog-
nise a given class, but in general add to redundancy.
Lastly, it is up to the interpreter to make sense out of
these disjoint bits of information.
ProtoTree: ProtoTree itself does not provide
class-specific prototypes, thus all the node prototypes
in a given path which are marked as ‘present’ (see
Figure 3) are often not relevant directly to the class in
question except for the last few prototypes, e.g. the
bird’s eye or the neck for western grebe. Looking at
the prototypes of the snowmobile class, one would al-
most guess it as an airplane class except for the last
prototype with snow. Thus, while prototypes from
ProtoPNet provide some reliable hints as compared
to ProtoTree, both methods perform insufficiently in
a ‘Guess who?’ game.
4.2 Quantitative Results Based on User
Study
To further validate our observations in Section 4.1, we
collected statistics based on human assessment of pro-
totypes of natural images (10 classes of ImageNet-30)
to avoid the need for expert knowledge for fine-grained datasets. The study comprised 2 experiments with 15 users: (1) given the prototypes, users were required to identify the class (‘Guess who?’ game) from a choice of 10 classes, and (2) given each prototype and its respective class, users were asked to determine ‘Whether the given prototype was useful for identifying the class?’ and ‘Whether the concept shown in the prototype is somehow repeating and redundant?’. Users had to answer either ‘Yes’ or ‘No’. For the latter question, irrelevant prototypes were often marked non-redundant, i.e. ‘No’. Figure 5 summarises the re-
sults from all experiments, which could be interpreted
as follows based on the questions asked to the users:
‘Guess Who?’ - Average class prediction accuracy
for ProtoPNet was much higher (98%) as compared
to ProtoTree (55%). This is as expected, since ProtoPNet prototypes could easily be guessed from the given ten classes as common sub-parts of natural images. However, for ProtoTree, many of the prototypes do not belong to the class, and the initial prototypes on a given branch of the tree are often general and irrelevant to the class. These prototypes were thus often not semantically relevant or human-understandable and hence difficult to identify, leading to poor prediction accuracy for ProtoTree.
Prototype usefulness - Only 27% of the ProtoPNet prototypes and 20% of the ProtoTree prototypes were found totally useful (100%) for identifying the class. This leaves a lot of future scope to generate semantically relevant and yet diverse prototypes that represent the given class sufficiently for confident human interpretation.
Figure 5: Quantitative results from the user study conducted on prototypes from ImageNet-30 classes (panels: class prediction accuracy, prototype usefulness, prototype non-redundancy).
Prototype non-redundancy - This experiment was conducted because we had observed a lot of redundant or repeating prototypes for fine-grained datasets like CUB-200. However, as noted earlier, for natural images the prototypes are much more diverse and non-redundant. Hence, this experiment leaves an ambiguity as to whether the prototypes were actually non-repeating or whether they were found irrelevant/meaningless and thus marked as non-redundant (15% for ProtoPNet and 20.6% for ProtoTree).
Using the above statistics, we could further con-
firm the need for better methods that can generate
truly human-interpretable prototypes that are seman-
tically relevant, disentangled as well as sufficient to
identify a given class.
4.3 Analysis on Synthetic Data
In previous sections, we observed that it is diffi-
cult to obtain human-interpretable prototypes with the
wide range of complexities associated with real-world
datasets, like the optimum number of prototypes re-
quired to define a class, varying semantics, cluttered
background, overlapping concepts, etc. Thus, we cre-
ated synthetic datasets (3D-Shapes) in a controlled
setting where each shape can be related to a prototypical concept, and re-evaluated the performance of the above methods. The 3D-Shapes datasets consist of combinations of rendered 3D shapes in varying arrangements (Johnson et al., 2017).
Dataset with overlapping concepts (V1): This dataset consists of 3 classes with 3 non-exclusive shapes each, resembling the original fine-grained classification setting, with Class 0: cube, sphere, cone; Class 1: sphere, cylinder, icosphere; Class 2: cone, torus, icosphere. Exemplary results for ProtoPNet and ProtoTree are given in Figure 6.
Figure 6: Results using the synthetic 3D-Shapes dataset (V1), showing (a) prototypes from each class using ProtoPNet and (b) the node prototypes along with the decision tree (depth=2) using ProtoTree. Yellow boxes show the prototypes.
Ideally,
each shape in a class should correspond to a proto-
type. Like in Section 4.1 for ProtoPNet, we observe
redundant repeating prototypes even in this simplified
setting. Also, all the prototypes of one class focus on the background; however, they still contribute to a high test accuracy. The learnt prototypes are not semantically relevant in the way humans would classify 3D shapes. Often they focus on semantically mixed
patches like parts of both cube and cone in class 0.
Similarly, in ProtoTree the decision paths through the highlighted prototypes do not follow human logic: e.g. the path to class 2 is reached via 2 absent prototypes (partial icosphere, sphere), although the icosphere should belong to class 2. Instead, the tree arrives at class 2 by eliminating images from classes 1 and 0. Thus, the prototypes are neither semantically relevant for classification nor matched to the class.
Dataset with non-overlapping concepts (V2): This dataset is designed to be even simpler, with 3 classes composed of 2 shapes each which are mutually exclusive with the other classes, given as Class 0: cube, sphere; Class 1: cylinder, cone; Class 2: torus, icosphere. For each class, we expect that two dis-
tinct prototypes corresponding to each shape should
be learnt. For ProtoPNet, we observe that both pro-
totypes for each class are similar (redundant) which
is relevant to the classification task as there is no in-
centive to learn the other shape; however, for prac-
tical use-cases we expect it to learn distinct and di-
verse prototypes. With a limited number of concepts, background prototypes are no longer observed, though
the learnt prototypes often focus on mixed patches
and are not semantically human-understandable. In
ProtoTree, the decision tree mostly follows a logical structure as expected by a human. However, some prototypes are still not human-understandable as they do not match the corresponding indicated parts, e.g. the prototype with partial edges of the cube somehow finds the sphere in the test image as the corresponding matching part. Due to the page limit, we do not provide these figures.
Ideally, recognising a class based on interpretable
prototypes as prior evidence would help us make
more informed classification decisions particularly
for safety-critical use-cases.
4.4 Application for Real-World Tasks
Taking a cue from the above-mentioned properties, where we ideally assume that prototypes are human-interpretable and semantically disentangled, one potential real-world application could be to distinguish OOD samples from samples belonging to in-distribution (ID) classes.
It is assumed that OOD samples would have very dif-
ferent prototypes as compared to ID. A simple ap-
proach would be to distinguish test ID and OOD sam-
ples based on their L2 distances to the nearest proto-
types. We train a ProtoPNet on 150 (out of 200) bird
classes from the CUB-200 dataset as ID training data.
The remaining 50 bird classes serve as ‘Near OOD’
data. As the prototypes from this OOD data are still
from the birds dataset, they are expected to be seman-
tically similar to ID prototypes. SVHN (Netzer et al.,
2011), a dataset consisting of house numbers, is taken as ‘Far OOD’ data.
Figure 7: Histogram showing the distribution of L2 distances to the closest prototypes for Near vs. Far OOD samples, for a model trained on the first 150 classes of the CUB-200 dataset.
The Area Under ROC curve (AUROC) provides an evaluation metric for OOD de-
tection and the results are shown in Figure 7. We
observe a lot of OOD samples having closely overlapping L2 distances in the ‘Near OOD’ setting, which is consistent with the fact that the OOD prototypes are very similar to the ID prototypes; hence, this setting performs poorly in terms of AUROC (69.1%). In the ‘Far OOD’ setting, SVHN samples are distinctly separated in terms of L2 distances from the training prototypes, which leads to an
AUROC of 95.8% in terms of OOD detection. We
remark that this approach might not be fully representative of the separation of ID and OOD pro-
totypes in the feature space.
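For completeness, the distance-based OOD scoring described above can be sketched as follows; the feature maps and prototypes are assumed to be available as tensors, and the function names are ours.

```python
import torch
from sklearn.metrics import roc_auc_score

def min_prototype_distance(z, prototypes):
    """Smallest squared L2 distance between any patch of a feature map z (C, H, W)
    and any learnt prototype (m, C, h, w); larger values indicate more OOD-like samples."""
    m, C, h, w = prototypes.shape
    patches = z.unsqueeze(0).unfold(2, h, 1).unfold(3, w, 1)
    patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(-1, C * h * w)
    d2 = torch.cdist(patches, prototypes.reshape(m, -1)) ** 2
    return d2.min().item()

def ood_auroc(id_feature_maps, ood_feature_maps, prototypes):
    """AUROC of separating ID from OOD samples by their distance to the nearest prototype."""
    scores = ([min_prototype_distance(z, prototypes) for z in id_feature_maps]
              + [min_prototype_distance(z, prototypes) for z in ood_feature_maps])
    labels = [0] * len(id_feature_maps) + [1] * len(ood_feature_maps)
    return roc_auc_score(labels, scores)
```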
5 CONCLUSIONS AND FUTURE WORK
In this work, we have assessed the interpretability
of the prototypes learnt from various prototype-based
IBD methods in terms of visual relevance to humans.
To that end, we first defined a set of desired prop-
erties of the prototypes as a basis for our analysis
of three different approaches: ProtoPNet, ProtoTree
and Prototypical Relevance Propagation (PRP). We
found ProtoPNet generates somewhat relevant proto-
types but suffers from a lot of redundancy and a lack
of semantically distinct prototypes. ProtoTree pro-
duces semantically diverse prototypes which are less
redundant but mostly not relevant. PRP addresses the
imprecise upsampling of ProtoPNet but does not con-
clusively contribute to better interpretability. Overall,
standalone prototypes individually (without matching
location context) are mostly not human-interpretable
and there is still a long way to go.
Potential future work should focus on improving the quality of the learnt prototypes in terms of valuable human-understandable interpretations, as well as on exploring techniques to diversify the prototypes to avoid redundancy. A potential next step, which would improve both the quality of explanations and the trustworthiness of high-stakes decisions, would be to utilise human feedback during the learning phase to identify useful prototypes. This could also
strengthen the need to demonstrate and validate which
properties are actually required for interpretability
and for effective internal assessment of models. As
observed already, OOD detection could benefit largely from interpretable prototypes, which calls for finding better techniques, particularly in the ‘Near OOD’ regime.
REFERENCES
Adadi, A. and Berrada, M. (2018). Peeking Inside the
Black-Box: A Survey on Explainable Artificial Intel-
ligence (XAI). IEEE Access, 6:52138–52160.
Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller,
K.-R., and Samek, W. (2015). On Pixel-Wise Ex-
planations for Non-Linear Classifier Decisions by
Layer-Wise Relevance Propagation. PLOS ONE,
10(7):e0130140.
Chen, C., Li, O., Tao, D., Barnett, A., Rudin, C., and Su,
J. K. (2019). This Looks Like That: Deep Learning for
Interpretable Image Recognition. In Proc. NeurIPS,
volume 32. Curran Associates, Inc.
Fang, Z., Kuang, K., Lin, Y., Wu, F., and Yao, Y.-F. (2020).
Concept-based Explanation for Fine-grained Images
and Its Application in Infectious Keratitis Classifica-
tion. In Proc. ACM Multimedia. Association for Com-
puting Machinery.
Gautam, S., Höhne, M. M.-C., Hansen, S., Jenssen, R.,
and Kampffmeyer, M. (2021). This looks more like
that: Enhancing Self-Explaining Models by Prototyp-
ical Relevance Propagation. arXiv:2108.12204.
Hendrycks, D., Mazeika, M., Kadavath, S., and Song, D.
(2019). Using self-supervised learning can improve
model robustness and uncertainty. arXiv preprint
arXiv:1906.12340.
Johnson, J., Hariharan, B., van der Maaten, L., Fei-Fei, L.,
Lawrence Zitnick, C., and Girshick, R. (2017). Clevr:
A diagnostic dataset for compositional language and
elementary visual reasoning. In Proceedings of the
IEEE Conference on Computer Vision and Pattern
Recognition (CVPR).
Koh, P. W., Nguyen, T., Tang, Y. S., Mussmann, S., Pierson,
E., Kim, B., and Liang, P. (2020). Concept Bottleneck
Models. arXiv:2007.04612.
Li, O., Liu, H., Chen, C., and Rudin, C. (2018). Deep Learn-
ing for Case-Based Reasoning Through Prototypes: A
Neural Network That Explains Its Predictions. Proc.
AAAI, 32(1).
Nauta, M., van Bree, R., and Seifert, C. (2021). Neural
Prototype Trees for Interpretable Fine-grained Image
Recognition. In Proc. CVPR, pages 14928–14938,
Nashville, TN, USA. IEEE.
Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., and
Ng, A. Y. (2011). Reading digits in natural images
with unsupervised feature learning.
Nguyen, A., Yosinski, J., and Clune, J. (2019). Under-
standing Neural Networks via Feature Visualization:
A Survey. In Explainable AI: Interpreting, Explain-
ing and Visualizing Deep Learning, Lecture Notes
in Computer Science, pages 55–76. Springer Interna-
tional Publishing, Cham.
Páez, A. (2019). The pragmatic turn in explainable artificial
intelligence (XAI). Minds and Machines, 29(3):441–
459.
Rudin, C. (2019). Stop explaining black box machine learn-
ing models for high stakes decisions and use inter-
pretable models instead. Nature Machine Intelligence,
1(5):206–215.
Shrikumar, A., Greenside, P., and Kundaje, A. (2017).
Learning important features through propagating ac-
tivation differences. In Precup, D. and Teh, Y. W.,
editors, Proceedings of the 34th International Con-
ference on Machine Learning, volume 70 of Pro-
ceedings of Machine Learning Research, pages 3145–
3153. PMLR.
Stammer, W., Memmel, M., Schramowski, P., and Kerst-
ing, K. (2022). Interactive Disentanglement: Learn-
ing Concepts by Interacting with their Prototype Rep-
resentations. arXiv:2112.02290.
Sundararajan, M., Taly, A., and Yan, Q. (2017). Axiomatic
Attribution for Deep Networks. In Proc. ICML, pages
3319–3328. PMLR.
Tjoa, E. and Guan, C. (2020). A Survey on Explainable
Artificial Intelligence (XAI): Towards Medical XAI.
arXiv:1907.07374.
Wah, C., Branson, S., Welinder, P., Perona, P., and Be-
longie, S. (2011). The caltech-ucsd birds-200-2011
dataset.
Zhang, Q., Wu, Y. N., and Zhu, S.-C. (2018). Interpretable
Convolutional Neural Networks. In Proc. CVPR,
pages 8827–8836.
Zhang, Q.-s. and Zhu, S.-c. (2018). Visual interpretability
for deep learning: A survey. Frontiers Inf Technol
Electronic Eng, 19(1):27–39.
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., and Tor-
ralba, A. (2016). Learning Deep Features for Dis-
criminative Localization. In Proc. CVPR, pages 2921–
2929.