Quantum Federated Learning for Image Classification
Leo Sünkel, Philipp Altmann, Michael Kölle and Thomas Gabor
Institute for Informatics, LMU Munich, Germany
Keywords:
Federated Learning, Quantum Machine Learning, Distributed Learning, Quantum Networks.
Abstract:
Federated learning is a technique in classical machine learning in which a global model is collectively trained
by a number of independent clients, each with their own datasets. Using this learning method, clients are
not required to reveal their dataset as it remains local; clients may only exchange parameters with each other.
As the interest in quantum computing and especially quantum machine learning is steadily increasing, more
concepts and approaches based on classical machine learning principles are being applied to the respective
counterparts in the quantum domain. Thus, the idea behind federated learning has been transferred to the
quantum realm in recent years. In this paper, we evaluate a straightforward approach to quantum federated
learning using the widely used MNIST dataset. In this approach, we replace a classical neural network with
a variational quantum circuit, i.e., the global model as well as the clients are trainable quantum circuits. We
run three different experiments which differ in the number of clients and data-subsets used. Our results demonstrate that basic principles of federated learning can be applied to the quantum domain while still achieving acceptable results. However, they also illustrate that further research is required for scenarios with an increasing number of clients.
1 INTRODUCTION
Privacy in the context of machine learning models has
become a greater concern in recent years and one ap-
proach to enhance the privacy of user data is federated
learning (McMahan et al., 2017). Using this tech-
nique, a global model (e.g., a neural network) is col-
lectively trained by a number of client models. Clients
do not reveal their data; instead they synchronize and
train the global model by other means (for example by
aggregating parameters or weights) while their indi-
vidual dataset is kept local, i.e., private. The concept
of federated learning is not new in classical machine learning; the basic concept has been thoroughly discussed (Konečný et al., 2016; McMahan et al., 2017; Zhao et al., 2018; Yang et al., 2019), and a range of problems, challenges and potential solutions have been identified and proposed (Geyer et al., 2017; Li et al., 2020; Mammen, 2021; Lyu et al., 2020). While the field is
still advancing in the classical domain, interest in ex-
tending these ideas into the quantum realm has been
steadily growing in recent years (Chen and Yoo, 2021;
Li et al., 2021; Chehimi and Saad, 2022; Kumar et al.,
2023). This gives rise to several possible architectures and approaches, each with its own set of
challenges, problems and potential advantages. For
instance, clients could still communicate via classi-
cal networks, however, it would also be feasible to
incorporate the methodology into a quantum commu-
nication network (Chehimi et al., 2023), which could
provide certain benefits such as secure quantum com-
munication channels. However, the specifics require
further investigation as both quantum federated learn-
ing and quantum communication networks are still in
their infancy.
Whether quantum federated learning is able to
provide advantages besides privacy or security is an-
other open question demanding more research. For
instance, how does the approach effect the trainabil-
ity of a quantum circuit, or how do both approaches
compare in terms of their ability to generalize to un-
seen data? These questions are already relevant and
subject to research for traditional training and learn-
ing approaches, so investigating these questions in a
federated learning setting will most likely also be im-
portant.
In this paper, we evaluate a simple quantum fed-
erated learning approach on an image classification
problem, namely the task of recognizing images of
digits from the MNIST dataset. We divide the dataset
such that each subset contains images for two classes.
A global model is trained in a federated manner where
each client is a variational quantum circuit used for
binary classification. We compare this approach to a
regular variational quantum circuit trained in a tradi-
tional, i.e., non-federated, non-distributed manner.
This paper is structured as follows. In Section
2 we discuss the background of quantum machine
learning and (quantum) federated learning. Related
work is discussed in Section 3 while we present our
experimental setup in Section 4. Results are presented
in Section 5. We conclude and give an outlook for fu-
ture work in Section 6.
2 BACKGROUND
In this section, we briefly recap the fundamentals of
quantum machine learning (QML) and discuss feder-
ated learning (FL) as well as quantum federated learn-
ing (QFL).
2.1 Quantum Machine Learning
Over the years several QML algorithms have been
proposed by the research community; however, the
so-called variational quantum algorithm (VQA) ap-
proach based on variational quantum circuits (VQCs)
(Mitarai et al., 2018; Schuld and Killoran, 2019) ap-
pears to be the most popular and relevant one in the
current NISQ-era of quantum computing (QC), and
this approach is the one we employ as part of this
work. Thus, when we use the term QML we refer
to this approach.
A VQC is a parameterized quantum circuit con-
sisting of the following parts: (i) a feature map, (ii) entanglement and rotation layers, and (iii) measurement. In
(i) the classical data, i.e., features, are encoded into a
quantum state through the use of rotation gates where
a feature corresponds to the angle of rotation. This
is followed by a series of repeating layers consist-
ing of parameterized rotations and entangling gates
(e.g., CNOT gates), where the parameters of the rota-
tion gates are the weights to be optimized by a classi-
cal optimization algorithm. In the last step a number
of qubits are measured, resulting in classical values
which can in turn be used to derive the prediction for
a classification task. An example VQC with 1 layer is
depicted in Figure 3. The design of the circuit architecture is a research question in its own right; in this paper, we consider circuits of the basic architecture described above.
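To make this structure concrete, the following minimal sketch outlines such a circuit in PennyLane (the framework we later use for our experiments); the qubit count, the angle-encoding feature map and the gate choices are illustrative and do not reproduce the exact circuit of Figure 3:

```python
import pennylane as qml
import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def vqc(features, weights):
    # (i) feature map: encode classical features as rotation angles
    for i in range(n_qubits):
        qml.RY(features[i], wires=i)
    # (ii) repeating layers of parameterized rotations and entangling CNOTs
    for layer in weights:  # weights has shape (depth, n_qubits)
        for i in range(n_qubits):
            qml.RY(layer[i], wires=i)
        for i in range(n_qubits - 1):
            qml.CNOT(wires=[i, i + 1])
    # (iii) measurement: expectation values of selected qubits
    return [qml.expval(qml.PauliZ(q)) for q in (0, 1)]

features = np.random.uniform(0, np.pi, n_qubits)
weights = np.random.uniform(0, 2 * np.pi, (3, n_qubits))  # depth 3
print(vqc(features, weights))
```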
The optimization algorithm runs on a classical
computer, making this a hybrid iterative approach.
The circuit is initialized with features and weights,
executes on a quantum computer and returns some
measurement results that are interpreted in order to
establish a prediction, which can then be fed into the
optimizer resulting in a set of new updated weights.
The process continues until the preset number of iterations has been reached or some other termination criterion is met. For a more in-depth discussion of this
topic we refer to (Mitarai et al., 2018) and (Schuld
et al., 2020).
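The sketch below illustrates this hybrid loop for a toy two-qubit circuit using PennyLane's built-in gradient-descent optimizer; the squared-error cost and all parameter values are illustrative placeholders rather than the settings used in our experiments:

```python
import pennylane as qml
from pennylane import numpy as pnp

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(features, weights):
    # feature map
    qml.RY(features[0], wires=0)
    qml.RY(features[1], wires=1)
    # one variational layer
    qml.RY(weights[0], wires=0)
    qml.RY(weights[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0))

def cost(weights, features, label):
    # map the expectation value from [-1, 1] to a value in [0, 1]
    pred = (circuit(features, weights) + 1) / 2
    return (pred - label) ** 2  # illustrative squared-error loss

opt = qml.GradientDescentOptimizer(stepsize=0.1)
weights = pnp.array([0.1, 0.2], requires_grad=True)
features = pnp.array([0.5, 1.2], requires_grad=False)

# the classical optimizer updates the weights for a preset number of iterations
for _ in range(50):
    weights = opt.step(lambda w: cost(w, features, label=1.0), weights)
```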
2.2 (Quantum) Federated Learning
We will summarize the main idea behind FL in this
section and refer to (Yang et al., 2019) for a more in-depth discussion. After establishing FL in the clas-
sical setting, we discuss how to transfer these con-
cepts to the QC domain.
In a FL setting, a global model is collectively
trained by a number of client-models that have some
means of communication, for example over a com-
munication network. The clients themselves may be trained
on different data-subsets or even entirely different
datasets. The data, however, is kept private, i.e., lo-
cal; the clients only exchange parameters or gradients
with the global model. Note that the details of the
exchange depend on the implementation; various techniques and approaches exist in the literature.
One of the main advantages, and a key motivation, is the pos-
sibility of collaboratively training a shared model by
multiple, potentially unknown clients while still keep-
ing sensitive data private.
The main approach is as follows. The global
model distributes its parameters (i.e., weights) to each
client. Then a client trains its model on its local
dataset for a number of epochs. After a defined num-
ber of epochs, clients send their weights to the global
model which then aggregates all weights and updates
its own weights. The global model then distributes
the updated weights among the clients and the process
repeats until the maximal number of epochs has been
reached. The overall architecture of FL is illustrated
in Figure 1.
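The aggregation step itself can be summarized in a few lines. The sketch below shows one federated round with simple (unweighted) parameter averaging; the clients and their train_locally method are placeholders for an arbitrary local training procedure:

```python
import numpy as np

def federated_round(global_weights, clients, local_epochs):
    """One round of federated training with simple (unweighted) averaging."""
    client_weights = []
    for client in clients:
        # each client starts from the current global weights ...
        local = np.copy(global_weights)
        # ... trains on its private, local dataset (placeholder method) ...
        local = client.train_locally(local, epochs=local_epochs)
        # ... and only the resulting weights leave the client
        client_weights.append(local)
    # the global model aggregates by averaging and redistributes the result
    return np.mean(client_weights, axis=0)
```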
A straightforward approach to transfer these ideas
to the quantum realm would be to use a VQC as the
global model and a number of VQCs as client mod-
els. This is similar to the approach in (Chen and Yoo, 2021); however, they use a hybrid model. These
VQCs may or may not have the same architecture.
The models' weights can be exchanged over classical
communication channels. However, quantum com-
munication networks allow the transfer of quantum
states and could also be incorporated into the ap-
proach, yielding further possible advantages such as
secure quantum communication channels. Further ex-
tension to include blind quantum computing is an-
other possible pathway to ensure privacy, as discussed
by (Li et al., 2021).
Figure 1: Overview of an example FL architecture. In this example, 5 clients train their models on their local datasets and send their updated weights to the global model. The global model aggregates all collected weights and updates itself; the updated model is then distributed among the clients for the next iteration of training.
3 RELATED WORK
A federated QML approach based on a hybrid quantum model is discussed in (Chen and Yoo, 2021).
The authors evaluate their approach on binary clas-
sification tasks and show that their approach yields
similar results to regular training. Incorporating
blind quantum computing is discussed in (Li et al.,
2021). They show, among other things, the training of
a VQC using blind quantum computing in the context
of FL. Privacy in QFL is discussed in (Kumar et al.,
2023) and (Rofougaran et al., 2023) while (Chehimi
and Saad, 2022) propose a quantum federated learn-
ing framework. In (Wang et al., 2023) the authors
discuss quantum federated learning over quantum net-
works, where the weights are communicated via tele-
portation. They evaluate their approach on a binary
classification task. An overview of challenges and op-
portunities is given in (Chehimi et al., 2023).
4 EXPERIMENTAL SETUP
We discuss our approach to QFL and experimental
setup in this section. In our experiments, the global
model as well as all clients are VQCs, each with the
same circuit design. However, the specific qubits
measured vary, as discussed in detail in the follow-
ing sections. Each VQC is its own model with its own
optimizer and set of parameters. Each client is trained
on its own data subset; we will discuss this point in
more detail in Section 5.
4.1 Approach
Figure 2: Example QFL architecture.
Our approach revolves around the clients training their models on their own datasets and updating their local parameters independently of each other. Every
n epochs, the parameters of every client are aggre-
gated and used to update the global model. The updated
parameters are determined by calculating the mean
over all clients. The parameters of the updated global
model are then used to also update the clients' param-
eters, i.e., all clients are synchronized with the global
model. The overall process for the experiment with 5
clients and subsets is depicted in Figure 2 while the
VQC architecture employed is shown in Figure 3 and
is discussed below.
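A possible realization of this synchronization step is sketched below, assuming each client model is a PyTorch module holding its circuit weights (e.g., a VQC wrapped as a Torch layer); the function and attribute names are illustrative, not the exact implementation used in our experiments:

```python
import torch

def synchronize(global_model, client_models):
    """Average the clients' parameters into the global model and copy them back."""
    with torch.no_grad():
        for name, global_param in global_model.named_parameters():
            # mean over the corresponding parameter of every client
            stacked = torch.stack([dict(c.named_parameters())[name].data
                                   for c in client_models])
            global_param.data.copy_(stacked.mean(dim=0))
        # synchronization: every client receives the updated global parameters
        global_state = dict(global_model.named_parameters())
        for c in client_models:
            for name, p in c.named_parameters():
                p.data.copy_(global_state[name].data)
```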
Note that we only exchange model parameters, and a classical communication channel is sufficient for this. In fact, a network is not necessary at all; the approach can be executed entirely locally,
as is done in our experiments. Incorporating the ap-
proach into a quantum network is out of scope and is
a potential avenue for future work.
We evaluate our approach for a classification task
using the MNIST dataset, which contains images of
hand-written digits. Each client is given a subset of
this dataset containing the images for two digits and
each client is given a distinct dataset. In our experi-
ments, we train several clients where the first one uses
digits 0 and 1, the second one 2 and 3 and so forth.
Figure 3: Example circuit architecture employed in our ex-
periments with depth 1. Note that in our experiments we
used circuits with a higher depth but with the same overall architecture pattern.
Note that each VQC is used for binary classification
while the global model is evaluated on all classes after
training has completed.
4.2 Variational Quantum Circuit
All models in our experiments are VQCs with the
same architecture. All features are encoded using am-
plitude embedding; this is followed by a simple ansatz
consisting of repeating layers of CNOT and parame-
terized rotation gates. We measure two qubits in the
binary classification case. In each VQC, the digits
it classifies also correspond to the index of the qubit
that is measured. That is, the VQC trained on the sub-
set consisting of images of digits 0 and 1 measures
qubits 0 and 1 while the VQC trained on digits 2 and
3 measures qubits 2 and 3. The measurement results
are interpreted as class probabilities and used for la-
bel prediction. The number of layers is determined
by the depth parameter; the parameters used in our
experiments are discussed in the next section.
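The following PennyLane sketch illustrates this circuit structure, including the amplitude embedding and the per-client choice of measured qubits; the exact gate ordering within a layer is an assumption, and the input image is a random stand-in:

```python
import pennylane as qml
import numpy as np

n_qubits, depth = 10, 15
dev = qml.device("default.qubit", wires=n_qubits)

def make_client_circuit(class_qubits):
    """Client circuit measuring the qubits whose indices match its two digits."""
    @qml.qnode(dev)
    def circuit(features, weights):
        # amplitude embedding of the 784 pixel values (padded and normalized)
        qml.AmplitudeEmbedding(features, wires=range(n_qubits),
                               pad_with=0.0, normalize=True)
        # repeating layers of entangling CNOTs and one RY rotation per qubit
        for layer in range(depth):
            for i in range(n_qubits - 1):
                qml.CNOT(wires=[i, i + 1])
            for i in range(n_qubits):
                qml.RY(weights[layer, i], wires=i)
        # e.g., the client for digits 2 and 3 measures qubits 2 and 3
        return [qml.expval(qml.PauliZ(q)) for q in class_qubits]
    return circuit

circuit_23 = make_client_circuit(class_qubits=(2, 3))
weights = np.random.uniform(0, 2 * np.pi, (depth, n_qubits))  # 150 parameters
image = np.random.rand(784)  # stand-in for a flattened 28x28 MNIST image
print(circuit_23(image, weights))
```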
4.3 Data and Experiments
We briefly review the configuration used in our exper-
iments while an overview of the parameters is given
in Table 1. We used the MNIST dataset in our train-
ing, which contains images of size 28×28, resulting in
784 features for each image. With amplitude embed-
ding, a circuit with 10 qubits is sufficient to embed all
features. Note that we do not use all training data in
our experiments; instead, roughly 1000 samples were used per epoch for each model. PyTorch and Penny-
Lane (Bergholm et al., 2018) were used to implement
our experiments. Using a circuit with 10 qubits and a
depth of 15 results in 150 trainable parameters in our
approach, as we use a single Ry rotation per qubit in
each layer. We evaluate our approach in three differ-
ent experiments. In the first, a global model is trained
by two clients with their own respective data subset
for binary classification. For instance, one client is
trained on a data subset consisting of images depict-
ing the digits 0 and 1 while the other client's subset contains digits 2 and 3; thus the global model is
trained on four classes in total. In the second experi-
ment we use three clients, the additional client is then
trained on digits 4 and 5 while in the third experiment
we use all 10 digits and train 5 clients.
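For illustration, the per-client data-subsets can be constructed as sketched below; the use of torchvision and the way roughly 1000 images per subset are sampled are assumptions about the data pipeline, not a description of our exact preprocessing:

```python
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

# torchvision's MNIST loader is assumed here; any MNIST source works equally well
mnist = datasets.MNIST(root="data", train=True, download=True,
                       transform=transforms.ToTensor())

def client_subset(digits, n_samples=1000, batch_size=40):
    """Loader over roughly n_samples images showing only the two given digits."""
    mask = torch.isin(mnist.targets, torch.tensor(digits))
    indices = torch.nonzero(mask).flatten()[:n_samples]
    return DataLoader(Subset(mnist, indices.tolist()),
                      batch_size=batch_size, shuffle=True)

# clients for the 5-client experiment: digits (0, 1), (2, 3), ..., (8, 9)
loaders = [client_subset((2 * k, 2 * k + 1)) for k in range(5)]
```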
5 RESULTS
In this section, we discuss the results of the experiments conducted as part of this work. We first present the results from the baseline model, then continue with the QFL approach, and conclude with a comparison of both.
Table 1: Experiment configuration.
Parameter                              Value
Qubits                                 10
Depth                                  15
Parameters                             150
Epochs                                 30
Batch size                             40
Models synchronized every n epochs     3
Seeds                                  4
Clients                                2, 3 or 5
5.1 Baseline
As a baseline, we use VQCs for binary classification,
each trained on a subset of the data. More specifically,
one baseline model is trained to classify the digits 0
and 1, another to classify the digits 2 and 3, and so forth. Note
that each model is trained individually, i.e., not in a
federated or distributed manner, and uses the same
hyper-parameters as in the QFL experiments. The
mean training accuracy for each of these models is de-
picted in Figure 4 and the loss is shown in Figure
5, both aggregated over all seeds. Test results of the
VQC trained on all classes are also discussed below.
5.2 Quantum Federated Learning
We discuss the results from our QFL experiments
next. However, first a note on how the results are
aggregated. Recall that client models are trained on
their subset of the training data, weights are aggre-
gated to form the global model which is subsequently
distributed to synchronize the clients. The training
results of the QFL approach depicted below are the mean of all client models, aggregated over all seeds.
Figure 4: Comparison of training accuracy of binary classification from the baseline model.
Figure 5: Comparison of training loss of binary classification from the baseline model.
However, we also depict and discuss the test results
from the global model.
Figure 6 shows the mean training accuracy
achieved with the QFL approach while the loss is
depicted in Figure 7. In both figures (i.e., accuracy
and loss plots) it can be seen that the accuracy and loss "jump" every few epochs. This might be due to the fact that every n epochs (parameter value is shown in Table 1) the global model is updated and distributed. Adjusting this parameter may improve the model's performance.
The mean training accuracy of the baseline and
global model for 4 digits is shown in Figure 8 and
for 6 digits is shown in Figure 9. Also note that here
the results of binary classification for each subset are
aggregated for both models.
In Figure 11 the test accuracy aggregated from the
performance on each subset from the global model is
shown. That is, the figure depicts the mean test accu-
racy for each binary classification task (i.e., 0 vs 1, 2
vs 3 and so forth) from the global model. In Figure
12 the test results of the global model on all classes are depicted.
Figure 6: Mean train accuracy (QFL).
Figure 7: Mean train loss (QFL).
Figure 8: Comparison of aggregated mean training accuracy between baseline and QFL for 4 digits.
Figure 9: Comparison of aggregated mean training accuracy between baseline and QFL for 6 digits.
Figure 10: Comparison of aggregated mean training accuracy between baseline and QFL for 10 digits.
Figure 11: Mean test accuracy over all subsets (global model).
5.3 Discussion
From these results one can see that while the baseline achieves the best performance, the QFL approach also seems to achieve acceptable results, especially in the experiment with 2 clients and data-subsets.
However, as more clients and data-subsets are added, the discrepancy increases (compare Figures 8 and 10). This
is also illustrated in Figures 12 and 13. Increasing the
number of epochs or adjusting other hyper-parameters
may improve the performance though. Also recall
that in all experiments we limited the number of train-
ing samples per epoch to roughly 1000; allowing more samples per epoch may also yield better results.
The main aim of this study was to evaluate the poten-
tial of QFL on a simple example with a straightfor-
ward approach by adopting methods from classical
FL. Incorporating more advanced methods from FL
or investigating new techniques suitable for the quan-
tum domain may be required to achieve superior per-
formance and results.
Figure 12: Test accuracy of global model on all classes.
6 CONCLUSION
In this paper, we discussed and evaluated a basic ap-
proach that transfers concepts from classical FL into
the domain of quantum computing. We applied an
approach in which the global model as well as the
clients are VQCs that transfer their parameters (i.e.,
weights) in a classical manner. We ran three exper-
iments in the domain of image classification on the
MNIST dataset. The nature of the experiments varied
in terms of the number of clients and data-subsets; that is,
we used 2, 3, and 5 clients in different experiments
to train a global model. Overall these experiments
yield acceptable results; however, as the number of clients increases (and thus the number of data-subsets),
the performance drops steadily. There could be nu-
merous reasons and countermeasures for this. For in-
stance, increasing the number of training epochs as
well as the training samples may increase the perfor-
mance. Using different circuits with more parameters
may also help, though more research is required in this direction.
Figure 13: Test accuracy of the baseline on all classes.
Incorporating QFML approaches that use quan-
tum communication or quantum networks is a further
important research direction. This is for instance dis-
cussed in (Chehimi et al., 2023) and (Wang et al.,
2023). As the aim in this paper was to evaluate a
straightforward approach, in the future more elaborate schemes, as well as different domains and use-cases in future communication networks, should be explored.
ACKNOWLEDGEMENTS
Sponsored in part by the Bavarian Ministry of Eco-
nomic Affairs, Regional Development and Energy as
part of the 6GQT project.
REFERENCES
Bergholm, V., Izaac, J., Schuld, M., Gogolin, C., Ahmed,
S., Ajith, V., Alam, M. S., Alonso-Linaje, G., Akash-
Narayanan, B., Asadi, A., et al. (2018). Pennylane:
Automatic differentiation of hybrid quantum-classical
computations. arXiv preprint arXiv:1811.04968.
Chehimi, M., Chen, S. Y.-C., Saad, W., Towsley, D., and
Debbah, M. (2023). Foundations of quantum feder-
ated learning over classical and quantum networks.
IEEE Network.
Chehimi, M. and Saad, W. (2022). Quantum federated
learning with quantum data. In ICASSP 2022-2022
IEEE International Conference on Acoustics, Speech
and Signal Processing (ICASSP), pages 8617–8621.
IEEE.
Chen, S. Y.-C. and Yoo, S. (2021). Federated quantum ma-
chine learning. Entropy, 23(4):460.
Geyer, R. C., Klein, T., and Nabi, M. (2017). Differentially
private federated learning: A client level perspective.
arXiv preprint arXiv:1712.07557.
Konečný, J., McMahan, H. B., Yu, F. X., Richtárik, P.,
Suresh, A. T., and Bacon, D. (2016). Federated learn-
ing: Strategies for improving communication effi-
ciency. arXiv preprint arXiv:1610.05492.
Kumar, N., Heredge, J., Li, C., Eloul, S., Sureshbabu,
S. H., and Pistoia, M. (2023). Expressive variational
quantum circuits provide inherent privacy in federated
learning. arXiv preprint arXiv:2309.13002.
Li, T., Sahu, A. K., Talwalkar, A., and Smith, V. (2020).
Federated learning: Challenges, methods, and fu-
ture directions. IEEE Signal Processing Magazine,
37(3):50–60.
Li, W., Lu, S., and Deng, D.-L. (2021). Quantum fed-
erated learning through blind quantum computing.
Science China Physics, Mechanics & Astronomy,
64(10):100312.
Lyu, L., Yu, H., and Yang, Q. (2020). Threats to federated
learning: A survey. arXiv preprint arXiv:2003.02133.
Mammen, P. M. (2021). Federated learning: Opportunities
and challenges. arXiv preprint arXiv:2101.05428.
McMahan, B., Moore, E., Ramage, D., Hampson, S., and
y Arcas, B. A. (2017). Communication-efficient learn-
ing of deep networks from decentralized data. In Ar-
tificial intelligence and statistics, pages 1273–1282.
PMLR.
Mitarai, K., Negoro, M., Kitagawa, M., and Fujii, K.
(2018). Quantum circuit learning. Physical Review
A, 98(3):032309.
Rofougaran, R., Yoo, S., Tseng, H.-H., and Chen, S. Y.-
C. (2023). Federated quantum machine learning with
differential privacy. arXiv preprint arXiv:2310.06973.
Schuld, M., Bocharov, A., Svore, K. M., and Wiebe, N.
(2020). Circuit-centric quantum classifiers. Physical
Review A, 101(3):032308.
Schuld, M. and Killoran, N. (2019). Quantum machine
learning in feature Hilbert spaces. Physical Review Letters, 122(4):040504.
Wang, T., Tseng, H.-H., and Yoo, S. (2023). Quantum
federated learning with quantum networks. arXiv
preprint arXiv:2310.15084.
Yang, Q., Liu, Y., Chen, T., and Tong, Y. (2019). Federated
machine learning: Concept and applications. ACM
Transactions on Intelligent Systems and Technology
(TIST), 10(2):1–19.
Zhao, Y., Li, M., Lai, L., Suda, N., Civin, D., and Chandra,
V. (2018). Federated learning with non-iid data. arXiv
preprint arXiv:1806.00582.