aggregation. Future research directions lie in studying
the impact of different attack strategies on defense
mechanisms and finding a balance among resource
optimization, privacy protection, and defense
effectiveness (Wang et al., 2022).
Privacy leakage risks exist at various stages of
federated learning, such as parameter exchange
during training, unreliable participants, and model
release after training. For example, attack methods such as data
reconstruction from gradients or inferring the source
of records based on intermediate parameters have
been proven feasible (Hu, Liu, & Han, 2019; Song,
Ristenpart, & Shmatikov, 2017). Unlike
traditional centralized learning, federated learning
faces more complex internal attacks, which greatly
increase the difficulty of its privacy protection.
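As a minimal illustration of why shared gradients can leak data (a toy sketch, not the specific attacks in the cited works): for a linear model with squared loss, the per-example gradient with respect to the weights is proportional to the input itself, so a server observing one example's gradient can recover that input exactly. All names and values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# A client's private record (hypothetical feature vector and label).
x_private = rng.normal(size=5)
y_private = 1.0

# Per-example gradient of squared loss for a linear model w.x + b,
# as it might be sent to the server in one training round.
w = np.zeros(5)
b = 0.0
residual = (w @ x_private + b) - y_private
grad_w = 2 * residual * x_private   # proportional to the private input
grad_b = 2 * residual

# The server reconstructs the input exactly: grad_w / grad_b == x_private.
x_recovered = grad_w / grad_b
print(np.allclose(x_recovered, x_private))  # True
```

Real attacks on deep networks (e.g. optimization-based gradient inversion) are more involved, but this linear case shows the underlying leakage channel.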
In a study of the privacy-protection problem in
federated learning, Liu (2021) found that internal
attackers include both the terminals participating in
model training and the central server. Compared with
external attackers, internal attackers hold more
training information and therefore have stronger
attack capabilities (Liu, 2021).
At present, research on privacy protection in
federated learning is expanding. Scholars have put
forward three main solutions to the privacy-protection
issue in federated learning: secure multi-party
computation, differential privacy, and homomorphic
encryption. This paper focuses on the applications of
differential privacy and homomorphic encryption in
safeguarding privacy for federated learning and
summarizes the most recent research advances in
these technologies. In addition, this review covers
how federated-learning privacy-protection methods
grounded in differential privacy and homomorphic
encryption are applied in the medical field, and
discusses the future challenges and development of
federated-learning privacy-protection technologies.
2 THE CONCEPT OF
FEDERATED LEARNING
Traditional machine learning methods gather the data
of all clients for centralized training. However, as
concerns about data privacy and data security have
grown, centralizing clients' original data is considered
unsafe. To address these problems, a new type of machine learning
method called federated learning, which protects
client data, has been proposed.
Federated learning is a distributed machine
learning technology. Its core feature is that during the
process of training a model, the original data of
participants always remains local, and collaborative
training is achieved only by exchanging model-
related intermediate data (such as model updates and
gradients) with the central server.
This is in sharp contrast to the "model remains
stationary, data moves" mode of traditional
centralized learning, and it is a new learning paradigm
of "data remains stationary, the model moves" (Liang,
2022). Its purpose is to break down data silos,
enabling all parties to exploit the knowledge
contained in multi-party data without exposing their
own private data, thereby enhancing model
performance and maximizing the value extracted
from the data. For
example, in the medical and financial fields, different
institutions can jointly train models to improve
diagnostic accuracy or risk assessment capabilities
while protecting the privacy data of patients or clients.
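The train-locally-then-aggregate exchange described above can be sketched as a toy federated-averaging loop (a simplification in the spirit of FedAvg; all function names, data, and hyperparameters here are illustrative, not from the source):

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local gradient steps for linear regression.

    Raw data (X, y) never leaves the client; only the updated
    model parameters are returned to the server.
    """
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_round(global_w, client_data):
    """Server-side aggregation: average client models weighted by sample count."""
    updates, sizes = [], []
    for X, y in client_data:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

# Three hypothetical clients whose data share the same underlying model.
rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(40, 2))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=40)))

w = np.zeros(2)
for _ in range(30):
    w = federated_round(w, clients)
print(np.round(w, 2))  # converges near true_w
```

Note that the (unprotected) parameters exchanged here are exactly the intermediate data whose leakage risks motivate the privacy techniques surveyed in this paper.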
Based on the distribution disparities in the feature
space and sample space of the participants' datasets,
federated learning can be categorized into horizontal
federated learning, vertical federated learning, and
federated transfer learning. Horizontal federated
learning is suitable for scenarios where the parties'
datasets overlap substantially in feature space but
only slightly in sample space. It usually involves joint training of
data with similar features from different users. For
example, numerous Android phone users, under the
coordination of a cloud server, train a shared global
input method prediction model based on their local
data, making use of the data diversity of different
users in the same feature dimension to improve the
adaptability of the model to different users' input
habits and prediction accuracy.
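The "same features, different users" partition can be shown concretely (shapes and values below are purely illustrative):

```python
import numpy as np

# A hypothetical full table: 6 users x 4 shared features
# (e.g. input-method usage statistics with an identical schema on every phone).
features = ["f1", "f2", "f3", "f4"]
data = np.arange(24).reshape(6, 4)

# Horizontal FL: each party holds *different users* but the *same* feature schema,
# so their local datasets are row-wise slices of the same logical table.
party_a = data[:3]   # users 0-2, all 4 features
party_b = data[3:]   # users 3-5, all 4 features

assert party_a.shape[1] == party_b.shape[1] == len(features)  # shared feature space
print(party_a.shape, party_b.shape)  # (3, 4) (3, 4)
```

Because the feature spaces coincide, the parties can train structurally identical local models and average them directly, as in the aggregation loop above.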
Vertical federated learning is suited to scenarios
where the sample spaces of the parties' datasets
overlap substantially while the feature spaces overlap
little. It generally
involves the joint use of data generated by the same
batch of users in different institutions or business
scenarios. For example, a bank holds users' income
and expenditure records, while an e-commerce
platform possesses users' consumption and browsing
records. The two parties conduct joint training based
on the data of common users but different features to
build a more accurate model for tasks such as
customer credit rating, achieving cross-industry data
integration and collaborative modeling.
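A minimal sketch of the vertical setting, assuming hypothetical bank and e-commerce records keyed by user ID (real systems would align the common users with private set intersection rather than in the clear, as done here for illustration only):

```python
# Vertical FL: two parties hold *different features* for an overlapping user set.
bank = {  # user_id -> income/expenditure features (hypothetical)
    "u1": [5200.0, 1300.0],
    "u2": [4100.0, 900.0],
    "u4": [7800.0, 2500.0],
}
shop = {  # user_id -> consumption/browsing features (hypothetical)
    "u2": [17, 240.0],
    "u3": [5, 60.0],
    "u4": [31, 410.0],
}

# Step 1: align on the shared sample space (the common users).
common = sorted(bank.keys() & shop.keys())
# Step 2: each aligned user now has a concatenated cross-party feature vector.
joint = {u: bank[u] + shop[u] for u in common}

print(common)       # ['u2', 'u4']
print(joint["u2"])  # [4100.0, 900.0, 17, 240.0]
```

In practice neither party reveals its raw features to the other; the concatenation happens only logically, with each party computing its share of the model on its own columns.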
Federated transfer learning mainly focuses on
datasets with little overlap in both the sample space
and the feature space. It uses transfer learning