FedBID and FedDocs: A Dataset and System for Federated Document Analysis

Daniel Perazzo¹, Thiago de Souza¹, Pietro Masur¹, Eduardo de Amorim¹, Pedro de Oliveira¹,
Kelvin Cunha¹, Lucas Maggi¹, Francisco Simões¹,², Veronica Teichrieb¹ and Lucas Kirsten³

¹Voxar Labs, Centro de Informática, Universidade Federal de Pernambuco, Recife/PE, Brazil
²Visual Computing Lab, Departamento de Computação, Universidade Federal Rural de Pernambuco, Recife/PE, Brazil
³HP Inc., Porto Alegre/RS, Brazil

Keywords: Federated Learning, Document Analysis, Privacy, Dataset.
Abstract: Data privacy has recently become one of the main concerns for society and machine learning researchers. The question of privacy led to research in privacy-aware machine learning, and among many other techniques, one solution gaining ground is federated learning. In this machine learning paradigm, data never leaves the user's device: training happens locally, and only model updates are aggregated on a remote server. In this work, we present, to our knowledge, the first federated dataset for document classification: FedBID. To demonstrate how this dataset can be used to evaluate different techniques, we also developed FedDocs, a system for federated document classification. We describe the characteristics of our federated dataset, along with the different data distributions that can be created from it. Finally, we analyze our system, FedDocs, on our dataset, FedBID, in multiple scenarios: a federated setting with balanced categories, a federated setting with unbalanced classes, and a simulated siloed federated training. We demonstrate that FedBID can be used to analyze a federated learning algorithm, and we hope it enables further research in federated document classification. The dataset is available at https://github.com/voxarlabs/FedBID.
1 INTRODUCTION
The fast development of learning-based techniques, especially Deep Learning (DL), has inspired new applications in different areas of computing (Xu et al., 2019). These applications became more accurate and efficient through increasingly robust neural networks and access to more data. For example, numerous computer vision applications using deep learning techniques have emerged in recent years in fields such as robotics, autonomous driving, and assistive technologies (Sünderhauf et al., 2018).
Nevertheless, for distributed training with edge devices, deep learning has traditionally followed a centralized approach, in which third-party servers are responsible for the entire training process and all data is sent to and stored on the server (Huang et al., 2017). In this type of training, the user's data needs to be sent, stored, and processed on servers to train the neural network, which must be retrained continuously over time (Mayr et al., 2019). However, such centralized training is highly susceptible to security breaches, since all data is stored on a single server.
In this scenario, Federated Learning (FL) is a new paradigm to distribute neural network training with data privacy as a main requirement (McMahan et al., 2017). In this setting, part or all of the training process is carried out on remote devices, minimizing the exchange of sensitive data, since users' data is never sent over the network or accessed directly by an external agent. The models are updated over time, incorporating new information that reflects each user's data distribution.
Over the last few years, different strategies and
architectures have been proposed in FL for different
areas (Liu et al., 2020; Li et al., 2019; Zhu et al.,
2021). However, consolidated benchmarks and
datasets are lacking in the FL literature to validate
applications (Caldas et al., 2018). In most cases, datasets are either not publicly available, overly simple, or lack an adequate distribution to assess FL challenges (Caldas et al., 2018).
In this context, we present a new dataset that
contains a well-defined non-IID distribution. The BID (de Sá Soares et al., 2020) is a dataset for identity document classification, containing annotations for different types of Brazilian documents. Through analyzing BID, we could identify data patterns and classes that map to existing challenges in Federated Learning. In this way, we propose FedBID, a new federated document classification dataset containing 21,600 annotated examples drawn from a realistic non-IID distribution, to evaluate new challenges not mapped by previous datasets.
Among these challenges, FedBID supports test scenarios with a severe imbalance of examples across users, with variations in classes, number of samples, image quality, and annotation reliability. Furthermore, FedBID was designed to support various tasks, such as labeling the document type, orientation, source organization, and document owner. We also constructed a system, FedDocs, for federated document classification and evaluated its accuracy on our dataset.
As the main contributions, we highlight the following:
• FedBID, the first federated document classification dataset (Section 4);
• FedDocs, a system for federated learning for document classification (Section 3);
• An evaluation of our proposed system on FedBID under various federated learning configurations (Section 5).
2 RELATED WORKS
Federated learning is a new paradigm for decentralized
learning (McMahan et al., 2017) and a relatively
new research field, with many challenges and open
problems still present in the research community, as
shown in recent surveys (Kairouz et al., 2021; Li et al.,
2020). Below we list research related to our work.
Federated Learning Datasets. Due to the decentralized training procedure, federated learning datasets differ from traditional machine learning datasets, requiring data partitions across various devices. Many researchers have proposed and released datasets and evaluation benchmarks for federated learning. An essential requirement for these datasets is that they mimic non-IID distributions, since such distributions frequently occur in the real world and are still a challenge for federated learning algorithms (Shoham et al., 2019). A longstanding issue is that artificially partitioning datasets commonly used in machine learning (e.g., CIFAR, MNIST) can produce federated datasets that do not follow a realistic non-IID distribution.
Due to this issue, Caldas et al. (2018) created a benchmark comprising multiple federated datasets and metrics. For example, their federated version of MNIST (FEMNIST) is partitioned according to the writer of the digit, while their Sentiment140 dataset, for sentiment analysis, is partitioned based on the Twitter user. Furthermore, He et al. (2020) introduced FedML, a framework for federated learning that includes default datasets for users to get started. Finally, Koh et al. (2021) and Luo et al. (2019) created datasets focused on real-world and in-the-wild data.
Federated Learning Applications. Numerous applications have been proposed using federated learning. For example, in NLP, Google has used federated learning to improve query suggestions in the “Google Keyboard” while preserving user privacy (Yang et al., 2018). Other applications include emoji prediction (Ramaswamy et al., 2019), also by Google, and speech recognition (Paulik et al., 2021), by Apple. In computer vision, FedVision (Liu et al., 2020) introduced a platform for object detection using federated learning. Furthermore, one field that has been receiving attention in federated learning is healthcare, where patient data is very sensitive (Antunes et al., 2022). Finally, with the expansion of edge devices and the Internet of Things, federated learning is receiving increasing attention due to possible applications for training on edge devices (Kontar et al., 2021).
Document Analysis. Document analysis is a longstanding problem in computer vision with more than two decades of research (Liu et al., 2021). Although in past decades researchers focused on traditional computer vision techniques, such as image processing and pattern matching (Love et al., 2013), machine learning approaches have more recently shown promising results (Li et al., 2021). For example, Kang et al. (2014) interpreted document classification as an image classification problem and used Convolutional Neural Networks (CNNs) to perform the task. Aiming to stimulate progress in the area, researchers have created numerous datasets, competitions, and benchmarks for document analysis using data from many different countries (Burie et al., 2015; Bulatovich et al., 2022a; Chernyshova et al., 2021; Bulatovich et al., 2022b; Polevoy et al., 2022). In this context,
de Sá Soares et al. (2020) created a document classification dataset based on Brazilian personal documents. Document identification is a fascinating case study for federated learning, since privacy is a significant concern among users: malicious actors can use leaked identification details to perpetrate financial fraud, identity fraud, and many other crimes.
3 FedDocs
In this section, we explain our FedDocs system: we detail its architecture, how the clients are trained, and how the server aggregates the weights.
3.1 Architecture of the System
Our application simulates a remote Federated Learning (FL) architecture across different devices and networks. The module comprises two applications: a desktop client and a server app. Both are written in Python and run on the Docker virtualization engine. The Docker image contains the Flower library (Beutel et al., 2020), a general FL framework. Flower provides tools, data structures, and protocols to perform federated training across different devices, using the gRPC (Google Remote Procedure Call) framework (Marculescu, 2015).
Both applications (the client and server modules) are the primary components of our architecture and can be deployed on different devices in a real scenario. Moreover, the architecture provides tools to quickly adapt various applications or datasets for supervised image classification; in our case, more specifically, document image classification.
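To make the module structure concrete, the snippet below is a minimal sketch of how such a server app can be started with Flower. It assumes Flower's 1.x Python API (flwr); the address, client fraction, and round count are illustrative values, not the paper's configuration.

```python
# server.py -- minimal sketch of a Flower server app (Flower 1.x API assumed)
import flwr as fl

# FedAvg is the aggregation strategy used by FedDocs; the sampling
# fraction and round count below are illustrative, not the paper's values.
strategy = fl.server.strategy.FedAvg(
    fraction_fit=0.1,          # fraction of connected clients sampled per round
    min_available_clients=2,   # wait until at least this many clients connect
)

# Expose the gRPC endpoint that the dockerized desktop clients connect to.
fl.server.start_server(
    server_address="0.0.0.0:8080",
    config=fl.server.ServerConfig(num_rounds=200),
    strategy=strategy,
)
```

A client container would then connect to this endpoint with fl.client.start_numpy_client, passing a client implementation such as the one sketched in the next subsection.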
3.2 Client Training
The desktop client is responsible for locally training the model received from the server, using the locally available data, and sending the updated parameters back to the server. We perform this training similarly to the standard centralized setting, where each local data sample contributes to the learning progress in every epoch. The process finishes when it reaches a stopping criterion, such as a maximum number of local epochs or a minimum loss value. In this paper, we use the number of local epochs processed on each device with its local dataset as the stopping criterion.
After training is concluded on all clients, the trained model weights are sent back from the clients to the server to form the aggregated model. The server then combines the knowledge of each client's neural network model into one global neural network model.
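As an illustration of this local training loop, the sketch below implements a client on top of Flower's NumPyClient abstraction with PyTorch. The single local epoch and the optimizer settings mirror the configuration reported in Section 5.1, but the class name DocClient and all other implementation details are our own assumptions, not the authors' code.

```python
# client.py -- hedged sketch of a FedDocs-style client (Flower NumPyClient + PyTorch)
import flwr as fl
import torch


class DocClient(fl.client.NumPyClient):
    def __init__(self, model, train_loader, test_loader, device="cpu"):
        self.model = model.to(device)
        self.device = device
        self.train_loader = train_loader
        self.test_loader = test_loader

    def get_parameters(self, config):
        # Weights travel as a list of NumPy arrays; raw data never leaves the device.
        return [v.detach().cpu().numpy() for v in self.model.state_dict().values()]

    def set_parameters(self, parameters):
        keys = self.model.state_dict().keys()
        state = {k: torch.tensor(v) for k, v in zip(keys, parameters)}
        self.model.load_state_dict(state, strict=True)

    def fit(self, parameters, config):
        self.set_parameters(parameters)
        optimizer = torch.optim.Adam(self.model.parameters(), lr=0.0005)
        criterion = torch.nn.CrossEntropyLoss()
        self.model.train()
        for _ in range(1):  # E = 1 local epoch is the stopping criterion
            for images, labels in self.train_loader:
                images, labels = images.to(self.device), labels.to(self.device)
                optimizer.zero_grad()
                criterion(self.model(images), labels).backward()
                optimizer.step()
        # Return the updated weights and the local sample count used by FedAvg.
        return self.get_parameters(config), len(self.train_loader.dataset), {}

    def evaluate(self, parameters, config):
        self.set_parameters(parameters)
        self.model.eval()
        criterion = torch.nn.CrossEntropyLoss(reduction="sum")
        correct, total, loss_sum = 0, 0, 0.0
        with torch.no_grad():
            for images, labels in self.test_loader:
                images, labels = images.to(self.device), labels.to(self.device)
                logits = self.model(images)
                loss_sum += criterion(logits, labels).item()
                correct += (logits.argmax(dim=1) == labels).sum().item()
                total += labels.size(0)
        return loss_sum / total, total, {"accuracy": correct / total}


# Example connection to the server container (address is illustrative):
# fl.client.start_numpy_client(server_address="localhost:8080",
#                              client=DocClient(model, train_loader, test_loader))
```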
3.3 Server Aggregation
The server coordinates the training process, aggregates the client parameters (i.e., weights) in each round, and retrieves the training metrics (i.e., accuracy and number of samples). We compute the total accuracy by taking the mean of each device's accuracy on its local test set. To implement this task, we use the strategy abstraction in Flower, where each strategy provides instructions to train and evaluate models on the clients and to perform the aggregation on the server.
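For illustration, the snippet below shows one way this metric could be wired into Flower's FedAvg strategy: a small aggregation function that takes the plain mean of each client's reported test accuracy. The evaluate_metrics_aggregation_fn hook is part of Flower's 1.x FedAvg; the function itself is our own sketch.

```python
import flwr as fl


def mean_accuracy(metrics):
    """Plain mean of per-client test accuracy.
    `metrics` is a list of (num_examples, {"accuracy": ...}) tuples sent by the clients."""
    accuracies = [m["accuracy"] for _, m in metrics]
    return {"accuracy": sum(accuracies) / len(accuracies)}


strategy = fl.server.strategy.FedAvg(
    evaluate_metrics_aggregation_fn=mean_accuracy,
)
```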
As mentioned above, the connected clients send the updated weights to the server for the aggregation phase. Hence, no data leaves the local device; only the model parameters go through the network.
In each round (i.e., training iteration), the server aggregates the local client models' weights through an aggregation method, in our case FedAvg (McMahan et al., 2017). FedAvg computes a weighted average of the local models and generates a new aggregated global model. This new global model is then sent back to the clients for evaluation, and new clients are selected for the next training cycle. The process repeats until the model converges or a stopping criterion is reached.
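In NumPy terms, the weighted average computed by FedAvg can be sketched as follows; here client_weights is a list of per-client parameter lists (as returned by the clients' fit step) and client_sizes holds each client's local sample count. This is a generic sketch of the FedAvg rule, not the authors' implementation.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Sketch of FedAvg: average each layer, weighted by local dataset size."""
    total = float(sum(client_sizes))
    num_layers = len(client_weights[0])
    return [
        np.sum([w[i] * (n / total) for w, n in zip(client_weights, client_sizes)], axis=0)
        for i in range(num_layers)
    ]
```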
4 FedBID
In this section, we describe our new dataset, FedBID, based on the BID Dataset (de Sá Soares et al., 2020).
4.1 Summary of BID Dataset
The BID Dataset (Brazilian Identity Document Dataset) is composed of images of the Brazilian driver's license (a.k.a. CNH), the natural person registry document (a.k.a. CPF), and the Brazilian identification document (a.k.a. RG), and aims to help researchers with numerous computer vision challenges in automated document processing, such as classification and segmentation. We used this dataset since it was generated through a process that anonymizes the original publicly available data, creating a dataset that complies with Brazilian data privacy law (LGPD, 2018). Based on the original BID dataset, we manually created a new set of labels and a distribution of documents across various clients, as we detail shortly.
The setup of the original BID dataset allows a classification problem over the previously stated document classes (CPF, CNH, and RG) and their respective front and back counterparts. In this manner, we have the following six classes: CPF Front, CPF Back, CNH Front, CNH Back, RG Front, and RG Back.
4.2 Distribution and Methodology
BID was initially conceived for segmentation systems. Each BID image has a corresponding label file with its segmentation information, and images of different documents are organized into different folders. This setting, although sufficient for traditional classification systems, lacks some information that may be relevant, such as orientation, which can be important for applications such as OCR systems.
Using BID as a baseline, we built a further system of classes. Since determining the orientation of a document is essential for many OCR systems to extract the text correctly, we added this type of class to the BID dataset. To do this, we manually annotated each document image with its orientation (0°, 90°, 180°, 270°) using an interface from Goecks et al. (2021). Since we have four orientations and six document classes, we created a set of 24 categories by a Cartesian product, as sketched below.
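As a small illustration of this label construction, the snippet below builds the 24 category names as the Cartesian product of the six document classes and four orientations. The exact label strings are assumptions that follow the "(document class) (Front/Back) (angle)" pattern of Figure 1.

```python
from itertools import product

DOC_CLASSES = ["CPF_Front", "CPF_Back", "CNH_Front", "CNH_Back", "RG_Front", "RG_Back"]
ORIENTATIONS = [0, 90, 180, 270]

# 6 document classes x 4 orientations = 24 categories
CATEGORIES = [f"{doc}_{angle}" for doc, angle in product(DOC_CLASSES, ORIENTATIONS)]
assert len(CATEGORIES) == 24
```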
Furthermore, some of the documents in the BID dataset contain information about the issuing organization. To condense this information into a single label file, we built a script that automatically retrieves it from the segmentation labels: for each image, the script matches the contents of its label file against a list of all document issuers in Brazil. We can use the document issuer labels to simulate a cross-silo federated data distribution. In Figure 2, we detail the distribution of the document issuers found in the dataset; further details about the FedBID dataset are shown in Table 1 and Figure 1. Separating these issuers is interesting since, with this information, we can simulate a siloed federated learning setting. A sketch of the matching step is shown below.
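The matching step can be sketched as follows; the issuer names listed here are illustrative placeholders, while the real script matches each segmentation label file against the full list of Brazilian document issuers.

```python
from pathlib import Path

# Illustrative subset of issuer names; the real list covers all Brazilian document issuers.
ISSUERS = ["DETRAN", "SDS", "SSP"]


def find_issuer(label_path: Path, issuers=ISSUERS) -> str:
    """Return the first known issuer mentioned in a segmentation label file, else 'unknown'."""
    text = label_path.read_text(encoding="utf-8", errors="ignore").upper()
    for issuer in issuers:
        if issuer.upper() in text:
            return issuer
    return "unknown"
```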
Table 1: FedBID dataset description.
Split    | Samples | Samples per class (mean ± std)
All Data | 21600   | 3600 ± 0
Train    | 15120   | 2520 ± 12.5
Test     | 6480    | 1080 ± 12.5
Figure 1: The distribution of the 24 classes. The FedBID distribution is unbalanced and non-IID. The classes in this histogram are labeled as "(document class) (Front/Back) (angle)".
Figure 2: Details of the nine different issuers present in FedBID. Documents that do not provide this information are shown in the "unknown" column.
4.3 Available Distributions
To support different settings for federated algorithms, we organized our data into different distributions that evaluate many aspects of federated training, such as initializing from a pre-trained model. In this setting, each device holds samples from all 24 categories, which we name the balanced dataset. To evaluate algorithms, we built three scenarios with 100, 500, and 1000 clients for each distribution below:
• Full dataset: all samples from the entire dataset are partitioned almost equally between the devices, without repetition;
• 50% of training data for centralized training: we set aside 50% of the entire dataset to pre-train a centralized model.
We also wanted to observe the effect of having different categories on each device. To do so, we vary the range of the number of categories on each device, randomly removing some classes from each client and partitioning the training dataset into 100 clients. We name this distribution the unbalanced dataset; a partitioning sketch is given below.
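The sketch below shows one way the two partitioning schemes could be produced: a balanced split into almost equal shards, and an unbalanced split that additionally drops a random subset of classes from each client's shard. The function names and seeding scheme are our own assumptions, not the released partitioning code.

```python
import random


def partition_balanced(samples, num_clients, seed=0):
    """Split the sample list almost equally across clients, without repetition."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return [shuffled[i::num_clients] for i in range(num_clients)]


def partition_unbalanced(samples, num_clients, min_classes, max_classes, seed=0):
    """Balanced split first, then randomly remove some classes from each client's shard.
    `samples` is a list of (image_path, category) pairs."""
    rng = random.Random(seed)
    shards = partition_balanced(samples, num_clients, seed)
    clients = []
    for shard in shards:
        classes = sorted({label for _, label in shard})
        k = min(rng.randint(min_classes, max_classes), len(classes))
        kept = set(rng.sample(classes, k))
        clients.append([(path, label) for path, label in shard if label in kept])
    return clients
```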
5 EXPERIMENTS
In this section, we propose a series of experiments
that aim to demonstrate how our dataset can be used to
benchmark a federated document classification system,
in our case, the FedDocs system.
5.1 Training Setup
We use our system, FedDocs, with the FedAvg aggregator and the same local training configuration for all experiments. The chosen model architecture is EfficientNet-B0 (Tan and Le, 2019). We resize the document image inputs to a resolution of 224x224 RGB pixels. The number of local epochs is set to E = 1, and the batch size is B = 16. We use the Adam (Kingma and Ba, 2014) optimizer with learning rate lr = 0.0005.
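This local configuration can be sketched in PyTorch as follows. We assume the torchvision implementation of EfficientNet-B0 (torchvision >= 0.13 API) and a 24-way classification head, which may differ from the authors' exact setup.

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models, transforms

NUM_CLASSES = 24  # 6 document classes x 4 orientations

# EfficientNet-B0 trained from scratch (torchvision implementation assumed).
model = models.efficientnet_b0(weights=None)
model.classifier[1] = nn.Linear(model.classifier[1].in_features, NUM_CLASSES)

# Adam with lr = 0.0005, as in the training setup above.
optimizer = optim.Adam(model.parameters(), lr=0.0005)
criterion = nn.CrossEntropyLoss()

# Resize document images to 224x224 RGB before feeding the network.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
```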
5.2 Results
In this section, we present the results of our federated learning experiments. All reported results are computed on our test set.
5.2.1 Federated Training with a Balanced
Dataset
In Figure 3, we present the training results of our solution on the BID data, with no pre-trained model, for different numbers of clients, using the balanced dataset distributed across the devices. We evaluate the model after training it for one local epoch.
Figure 3: Federated results. Each curve represents a training
with a different number of total clients: (blue): 100 clients;
(red): 500 clients; (yellow): 1000 clients.
As presented in Figure 4, using a model pre-trained on 50% of the data, the results are similar even if we increase the number of clients. However, the federated training does not seem to yield extra accuracy points; instead, it appears to slightly decrease the accuracy of the global model.
Figure 4: Federated training results starting from a
centralized model pre-trained with 50% of training data.
Each curve represents a training with a different number of
total clients: (blue): 100 clients; (red): 500 clients; (yellow):
1000 clients.
In this case, we can see challenges in our dataset that researchers could address with new federated techniques, for example, catastrophic forgetting (Kirkpatrick et al., 2017) of a pre-trained model, as seen in the work of Shoham et al. (2019). Thus, FedBID can be considered a dataset contribution for future Federated Learning research.
5.2.2 Federated Training with an Unbalanced
Dataset
Figure 5: Unbalanced federated results without a pre-trained model. (blue): the classes vary between 18 and 24; (red): the classes vary between 12 and 24.
In Figures 5 and 6, we present the results of training our model on the unbalanced dataset, with a varying range of possible categories on each device. We observe behavior similar to federated training in the balanced setting. In the tests without a pre-trained centralized model, convergence using all categories is noticeably slower: the models only exceed 50% accuracy near round 200, whereas in the unbalanced setting we reach it around round 100. Faster convergence is crucial since it saves clients' data transfer and battery consumption. The difference between the two ranges of categories per device is small, even with the reduction of categories between them.
Figure 6: Unbalanced federated training results starting from a centralized model pre-trained with 50% of the training data. (blue): the classes vary between 18 and 24; (red): the classes vary between 12 and 24.
However, increasing this imbalance in categories between clients can make learning more challenging. These scenarios can be helpful for testing new methods that deal with convergence in unbalanced configurations, for example, the FedProx technique (Li et al., 2020), since the local domain diverges from the global domain.
In the tests with the pre-trained model (Figure 6), we observe behavior similar to the experiments with all categories: the results show a tendency of the neural network models to "forget" the centralized knowledge and, as a consequence, not to learn between rounds. Therefore, the challenges of the previous scenario can also be explored in the unbalanced setting.
5.2.3 Federated Training with Siloed Data
In this test, our architecture uses the nine natural partitions of FedBID according to the document issuer, presented in Figure 2, where each partition contains only one type of document, with the document style varying according to its issuer, except for the unknown partition, which contains documents of different types from many unknown issuers. This setting is close to a real scenario of federated training where the data is non-IID. Our results show the impact of this distribution on model convergence.
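Building these silos from the issuer labels of Section 4.2 can be sketched as below; each issuer becomes one client partition, and documents without issuer information fall into the "unknown" silo. The function name and interface are illustrative assumptions.

```python
from collections import defaultdict


def build_silos(samples, issuer_of):
    """Group samples into one partition per issuer; `issuer_of` maps a sample to its
    issuer label (or 'unknown' when the document carries no issuer information)."""
    silos = defaultdict(list)
    for sample in samples:
        silos[issuer_of(sample)].append(sample)
    return dict(silos)
```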
As shown in Figure 7, the model convergence does not progress gradually as in the IID scenarios. On the contrary, the curve is much noisier, with jumps between 6% and 51% accuracy and no clear indication of when the model will converge. This is again caused by the difference between the local domains and the global domain, which for this dataset is the largest due to the level of data heterogeneity.
On the other hand, in Figure 8, we test the same nine partitions as in the last experiment, but with all data distributed in a balanced, IID form.
Figure 7: Non-IID federated training results using siloed partitions, where each partition contains data from one exclusive issuer.
Figure 8: Federated training results using nine siloed
partitions containing balanced IID data from all issuers.
We can now see the expected smooth convergence, with gradually increasing accuracy. This result highlights the challenge of dealing with a non-IID distribution: since the IID scenario is not possible in a real case due to privacy, it opens a new set of federated challenges for document categorization.
6 CONCLUSION AND FUTURE
WORKS
Our experiments show that our new FedBID dataset can serve as a benchmark for federated learning on non-IID tasks. To perform the tests, we created an application named FedDocs for federated document classification. FedBID is, to our knowledge, the first federated dataset for document classification. For the federated learning community, we also provide a dataset with an attractive property: a natural federated distribution based on issuers. As a result, researchers could explore cross-silo labeling in future works. In future work, we intend to construct federated learning algorithms for non-IID distributions, exploring our dataset.
ACKNOWLEDGMENT
This paper was achieved in cooperation with HP Inc. R&D Brazil, using incentives of the Brazilian Informatics Law (Law no. 8.248 of 1991). The authors would like to thank Thomas da Silva Paula for conceiving the project and Eduardo Chagas for the management support during this work.
REFERENCES
Antunes, R. S., André da Costa, C., Küderle, A., Yari, I. A., and Eskofier, B. (2022). Federated learning for healthcare: Systematic review and architecture proposal. ACM Transactions on Intelligent Systems and Technology (TIST).
Beutel, D. J., Topal, T., Mathur, A., Qiu, X., Parcollet, T., de Gusmão, P. P., and Lane, N. D. (2020). Flower: A friendly federated learning research framework. arXiv preprint arXiv:2007.14390.
Bulatovich, B. K., Vladimirovna, E. E., Vyacheslavovich,
T. D., Sergeevna, S. N., Sergeevna, C. Y., Zuheng, M.,
Jean-Christophe, B., and Muzzamil, L. M. (2022a).
Midv-2020: A comprehensive benchmark dataset
for identity document analysis. Computer Optics,
46(2):252–270.
Bulatovich, B. K., Vladimirovna, E. E., Vyacheslavovich, T. D., Sergeevna, S. N., Sergeevna, C. Y., Zuheng, M., Jean-Christophe, B., and Muzzamil, L. M. (2022b). Midv-2020: A comprehensive benchmark dataset for identity document analysis. Computer Optics, 46(2):252–270.
Burie, J.-C., Chazalon, J., Coustaty, M., Eskenazi, S., Luqman, M. M., Mehri, M., Nayef, N., Ogier, J.-M., Prum, S., and Rusiñol, M. (2015). Icdar2015 competition on smartphone document capture and ocr (smartdoc). In 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pages 1161–1165. IEEE.
Caldas, S., Duddu, S. M. K., Wu, P., Li, T., Konečný, J., McMahan, H. B., Smith, V., and Talwalkar, A. (2018). Leaf: A benchmark for federated settings. arXiv preprint arXiv:1812.01097.
Chernyshova, Y., Emelianova, E., Sheshkus, A., and
Arlazarov, V. V. (2021). Midv-lait: a challenging
dataset for recognition of ids with perso-arabic, thai,
and indian scripts. In International Conference on
Document Analysis and Recognition, pages 258–272.
Springer.
de Sá Soares, A., das Neves Junior, R. B., and Bezerra, B. L. D. (2020). Bid dataset: a challenge dataset for document processing tasks. In Anais Estendidos do XXXIII Conference on Graphics, Patterns and Images, pages 143–146. SBC.
Goecks, V. G., Waytowich, N., Watkins, D., and Prakash, B.
(2021). Combining learning from human feedback and
knowledge engineering to solve hierarchical tasks in
minecraft. arXiv preprint arXiv:2112.03482.
He, C., Li, S., So, J., Zeng, X., Zhang, M., Wang, H., Wang,
X., Vepakomma, P., Singh, A., Qiu, H., et al. (2020).
Fedml: A research library and benchmark for federated
machine learning. arXiv preprint arXiv:2007.13518.
Huang, Y., Ma, X., Fan, X., Liu, J., and Gong, W. (2017).
When deep learning meets edge computing. In
2017 IEEE 25th international conference on network
protocols (ICNP), pages 1–2. IEEE.
Kairouz, P., McMahan, H. B., Avent, B., Bellet, A.,
Bennis, M., Bhagoji, A. N., Bonawitz, K., Charles, Z.,
Cormode, G., Cummings, R., et al. (2021). Advances
and open problems in federated learning. Foundations
and Trends® in Machine Learning, 14(1–2):1–210.
Kang, L., Kumar, J., Ye, P., Li, Y., and Doermann, D. (2014).
Convolutional neural networks for document image
classification. In 2014 22nd International Conference
on Pattern Recognition, pages 3168–3172. IEEE.
Kingma, D. P. and Ba, J. (2014). Adam: A
method for stochastic optimization. arXiv preprint
arXiv:1412.6980.
Kirkpatrick, J., Pascanu, R., Rabinowitz, N., Veness, J.,
Desjardins, G., Rusu, A. A., Milan, K., Quan, J.,
Ramalho, T., Grabska-Barwinska, A., et al. (2017).
Overcoming catastrophic forgetting in neural networks.
Proceedings of the national academy of sciences,
114(13):3521–3526.
Koh, P. W., Sagawa, S., Marklund, H., Xie, S. M., Zhang,
M., Balsubramani, A., Hu, W., Yasunaga, M., Phillips,
R. L., Gao, I., et al. (2021). Wilds: A benchmark
of in-the-wild distribution shifts. In International
Conference on Machine Learning.
Kontar, R., Shi, N., Yue, X., Chung, S., Byon, E.,
Chowdhury, M., Jin, J., Kontar, W., Masoud, N.,
Nouiehed, M., et al. (2021). The internet of federated
things (ioft). IEEE Access.
LGPD (2018). General law for personal data protection (lgpd). http://www.planalto.gov.br/ccivil_03/ato2015-2018/2018/lei/L13709.htm. Accessed: 2022-09-19.
Li, Q., Wen, Z., Wu, Z., Hu, S., Wang, N., Li, Y., Liu, X.,
and He, B. (2021). A survey on federated learning
systems: vision, hype and reality for data privacy and
protection. IEEE Transactions on Knowledge and Data
Engineering.
Li, T., Sahu, A. K., Talwalkar, A., and Smith, V. (2020).
Federated learning: Challenges, methods, and future
directions. IEEE Signal Processing Magazine.
Li, W., Milletarì, F., Xu, D., Rieke, N., Hancox, J., Zhu, W., Baust, M., Cheng, Y., Ourselin, S., Cardoso, M. J., et al. (2019). Privacy-preserving federated brain tumour segmentation. In International workshop on machine learning in medical imaging.
Liu, L., Wang, Z., Qiu, T., Chen, Q., Lu, Y., and Suen, C. Y.
(2021). Document image classification: Progress over
two decades. Neurocomputing, 453:223–240.
Liu, Y., Huang, A., Luo, Y., Huang, H., Liu, Y., Chen,
Y., Feng, L., Chen, T., Yu, H., and Yang, Q. (2020).
Fedvision: An online visual object detection platform
powered by federated learning. In Proceedings of the
AAAI Conference on Artificial Intelligence.
Love, P. et al. (2013). Document analysis. In Research in
the college context, pages 99–112. Routledge.
Luo, J., Wu, X., Luo, Y., Huang, A., Huang, Y., Liu, Y.,
and Yang, Q. (2019). Real-world image datasets for
federated learning. arXiv preprint arXiv:1910.11089.
Marculescu, M. (2015). Introducing grpc, a
new open source http/2 rpc framework.
https://developers.googleblog.com/2015/02/
introducing-grpc-new-open-source-http2.html.
Mayr, A., Kißkalt, D., Meiners, M., Lutz, B., Schäfer, F., Seidel, R., Selmaier, A., Fuchs, J., Metzner, M., Blank, A., et al. (2019). Machine learning in production–potentials, challenges and exemplary applications. Procedia CIRP, 86:49–54.
McMahan, B., Moore, E., Ramage, D., Hampson, S., and
y Arcas, B. A. (2017). Communication-efficient
learning of deep networks from decentralized data. In
Artificial intelligence and statistics, pages 1273–1282.
PMLR.
Paulik, M., Seigel, M., Mason, H., Telaar, D., Kluivers, J.,
van Dalen, R., Lau, C. W., Carlson, L., Granqvist, F.,
Vandevelde, C., et al. (2021). Federated evaluation and
tuning for on-device personalization: System design &
applications. arXiv preprint arXiv:2102.08503.
Polevoy, D. V., Sigareva, I. V., Ershova, D. M., Arlazarov,
V. V., Nikolaev, D. P., Ming, Z., Luqman, M. M.,
and Burie, J.-C. (2022). Document liveness challenge
dataset (dlc-2021). Journal of Imaging, 8(7):181.
Ramaswamy, S., Mathews, R., Rao, K., and Beaufays, F.
(2019). Federated learning for emoji prediction in a
mobile keyboard. arXiv preprint arXiv:1906.04329.
Shoham, N., Avidor, T., Keren, A., Israel, N., Benditkis,
D., Mor-Yosef, L., and Zeitak, I. (2019). Overcoming
forgetting in federated learning on non-iid data. arXiv
preprint arXiv:1910.07796.
Sünderhauf, N., Brock, O., Scheirer, W., Hadsell, R., Fox, D., Leitner, J., Upcroft, B., Abbeel, P., Burgard, W., Milford, M., et al. (2018). The limits and potentials of deep learning for robotics. The International journal of robotics research, 37(4-5):405–420.
Tan, M. and Le, Q. (2019). Efficientnet: Rethinking
model scaling for convolutional neural networks. In
International conference on machine learning, pages
6105–6114. PMLR.
Xu, M., Liu, J., Liu, Y., Lin, F. X., Liu, Y., and Liu, X. (2019).
A first look at deep learning apps on smartphones. In
The World Wide Web Conference, pages 2125–2136.
Yang, T., Andrew, G., Eichner, H., Sun, H., Li, W., Kong,
N., Ramage, D., and Beaufays, F. (2018). Applied
federated learning: Improving google keyboard query
suggestions. arXiv preprint arXiv:1812.02903.
Zhu, Z., Hong, J., and Zhou, J. (2021). Data-free knowledge
distillation for heterogeneous federated learning. In
International Conference on Machine Learning, pages
12878–12889. PMLR.