Multi-Class Categorization of Three-Dimensional (3-D) Objects for Digital Holographic Information Using Deep Learning

Uma Mahesh R N and Yogesh N

Dept of CSE (AI&ML), ATME College of Engineering, Mysore, Karnataka, India

Dept of CSD (Computer Science and Design), ATME College of Engineering, Mysore, Karnataka, India

Keywords: Deep Learning, Receiver Operating Characteristic (ROC), 3-D Objects Categorization, Phase-Shifting Digital Holography (PSDH).

Abstract: In this paper, n-class (n=3) categorization of three-dimensional (3-D) objects using digital holographic data

has been achieved with a deep learning network. For n=3 categories, the 3-D object “triangle-square” is

assigned to category 1, the 3-D object “circle-square” to category 2, and the 3-D objects “triangle-circle”

and “square-triangle” are grouped into category 3. The dataset, comprising phase-only images derived from

digital holographic data, was generated using the phase-shifting digital holography (PSDH) technique. It

includes 2880 images created through the application of a rotation invariance method. The deep learning

network was trained on the dataset to generate the output. The results, including the n-class (n=3) error

matrix, receiver operating characteristic (ROC), and positive predictive value (PPV)–true positive rate

(TPR) characteristic, are presented to validate the work.

1 INTRODUCTION

Digital holography is a three-dimensional (3D)

imaging technique that captures digital holograms of

3D objects using charge-coupled device (CCD) or

complementary metal-oxide-semiconductor (CMOS)

sensors. The recorded digital hologram can be

numerically processed using the phase-shifting

digital holography (PSDH) technique to obtain a

complex-valued image containing both intensity and

phase information. The phase-only digital holographic information derived from PSDH can subsequently be used for deep learning-based applications, including categorization and prediction tasks. Deep learning, a branch of artificial intelligence, encompasses various deep neural networks, such as the multi-layer perceptron (MLP), convolutional neural network (CNN), long short-term memory (LSTM) model, AlexNet, and generative adversarial networks (GANs). These networks have been applied to numerous deep learning-based digital holographic applications, including single-pixel imaging (Mizutani, Kataoka, et al., 2024), quantitative phase imaging (Butola, Hellberg, et al., 2024), fast particle characterization (Schneider, Dambre, et al., 2015), hologram generation (Kang, Park, et al., 2021), and categorization and prediction of 3D objects (Basavaraju, 2024), (Reddy, Mahesh, et al., 2022), (Mahesh, Reddy, et al., 2022), (U. M. R N and K. B, 2024), (Mahesh, R.N.U., et al., 2022), (Mahesh, R.N.U., et al., 2023). A CNN is a deep

neural network comprising multiple stages of convolutional and pooling layers for feature extraction, followed by dense and output layers in the classification stage. The feature extraction layers process the input data, and their output is passed to the classification stage to perform the n-class (n=3) categorization task. In the classification stage, the dense layer receives input from the final pooling layer and generates an intermediate output, which is then passed to the output layer to produce the final result. Categorization, a supervised machine learning technique, learns a decision boundary that separates the input features according to their target labels, and its output is a discrete label. Lam et al. (Lam, H.H., et al., 2019) performed hologram categorization of deformable objects using a deep CNN, while Kim et al. (Kim, Wang, et al., 2018) conducted hologram categorization of microbeads employing a deep learning technique. Additionally, Pitkäaho et al. (Pitkäaho, Manninen, et al., 2018) categorized

Mahesh R N, U. and N, Y.

Multi-Class Categorization of Three-Dimensional (3-D) Objects for Digital Holographic Information Using Deep Learning.

DOI: 10.5220/0013592800004664

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 3rd International Conference on Futuristic Technology (INCOFT 2025) - Volume 2, pages 384-388

ISBN: 978-989-758-763-4

phase-only cancer cell images using a deep learning

technique. In this paper, n-class (n=3) categorization

of digital holographic data for 3-D objects is

performed using a deep learning network. For n=3

categories, the 3-D object “triangle-square” is

assigned to category 1, “circle-square” to category 2,

and “triangle-circle” and “square-triangle” are

grouped into category 3. The primary distinction of

this work from previous studies lies in its focus on n-

class (n=3) categorization of 3-D objects using a

deep learning network. The dataset, comprising

phase-only images derived from digital holographic

data, was generated using the PSDH technique. This

dataset includes 2880 images produced through a

rotation invariance method. The deep learning

network was trained on this dataset to generate the

output. Results, including the n-class (n=3) error

matrix, receiver operating characteristic (ROC), and

positive predictive value (PPV)–true positive rate

(TPR) characteristic, are presented to validate the

effectiveness of the proposed approach.

2 DESIGN AND PRINCIPLE OF OPERATION

2.1 METHODOLOGY

Figure 1. Architecture of Deep CNN for three-class

categorization

Figure 1 shows the architecture of the deep CNN used for the n-class (n=3) categorization task. The CNN takes a digital holographic image of size 160 × 160 as input. The feature extraction stage has four consecutive pairs of convolutional and pooling layers. The convolutional layer expression is given by

$$Z^{(p)}_{x,y} = f\left(\sum_{i}\sum_{j} h^{(p)}_{i,j}\, X_{x+i,\,y+j} + B_{p}\right), \quad p = 1, \dots, n \qquad (1)$$

In the above eqn. (1), $Z^{(p)}_{x,y}$ represents the output and $X_{x+i,\,y+j}$ represents the input. $h^{(p)}_{i,j}$ represents the kernel coefficients, $n$ represents the number of kernels, $s$ represents the kernel size, $f$ represents the activation function, and $B_{p}$ represents the bias (Mahesh, Nelleri, et al., 2023). The value of $n$ is varied as $n = 8, 16, 32, 64$. The value of $s$ is $s = 3 \times 3$, so the sums over $i$ and $j$ run over the kernel window. The activation function $f$ is the Rectified Linear Unit (ReLU) activation function, which is used in both the convolutional and dense layers. Next, pooling is performed with MaxPooling2D layers. The expression for the pooling layer is given by

$$Z_{x,y} = \max_{(i,j) \in \Omega_{x,y}} Y_{i,j} \qquad (2)$$

In the above eqn. (2), $Z_{x,y}$ represents the output and $Y_{i,j}$ represents the input, with the maximum taken over the pooling window $\Omega_{x,y}$ (Mahesh, Nelleri, et al., 2023). The output of the final pooling layer is flattened and passed to the dense layer. The expression for the dense layer is then given by

$$Z_{k} = f\left(\sum_{i} W_{k,i}\, X_{i} + B_{k}\right), \quad k = 1, \dots, q \qquad (3)$$

In the above eqn. (3), $Z_{k}$ represents the output, $B_{k}$ represents the bias, $f$ represents the ReLU activation function, $W_{k,i}$ represents the weight values, $X_{i}$ represents the one-dimensional (1-D) data obtained through the flatten layer, and $q$ represents the number of neurons. The output of the final pooling layer is $8 \times 8 \times 64$, so the flattened input has 4096 elements. The value of $q$ is $q = 16$. The output of the dense layer is fed into the output layer. For n-class (n=3) categorization, the output layer comprises three neurons with a softmax activation function to produce the output. The equation for the softmax activation function is given by

$$Z_{k} = \frac{\exp(Y_{k})}{\sum_{j=1}^{N} \exp(Y_{j})} \qquad (4)$$

In the above eqn. (4), $Z_{k}$ represents the output, $Y_{k}$ represents the input, and $N$ represents the number of neurons.
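The layer sizes quoted in this section can be checked with a short calculation. The sketch below is only an illustration under stated assumptions: unpadded ("valid") 3 × 3 convolutions with stride 1, 2 × 2 non-overlapping max pooling, and a single-channel input are not given in the text, but they reproduce the stated 8 × 8 × 64 output of the final pooling layer. The function names are hypothetical, and the softmax function follows eqn. (4).

```python
import math

KERNEL = 3  # 3 x 3 kernels, as stated in the text
POOL = 2    # assumed pooling window (not stated in the text)


def conv_valid(size, kernel=KERNEL):
    """Spatial size after an unpadded convolution with stride 1 (assumed)."""
    return size - kernel + 1


def max_pool(size, pool=POOL):
    """Spatial size after non-overlapping max pooling."""
    return size // pool


def trace_feature_extractor(size=160, filters=(8, 16, 32, 64), in_channels=1):
    """Return the shape after each conv+pool stage and the conv parameter count."""
    shapes, params, channels = [], 0, in_channels
    for n in filters:
        # KERNEL*KERNEL*channels weights plus one bias for each of the n kernels
        params += (KERNEL * KERNEL * channels + 1) * n
        size = max_pool(conv_valid(size))
        channels = n
        shapes.append((size, size, channels))
    return shapes, params


def softmax(y):
    """Eqn. (4); subtracting max(y) avoids overflow without changing the result."""
    m = max(y)
    e = [math.exp(v - m) for v in y]
    total = sum(e)
    return [v / total for v in e]


shapes, conv_params = trace_feature_extractor()
flat = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]  # input length of the dense layer
dense_params = (flat + 1) * 16   # q = 16 neurons, weights plus biases
output_params = (16 + 1) * 3     # three output neurons

print(shapes)
print(flat, conv_params, dense_params, output_params)
print(softmax([2.0, 1.0, 0.1]))
```

Under these assumptions the spatial size follows 160 → 79 → 38 → 18 → 8, which agrees with the 8 × 8 × 64 output stated above. A different padding or pooling choice would give a different final size.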

3 DATASET PREPARATION WITH SIMULATION RESULTS AND DISCUSSION

For n-class (n=3) categorization, the 3D object “triangle-square” is assigned to category 1, “circle-square” to category 2, and both “triangle-circle” and “square-triangle” are grouped into category 3. The 3D object “circle-square” is designed such that the circle feature is positioned in the front plane, while the square feature is located in the back plane. The planes are separated by various distances, $d_1$ and $d_2$, respectively. The remaining three 3D objects were constructed in a similar manner, with different features positioned in the front and back planes,

respectively. Four phase-shifted holograms of each of the four 3-D objects were formed at the camera plane with phase shifts of 0°, 90°, 180°, and 270°, and these holograms were post-processed using a four-step PSDH technique to obtain a complex-valued image containing intensity and phase information. The holograms and reconstructed intensity/phase images of all four 3-D objects are of size 1024 × 1024. The reconstructed intensity and phase images were generated at both distances. Figure 2 illustrates the schematic of the 3D object “triangle-circle”, which belongs to category 3. Additionally, Figure 2 presents the geometry for digital hologram recording using four-step phase-shifted plane reference waves.
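The four-step phase-shifting step can be illustrated numerically. The sketch below is a minimal example and not the authors' MATLAB implementation. It assumes a unit-amplitude plane reference wave, the sign convention in which the reference is multiplied by exp(iθ), and a synthetic random object wave. The names `record_hologram` and `four_step_psdh` are hypothetical. It recovers the complex field at the camera plane only; numerical propagation back to the object planes is a separate step that is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic object wave at the camera plane (illustrative, not the paper's data).
amplitude = rng.uniform(0.2, 1.0, size=(64, 64))
phase = rng.uniform(-np.pi, np.pi, size=(64, 64))
obj = amplitude * np.exp(1j * phase)


def record_hologram(obj_wave, shift, ref_amplitude=1.0):
    """Intensity of the object wave interfering with a phase-shifted plane reference."""
    ref = ref_amplitude * np.exp(1j * shift)
    return np.abs(obj_wave + ref) ** 2


# Four holograms at reference phase shifts of 0, 90, 180 and 270 degrees.
shifts = np.deg2rad([0, 90, 180, 270])
i0, i90, i180, i270 = (record_hologram(obj, s) for s in shifts)


def four_step_psdh(i0, i90, i180, i270, ref_amplitude=1.0):
    """Complex object wave at the camera plane from four phase-shifted holograms."""
    return ((i0 - i180) + 1j * (i90 - i270)) / (4.0 * ref_amplitude)


recovered = four_step_psdh(i0, i90, i180, i270)
recovered_phase = np.angle(recovered)        # phase-only information
recovered_intensity = np.abs(recovered) ** 2

print(np.allclose(recovered, obj))
```

The subtraction of opposite-phase holograms cancels the zero-order terms, and the two differences give the real and imaginary parts of the object wave. With the opposite sign convention for the reference phase, the result would be the complex conjugate.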

Figure 2. Schematic of the geometry for recording the digital hologram of a 3-D object volume with different features in the front and back planes and separating distances z = 5 cm and d = 1 cm. (a) triangle-circle. BS: beam splitter; CCD: charge-coupled device.

Digital holograms and the reconstructed intensity and phase images of the four different 3D objects were rotated incrementally in steps of 0.5°, resulting in a dataset of 2,880 images for each image type, which is consistent with 720 rotation angles for each of the four objects. The dataset was prepared in MATLAB. For the n-class (n=3) categorization of 3D objects, only phase information was utilized. The dataset of 2,880 phase images was divided into training, validation, and test sets comprising 2,160 images (75%), 432 images (15%), and 288 images (10%), respectively. For training the deep learning network, phase images of size 160 × 160 were obtained from the 1024 × 1024 images. The deep learning network was implemented in a TensorFlow environment using Python. A sample reconstructed phase image of a 3D object, specifically the “triangle-circle” belonging to category 3, is shown in Figure 3.
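The dataset sizes quoted above can be reproduced with simple arithmetic. The sketch below assumes that the 0.5° rotation covers a full 360° turn and that each object contributes the same number of images. Neither point is stated explicitly in the text, but both are consistent with the total of 2,880 images. The variable names are illustrative.

```python
ROTATION_STEP_DEG = 0.5

# Object name -> category label, as assigned in the text.
OBJECTS = {
    "triangle-square": 1,
    "circle-square": 2,
    "triangle-circle": 3,
    "square-triangle": 3,
}

# Assumption: rotations span a full 360 degree turn.
angles_per_object = int(360 / ROTATION_STEP_DEG)
total_images = angles_per_object * len(OBJECTS)

# Images per category, assuming every object contributes equally.
per_category = {}
for name, category in OBJECTS.items():
    per_category[category] = per_category.get(category, 0) + angles_per_object

# Split sizes from the stated percentages, using integer arithmetic.
split_percent = {"train": 75, "validation": 15, "test": 10}
split_sizes = {k: total_images * p // 100 for k, p in split_percent.items()}

print(angles_per_object, total_images)
print(per_category)
print(split_sizes)
```

Under these assumptions category 3 holds twice as many images as category 1 or category 2, because it groups two objects. This imbalance is worth keeping in mind when comparing per-category results.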

The deep learning network was tested on a batch

of 24 images from the test set. The n-class (n=3)

error matrix generated by the deep learning network

is presented in Figure 4.
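The PPV and TPR of each category follow directly from a three-class error matrix by treating that category as positive and the other two as negative. The sketch below shows the calculation. The counts are invented for illustration, chosen only to sum to a batch of 24 images, and are not the values reported in Figure 4. The function name `per_class_metrics` is hypothetical.

```python
def per_class_metrics(matrix):
    """One-vs-rest PPV (precision) and TPR (recall) for each class.

    matrix[i][j] counts images of true class i predicted as class j.
    """
    n = len(matrix)
    metrics = []
    for k in range(n):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                        # class k predicted as another class
        fp = sum(matrix[i][k] for i in range(n)) - tp   # other classes predicted as k
        ppv = tp / (tp + fp) if tp + fp else 0.0
        tpr = tp / (tp + fn) if tp + fn else 0.0
        metrics.append({"category": k + 1, "PPV": ppv, "TPR": tpr})
    return metrics


# Illustrative counts only (24 images); these are NOT the values in Figure 4.
example_matrix = [
    [3, 1, 2],
    [1, 3, 2],
    [1, 1, 10],
]
metrics = per_class_metrics(example_matrix)
for row in metrics:
    print(row)
```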

Figure 3. Reconstructed phase-only image of the 3-D object (a) triangle-circle.

Figure 4. Three-class confusion matrix from phase-only

image dataset.

The error matrix in Fig. 4 summarizes the categorization results for the n=3 categories. Additionally, the ROC and the PPV-TPR

characteristic derived from the deep learning

network are displayed in Figure 5.

Figure 5. (a) Receiver operating characteristic (ROC). (b) Positive predictive value (PPV)-true positive rate (TPR) characteristic.

From Figure 5 (a), the deep learning network has a higher area under the curve (AUC) for category 1 than for the other categories. Similarly, from Figure 5 (b), the deep learning network has a lower PPV as the TPR increases for categories 1 and 2, whereas for category 3 it has a higher PPV than for the other two categories.
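A per-category ROC curve and its AUC, as compared above, can be obtained by treating one category as positive and sweeping a threshold over the softmax output for that category. The sketch below is a plain-Python illustration with invented labels and scores, not the paper's data. It assumes the scores are distinct, and `roc_curve_points` and `auc` are hypothetical helper names.

```python
def roc_curve_points(labels, scores):
    """ROC points (FPR, TPR) obtained by sweeping the threshold over the scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Illustrative one-vs-rest data for a single category (not from the paper):
# label 1 = image belongs to the category, score = softmax output for it.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

points = roc_curve_points(labels, scores)
print(points)
print(auc(points))
```

An AUC of 1 corresponds to a category whose images are always scored above all others, while 0.5 corresponds to chance-level ranking. Repeating the calculation for each of the three categories gives the per-category comparison.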

4 CONCLUSIONS

This paper presents the n-class (n=3) categorization

of three-dimensional (3D) objects using phase-only

digital holographic data with a deep learning

network. For the three categories, the 3D object

“triangle-square” is assigned to category 1, “circle-

square” to category 2, and “triangle-circle” and

“square-triangle” are grouped into category 3. The

dataset, consisting of phase-only images obtained

from digital holographic data, was generated using

the PSDH technique. It comprises 2,880 images

created through a rotation invariance method. The

deep learning network was trained on this dataset to

produce categorization results. The results, including the n-class (n=3) error matrix, ROC, and PPV–TPR characteristic, validate the approach. The error matrix shows that, for categories 1 and 2, more images are categorized as FALSE than as TRUE, whereas for category 3 more images are categorized as TRUE than as FALSE. Additionally, the ROC analysis indicates that the AUC is highest for category 1 compared to the other two categories. These findings demonstrate that a deep learning network is a suitable method for n-class (n=3) categorization of 3D objects using phase-only digital holographic data.

REFERENCES

Mizutani, Y., Kataoka, S., Uenohara, T., Takaya, Y. and

Matoba, O., 2024, July. “Machine learning assisted

single pixel imaging for weak light detection”, In 3D

Image Acquisition and Display: Technology,

Perception and Applications (pp. DW3H-4). Optica

Publishing Group.

https://doi.org/10.1364/3D.2024.DW3H.4

Butola, A., Hellberg, S., Nystad, M. and Agarwal, K.,

2024, July. “Quantitative phase imaging and machine

learning for spermatozoa analysis”, In Imaging

Systems and Applications (pp. JM4A-6). Optica

Publishing Group.

https://doi.org/10.1364/3D.2024.JM4A.6

Schneider, B., Dambre, J. and Bienstman, P., 2015. “Fast

particle characterization using digital holography and

neural networks”, Applied optics, 55(1), pp.133-139.

https://doi.org/10.1364/AO.55.000133

Kang, J.W., Park, B.S., Kim, J.K., Kim, D.W. and Seo,

Y.H., 2021. “Deep-learning-based hologram

generation using a generative model”, Applied

Optics, 60(24), pp.7391-7399.

https://doi.org/10.1364/AO.427262

Mahesh R N, U.; Nelleri, A. “Multi-Class Classification

and Multi-Output Regression of Three-Dimensional

Objects Using Artificial Intelligence Applied to Digital

Holographic Information”, Sensors 2023, 23, 1095.

https://doi.org/10.3390/s23031095

RN UM, Basavaraju L. Deep Learning-based Multi-class

Three-dimensional (3-D) Object Classification using

Phase-only Digital Holographic Information. IgMin

Res. Jul 09, 2024; 2(7): 550-557. IgMin ID: igmin216;

DOI:10.61927/igmin216; Available at:

igmin.link/p216

Reddy, B.L., Uma Mahesh, R.N. and Nelleri, A., 2022.

“Deep convolutional neural network for three-

dimensional objects classification using off-axis digital

Fresnel holography”, Journal of Modern

Optics, 69(13), pp.705-717.

https://doi.org/10.1080/09500340.2022.2081371

Uma Mahesh, R.N., Lokesh Reddy, B., Nelleri, A. (2022).

Deep Learning-Based Multi-class 3D Objects

Classification Using Digital Holographic Complex

Images. In: Sivasubramanian, A., Shastry, P.N., Hong,

P.C. (eds) Futuristic Communication and Network

Technologies. VICFCNT 2020. Lecture Notes in

Electrical Engineering, vol 792. Springer, Singapore.

https://doi.org/10.1007/978-981-16-4625-6_43

U. M. R N and K. B, “Three-dimensional (3-D) objects

classification by means of phase-only digital

holographic information using Alex Network”, 2024

International Conference on Signal Processing,

Computation, Electronics, Power and

Telecommunication (IConSCEPT), Karaikal, India,

2024, pp. 1-5, doi:

10.1109/IConSCEPT61884.2024.10627906.

Mahesh, R.N.U., Nelleri, A. “Deep convolutional neural

network for binary regression of three-dimensional

objects using information retrieved from digital

Fresnel holograms”, Appl. Phys. B 128, 157 (2022).

https://doi.org/10.1007/s00340-022-07877-w

Mahesh, R.N.U., Nelleri, A. (2023). “Machine Learning-

Based Binary Regression Task of 3D Objects in

Digital Holography”, In: Subhashini, N., Ezra,

M.A.G., Liaw, SK. (eds) Futuristic Communication

and Network Technologies. VICFCNT 2021. Lecture

Notes in Electrical Engineering, vol 995. Springer,

Singapore. https://doi.org/10.1007/978-981-19-9748-8_34

Lam, H.H., Tsang, P.W.M. and Poon, T.C., 2019.

“Ensemble convolutional neural network for

classifying holograms of deformable objects”, Optics

Express, 27(23), pp.34050-34055.

https://doi.org/10.1364/OE.27.034050

Kim, S.J., Wang, C., Zhao, B., Im, H., Min, J., Choi, H.J.,

Tadros, J., Choi, N.R., Castro, C.M., Weissleder, R.

and Lee, H., 2018. “Deep transfer learning-based

hologram classification for molecular diagnostics”,

Scientific reports, 8(1), p.17003. doi: 10.1038/s41598-018-35274-x.

Pitkäaho, T., Manninen, A. and Naughton, T.J., 2018,

June. “Classification of digital holograms with deep

learning and hand-crafted features”, In Digital

Holography and Three-Dimensional Imaging (pp.

DW2F-3). Optica Publishing Group.

https://doi.org/10.1364/DH.2018.DW2F.3

Trieu, Q. and Nehmetallah, G., 2024. “Deep learning

based coherence holography reconstruction of 3D

objects”, Applied Optics, 63(7), pp.B1-B15.

https://doi.org/10.1364/AO.503034

Tahara, T., 2024. “Incoherent digital holography with two

polarization-sensitive phase-only spatial light

modulators and reduced number of exposures”,

Applied Optics, 63(7), pp.B24-B31.

https://doi.org/10.1364/AO.505624

Störk, T., Seyler, T., Fratz, M., Bertz, A., Hensel, S. and

Carl, D., 2024. “Detecting vibrations in digital

holographic multiwavelength measurements using

deep learning”, Applied Optics, 63(7), pp.B32-B41.

https://doi.org/10.1364/AO.507303
