XYZ Unsupervised Network: A Robust Image Dehazing Approach
Percy Maldonado-Quispe
and Helio Pedrini
Institute of Computing, University of Campinas, Campinas, SP, 13083-852, Brazil
Keywords:
Image Dehazing, Unsupervised Network, Zero-Shot Learning, Disentanglement, Dark Channel Prior,
Atmospheric Scattering Model.
Abstract:
In this work, we examine an important yet under-explored topic in image dehazing neural networks: how to remove haze (a natural phenomenon) from a given image in an unsupervised manner. By considering a hazy image as the entanglement of several "simpler" layers, such as a haze-free image layer, a transmission map layer, and an atmospheric light layer, as described by the atmospheric scattering model, we propose a method based on the concept of layer disentanglement. Our XYZ approach, a combination of the XHOT, YOLY and ZID methods that preserves the advantages of each, improves the SSIM and PSNR metrics. The main benefits of the proposed XYZ are twofold. First, since it is an unsupervised approach, no clean photos, including hazy-clean pairs, are used as ground truth; in other words, it departs from the traditional paradigm of training a deep model on a large dataset. Second, it treats the haze problem as being composed of several layers.
1 INTRODUCTION
In bad weather conditions, such as fog and haze, image quality is greatly degraded by particles in the atmosphere. Suspended particles scatter light and attenuate the light reflected from the scene, and the scattered atmospheric light also mixes with the light received by the camera, altering the contrast and color of the image.
The quality of images captured by a camera is further reduced by absorption from particles floating in the environment. This degradation on foggy days negatively affects photographic work: image contrast drops, colors shift, and the textures and boundaries of objects in the scene become blurred. For vision tasks such as object detection and image segmentation, poor-quality inputs can degrade the performance of even well-designed models.
To estimate the global atmospheric light and the per-pixel transmission coefficient of the atmospheric scattering model, a variety of methods have been proposed, which can be roughly divided into prior-based methods and learning-based methods.
Prior-based methods rely mainly on assumptions derived from image statistics. For example, Tan (2008) proposed to remove haze by maximizing the local contrast of the image, based on the assumption that clean images tend to have higher contrast. Berman et al. (2016) proposed to remove haze based on the observation that the colors of a haze-free image can be well approximated by a few hundred distinct colors. Although these methods achieve remarkable performance, the quality of haze removal depends heavily on how well the assumed priors match the actual data distribution.
To avoid such prior assumptions, several authors have devoted much effort to designing methods based on deep neural networks, which detect and remove haze from an image by directly learning the atmospheric scattering parameters from training data. For example, Cai et al. (2016) proposed a convolutional neural network that requires a large set of pairs of clean and hazy images, making it a supervised learning method.
Although supervised learning methods (Qin et al., 2020; Liu et al., 2019; Dong et al., 2020) have achieved high performance in haze removal, they have several limitations. One is that they require large-scale pairs of clean and hazy images to train their models, a requirement usually satisfied by artificially synthesizing hazy images from clean ones through the physical model with hand-crafted parameters. As pointed out by Golts et al. (2020), synthesized databases are less informative than, and inconsistent with, real ones. The development of unsupervised models is therefore desirable.
However, in practice, due to variations in the scene and other factors such as illumination, it is difficult, if not impossible, to collect a large-scale dataset with the desired ground truth. As a result, most methods rely on first collecting clean images and then synthesizing the corresponding hazy images using an atmospheric scattering model with handcrafted parameters. When a model trained on such a synthetic dataset is applied to real-world hazy images, a domain shift problem arises, as the synthetic hazy images are likely to be less informative than, and inconsistent with, real-world hazy images. For this reason, we propose our method, called XYZ, which aims to overcome the problems mentioned above. To the best of our knowledge, this task has not been widely addressed.
Our main contributions are as follows. Our method does not need paired images; that is, it does not depend on a large dataset of hazy and haze-free images, which is itself difficult to obtain. In this work, we propose two approaches: the first is a simple and lightweight unsupervised neural network (XHOT) for haze removal, designed to reduce processing time; the second combines existing unsupervised neural networks for haze removal. XYZ (XHOT, YOLY and ZID) integrates the advantages of these state-of-the-art methods, each of which focuses on a different aspect: Dark Channel Prior approximation, the HSV color space, and the use of a deep neural network such as U-Net.
2 RELATED WORK
Many researchers have tried to recover clean, high-quality scenes from hazy images. Before the widespread use of deep learning in computer vision tasks, image dehazing algorithms were mainly based on prior assumptions (He et al., 2009) and the Atmospheric Scattering Model (ASM) (McCartney, 1978). The processing sequences of these rule-based statistical methods are easy to interpret. However, they can fail when faced with complex real-world scenarios. For example, the popular Dark Channel Prior (DCP) (He et al., 2009) does not handle sky regions well.
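For reference, the dark channel underlying the DCP is the per-pixel minimum over the color channels followed by a minimum filter over a local patch; in haze-free regions it tends toward zero. A minimal NumPy/SciPy sketch (the patch size of 15 is a common choice in the DCP literature, not a value taken from this paper) is:

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(image: np.ndarray, patch_size: int = 15) -> np.ndarray:
    """Dark channel of an (H, W, 3) RGB image in [0, 1] (He et al., 2009):
    per-pixel minimum over R, G, B, then a local minimum filter."""
    per_pixel_min = image.min(axis=2)
    return minimum_filter(per_pixel_min, size=patch_size)
```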
Inspired by deep learning, the works of Cai et al. (2016), Liu et al. (2019) and Zheng et al. (2021) combined the ASM with convolutional neural networks (CNNs) to estimate the physical parameters. Quantitative and qualitative experimental results show that deep learning can help predict the physical parameters in a supervised manner. Wang et al. (2022a), on the other hand, proposed the use of an attention-convolutional module.
Following this, Liu et al. (2019) and Zheng et al. (2021) demonstrated that end-to-end supervised dehazing networks can be implemented independently of the ASM. Thanks to the powerful feature extraction capability of CNNs, these non-ASM-based dehazing algorithms can achieve accuracy similar to that of ASM-based algorithms.
Both ASM-based and non-ASM-based supervised algorithms have shown impressive performance. However, they often require synthetic paired images that are inconsistent with real-world hazy images. Therefore, recent research focuses on methods better suited to the real-world dehazing task. Engin et al. (2018), Li et al. (2020), Golts et al. (2020) and Li et al. (2021) explored unsupervised algorithms that do not require synthetic data.
In addition, Chen et al. (2022) designed a two-stage method that removes haze using the DCP and subsequently refines the results using features shared between the transmission and depth maps. Wang et al. (2022b) proposed a method completely independent of real haze-free images. However, GAN-based networks suffer from high complexity and long runtimes.
Data-driven unsupervised dehazing methods have achieved impressive performance. In contrast to models that require sufficient data for network training, the approaches of Li et al. (2020) and Li et al. (2021) perform neural network dehazing from a single example. By combining the benefits of unsupervised learning and zero-shot learning, the authors further reduced the dependence of the parameter learning process on data.
Supervised dehazing methods have achieved excellent results. However, they require paired data, which is difficult to obtain in the real world. For outdoor scenes with moving objects, it is difficult to guarantee that two images, one taken on a clear day and one on a hazy day, have exactly the same content. This is why other approaches (Engin et al., 2018; Li et al., 2020; Golts et al., 2020; Li et al., 2021) have explored unsupervised haze removal algorithms.
3 UNSUPERVISED APPROACH
In this section, we present the development of our proposed approaches, which combine three unsupervised learning methods. First, there is XHOT (Section 3.1.1), which builds on the fundamentals presented in previous investigations, specifically the works of Li et al. (2021) and Golts et al. (2020). Then, we explore YOLY, based on the research of Li et al. (2021), and ZID, based on the research of Li et al. (2020). Both methods demonstrated outstanding performance on metrics such as SSIM and PSNR, establishing themselves as state-of-the-art benchmarks on the RESIDE dataset. In addition, it is important to highlight that none of the methods described below require a set of paired images for training, as they address haze removal individually for each image.
3.1 Methods
In the next subsections, we present the methods used
in this work, which include XHOT, YOLY and ZID.
3.1.1 XHOT
The XHOT network arises from the need for an effective and efficient haze removal solution that, in keeping with unsupervised learning, does not rely on paired data. Recognizing that haze removal can be broken down into simpler components, we base our strategy on the combination of multiple simple layers, as discussed in previous research (Li et al., 2020).
To address this challenge, we create three neural networks, two of which are convolutional neural networks designed to estimate the best values for J(x) and T(x), respectively, as shown in Figure 1. For the atmospheric light, a component independent of the image content, we chose to keep the methodology proposed in previous research, specifically by Li et al. (2020) and Li et al. (2021), who employed a Variational Autoencoder to estimate it.

Figure 1: Architecture of our proposed XHOT, in which we have three sub-networks to calculate the variables J(x), T(x) and A, respectively.
Architectures. Three sub-networks were constructed to estimate the values of J(x), T(x) and A:
J-Net: This sub-network consists of a non-degenerative convolutional neural network, meaning that it does not alter the dimensions of the input. It is composed of three convolutional blocks, each performing a convolution with a kernel of size 5, followed by batch normalization and a LeakyReLU activation function with a slope of 0.01. After the third block, we apply a convolution together with a Sigmoid activation function to normalize the output between 0 and 1. The output of this sub-network is a 3-channel image, the haze-free image J(x), which will later guide the training (a code sketch of this backbone follows the list).
T-Net: This sub-network uses the same architecture as J-Net, but differs in its output, which is a single-channel image representing the transmission map T(x) of the image.
A-Net: This sub-network consists of a Variational Autoencoder. Since the variable A is not related to the content of the image, similarly to the work of Li et al. (2020), A is assumed to be sampled from a latent Gaussian distribution, so the problem becomes one of variational inference (Kingma and Welling, 2014).
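As an illustration, the following is a minimal PyTorch sketch of the non-degenerative backbone shared by J-Net and T-Net as described above; the hidden width of 64 channels and the padding choice are our assumptions, not values taken from the text:

```python
import torch.nn as nn

def make_jt_net(out_channels: int, hidden: int = 64) -> nn.Sequential:
    """Non-degenerative backbone shared by J-Net (out_channels=3) and
    T-Net (out_channels=1): three blocks of (5x5 conv, batch norm,
    LeakyReLU(0.01)), then a final 5x5 conv and a Sigmoid.
    padding=2 keeps the spatial dimensions unchanged."""
    layers, in_ch = [], 3
    for _ in range(3):
        layers += [
            nn.Conv2d(in_ch, hidden, kernel_size=5, padding=2),
            nn.BatchNorm2d(hidden),
            nn.LeakyReLU(0.01),
        ]
        in_ch = hidden
    layers += [nn.Conv2d(hidden, out_channels, kernel_size=5, padding=2),
               nn.Sigmoid()]
    return nn.Sequential(*layers)

j_net = make_jt_net(out_channels=3)  # haze-free image J(x)
t_net = make_jt_net(out_channels=1)  # transmission map T(x)
```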
Loss Function. To train our unsupervised model,
we employ a loss function that combines the loss
functions of J-Net and A-Net, as expressed in Equa-
tion 1:
L_XHOT = L_J + L_A    (1)

where L_J is the loss between the input x and the result of the haze generation I(x); this value is calculated from the three predicted variables following Equation 2. We can then define L_J as:

I(x) = J(x)t(x) + A(1 − t(x))    (2)

L_J = MSELoss(I(x), x)    (3)
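As a minimal sketch of how Equations 2 and 3 interact (the tensor shapes are our assumptions), the rehazing step and the reconstruction loss can be written in PyTorch as:

```python
import torch
import torch.nn.functional as F

def rehaze(J: torch.Tensor, t: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
    """Equation 2: I(x) = J(x)t(x) + A(1 - t(x)).
    J: (B, 3, H, W) haze-free estimate, t: (B, 1, H, W) transmission map,
    A: (B, 3, 1, 1) atmospheric light; broadcasting recombines the layers."""
    return J * t + A * (1.0 - t)

def l_j(J, t, A, x):
    """Equation 3: MSE between the rehazed image and the hazy input x."""
    return F.mse_loss(rehaze(J, t, A), x)
```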
In addition, the loss function for the A-Net sub-network is expressed as:

L_A = L_H + L_KL    (4)

where L_H is the MSE loss between A and A_H, the value pre-calculated from the hazy image x using the Dark Channel Prior (He et al., 2009). Finally, L_KL follows Equation 5, where KL(·) denotes the Kullback-Leibler divergence between two distributions:

L_KL = KL(N(μ_z, σ_z²) || N(0, 1)) = (1/(2m)) Σ_i [(μ_{z_i})² + (σ_{z_i})² − 1 − log(σ_{z_i})²]    (5)
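Under the usual VAE parameterization, in which the encoder outputs μ and log σ², Equation 5 reduces to a few lines; a sketch:

```python
import torch

def l_kl(mu: torch.Tensor, log_var: torch.Tensor) -> torch.Tensor:
    """Equation 5: KL divergence between N(mu, sigma^2) and N(0, 1),
    averaged over the m latent dimensions (log_var stands for log sigma^2)."""
    return 0.5 * torch.mean(mu.pow(2) + log_var.exp() - 1.0 - log_var)
```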
3.1.2 YOLY
Given a hazy image x, the central goal is to recover the haze-free image J(x) without using any information beyond what is contained in the image itself. The essence of this method is to decompose x through three sub-networks, as illustrated in Figure 2. More precisely, YOLY simultaneously feeds x through three main networks: the first is designed to estimate J(x), the second to estimate the transmission map T(x), and the third to estimate the atmospheric light A. The outputs of these networks are then combined to reconstruct x in the upper layer, using the atmospheric scattering model (Equation 2). The model as a whole therefore learns in an unsupervised way. In summary, the goal is to minimize the following loss function:
L_rec = ||I(x) − x||    (6)
The dehazed image J(x) is obtained by combining the outputs generated by the three sub-networks, as expressed in Equation 2. The loss function L_rec was designed to regulate the performance of the system as a whole, encompassing both the individual sub-networks and the reconstruction of the hazy image I(x) from its computed components. More precisely, this loss function monitors and guides the disentangling process by incorporating the haze generation process.
Loss Function. In addition, YOLY proposes a new loss function based on the HSV color space, motivated by the observation of Zhu et al. (2015) that the difference between brightness and saturation is close to zero in haze-free regions. To exploit this prior information, the following loss is applied to the prediction of J(x):

L_J = ||V(J(x)) − S(J(x))||    (7)
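A minimal PyTorch sketch of such a loss follows; the choice of norm and the HSV conversion details are our assumptions:

```python
import torch

def hsv_prior_loss(J: torch.Tensor) -> torch.Tensor:
    """Equation 7: penalize the gap between value V = max(R, G, B) and
    saturation S = (max - min) / max, which is near zero in haze-free
    regions (Zhu et al., 2015). J: (B, 3, H, W) in [0, 1]."""
    v, _ = J.max(dim=1)                   # value channel
    mn, _ = J.min(dim=1)
    s = (v - mn) / v.clamp(min=1e-6)      # saturation channel
    return torch.norm(v - s)
```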
Figure 2: The architecture of YOLY. Extracted from the
work developed by Li et al. (2021).
In addition, YOLY proposes a set of loss functions
for training the A-Net sub-network, including a regu-
larization function to avoid overfitting.
3.1.3 ZID
The approach followed by ZID is similar to YOLY (Li et al., 2021) in disentangling the problem into simpler sub-problems. However, ZID differs from YOLY in two fundamental respects. First, the loss functions differ: ZID adopts a loss similar to that used in DCP for J-Net training, while YOLY relies on the HSV color space observation. In addition, ZID introduces a smoothness regularization on the outputs of both T-Net and A-Net, whereas YOLY applies this regularization only to A-Net. Second, the network architectures differ: ZID adopts a structure analogous to the U-Net architecture, in contrast to YOLY, which relies on a non-degenerative architecture.
3.2 XYZ Network
In the following subsections, we present the second
proposed approach, which is characterized by the in-
tegration of the advantages offered by the XHOT,
YOLY and ZID methods. It is important to note that
the three unsupervised methods retain their respective
loss functions, which will play a fundamental role
in the training process of the robust model we are
proposing.
3.2.1 Architecture
As shown in Figure 3, our approach builds on the methods presented by Li et al. (2020) and Li et al. (2021). As seen in Section 3.1, these methods attempt to disentangle the effect of haze into layers that are simpler to compute; this disentanglement is possible thanks to Equation 2. The atmospheric scattering model contains three unknown variables, which is why the vast majority of dehazing techniques focus on finding their values. In the same way, XYZ uses this disentanglement to compute the haze-free image. Our approach therefore has two sub-network groups and one unsupervised sub-network: the gJ-Net group computes the clear image J(x) and, with its loss function, trains our neural network; gT-Net is in charge of computing the transmission map T(x); and finally the A-Net sub-network estimates the atmospheric light within the image.
gJ-Net. This group results from the combination of the XHOT, YOLY and ZID methods. XHOT, like YOLY, presents a non-degenerative neural network, which helps preserve important details for the haze-free image, such as the shapes of objects within the hazy image, while keeping the original dimensions. In addition, YOLY uses the HSV color space to guide the training of the unsupervised network, based on the observation by Zhu et al. (2015) that in haze-free areas the difference between brightness and saturation is close to zero. Finally, ZID trains with a loss function based on the Dark Channel Prior Loss (Golts et al., 2020) and uses a degenerative U-Net-type network with skip connections.
gT-Net. As with gJ-Net, this group is composed of the three methods seen previously, differing only in the output of each method: regardless of the method, each produces the transmission map of the image being processed.
A-Net. This sub-network is responsible for estimating the atmospheric light present in an image (independent of its content), so we assume that it arises from a latent space following a Gaussian distribution. For this reason, we decided to use a single neural network, specifically the Variational Autoencoder detailed in Section 3.1.1.
3.2.2 Loss Function
To train our model composed of three unsupervised methods, we use a joint loss function that takes into account the output of each method, as described in Equation 8:
L_XYZ = L_XHOT + L_YOLY + L_ZID    (8)
where L_XHOT is the loss function of our XHOT method, L_YOLY is the loss function of YOLY, and L_ZID is the loss function of ZID. We give the same weight to each loss function, since all three guide the training according to their respective assumptions: the HSV color space, the Dark Channel Prior Loss, and the XHOT loss function, which improves upon Golts et al. (2020) and Li et al. (2021).
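To make the training procedure concrete, the following is a minimal per-image optimization loop under our assumptions: `branches` is a hypothetical container for the XHOT, YOLY and ZID components, each returning its estimates together with its own loss term, and averaging the three J(x) estimates at the end is an illustrative fusion choice, not one prescribed by the text. Since learning is zero-shot, the loop runs on a single hazy image x:

```python
import torch

def dehaze_single_image(x, branches, num_iters=500, lr=1e-4):
    """Zero-shot dehazing of one image x of shape (1, 3, H, W): optimize
    all sub-networks on x alone by minimizing Equation 8, then return J(x).
    Each branch maps x to (J, t, A, loss), where `loss` is its own objective."""
    params = [p for b in branches for p in b.parameters()]
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(num_iters):
        opt.zero_grad()
        total = sum(b(x)[3] for b in branches)  # Equation 8: equal weights
        total.backward()
        opt.step()
    with torch.no_grad():
        # illustrative fusion: average the three haze-free estimates
        return torch.stack([b(x)[0] for b in branches]).mean(dim=0)
```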
3.3 Datasets
We conducted our experiments using the REalistic
Single Image Dehazing (RESIDE) dataset (Li et al.,
2019), a well-recognized resource for large-scale haze
removal evaluation. This dataset is composed of two
test subsets: SOTS and HSTS. The SOTS subset con-
sists of 500 indoor hazy images, which were syntheti-
cally generated using a physical model with manually
calibrated parameters. In contrast, the HSTS subset
contains 10 synthetic hazy images and 10 real-world
hazy images captured in diverse scenes. Addition-
ally, the RESIDE dataset offers training subsets: OTS
(Outdoor Training Set) comprises a total of 72,135
hazy images generated from 2,061 ground truth im-
ages. Meanwhile, ITS (Indoor Training Set) consists
of 13,990 hazy images generated from 1,399 ground
truth images.
As mentioned previously, the methods described in Section 3.1 do not require a "training" dataset, but only a set of images on which to evaluate the method. Based on Zero-Shot Learning, these methods remove haze by adapting, over a number of iterations, to each image evaluated.
3.4 Metrics
To assess the performance of our model, as well as of other methods, we employed the Structural Similarity Index Measure (SSIM) and the Peak Signal-to-Noise Ratio (PSNR) (Wang et al., 2004) as image quality metrics, both commonly used in haze removal studies. SSIM quantifies the structural similarity between the haze-free reference image and the output image obtained from our proposed technique. PSNR measures the ratio between the maximum possible signal power and the power of the noise present in the images.

These two metrics serve as indicators of the image's fidelity to the haze-free reference: models that achieve higher values produce images of higher quality and greater similarity to the haze-free reference.
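For reproducibility, both metrics are available off the shelf; a sketch using scikit-image (our choice of library, not one named in the text):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(dehazed: np.ndarray, reference: np.ndarray):
    """PSNR and SSIM between a dehazed image and its haze-free reference,
    both float arrays in [0, 1] with shape (H, W, 3)."""
    psnr = peak_signal_noise_ratio(reference, dehazed, data_range=1.0)
    ssim = structural_similarity(reference, dehazed,
                                 channel_axis=2, data_range=1.0)
    return psnr, ssim
```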
4 RESULTS
In our experiments, we carry out both qualitative and
quantitative assessments, as detailed in Section 3.4.
Figure 3: The architecture of our proposed XYZ approach. XYZ includes two groups of sub-networks: the gJ-Net group (XHOT, YOLY and ZID) for clean image estimation J(x), and the gT-Net group for transmission map estimation T(x). In addition, XYZ maintains layer disentanglement for the estimation of A.
The test dataset includes manually generated synthetic data as well as real-world images provided by Li et al. (2019). These assessments cover a variety of aspects and metrics to provide a comprehensive view of the performance of our haze removal approach.
4.1 Quantitative Comparison
In this section, we present the results of applying our approach of disentangling complex layers into simpler layers, compared with the methods proposed by Li et al. (2020) and Li et al. (2021). It is worth mentioning that the methods against which we compare also use the approach of disentangling haze into simpler layers.
As shown in Table 1, our XHOT network does not surpass the ZID and YOLY methods on the SSIM and PSNR metrics; nevertheless, it still represents a meaningful step toward lightweight haze removal. Our XYZ network, on the other hand, achieves the best results on the SOTS dataset, which contains indoor images, in contrast to the HSTS test set, where it does not surpass the other two methods. It is worth clarifying that, where our technique does not improve on the state of the art, it is the second-best method.
4.2 Qualitative Comparison
In this section, we observe the visual results of haze removal on outdoor images and in an indoor environment. Figures 4 and 6 show the results achieved by the unsupervised learning-based methods XHOT, ZID (Li et al., 2020), YOLY (Li et al., 2021) and XYZ, respectively. Visually, the results are close to the reference image in terms of color and detail.
From Figure 4, we can observe that the XHOT method produces visually better results in areas such as the sky, since it does not directly use a DCP approximation, which normally fails to distinguish haze from clouds. ZID, which does use a DCP approximation, presents the most artifacts for the same reason. XYZ attempts to make this distinction more coherently, but still presents artifacts. Figure 5 shows a set of real-world images from the HSTS dataset, on which XYZ manages to eliminate the haze in a large part of each hazy image, although it still does not eliminate the haze entirely.
Figure 6 shows haze removal in indoor environments, where the ZID method behaves better on the third image, since it does not lose color detail in areas such as the floor. However, XYZ achieves better results in images with more light, as in the first, second and fourth images.
5 CONCLUSIONS
In this work, we presented an approach to haze removal based on the disentanglement of complex layers into simpler layers. We developed an unsupervised method, called XHOT, which is simple and lightweight, although, to date, it does not surpass the leading approaches in the state of the art of unsupervised haze removal. Additionally, we proposed a group disentanglement approach using three unsupervised methods:
Table 1: Results for XHOT, YOLY, ZID and XYZ methods applied to the SOTS and HSTS datasets. The results are shown in
relation to SSIM and PSNR metrics.
Network                   SOTS Indoor         HSTS Outdoor
                          SSIM     PSNR       SSIM     PSNR
ZID (Li et al., 2020)     0.815    18.313     0.851    21.650
YOLY (Li et al., 2021)    0.807    17.950     0.832    22.217
XHOT                      0.803    17.860     0.829    21.430
XYZ                       0.818    18.530     0.846    21.680
Figure 4: Qualitative comparisons on the HSTS Outdoor dataset for different methods: (a) Haze, (b) XHOT, (c) YOLY, (d) ZID, (e) XYZ, (f) Ground Truth.
Figure 5: Qualitative comparisons on HSTS Real World for different methods: (a) Haze, (b) XHOT, (c) YOLY, (d) ZID, (e) XYZ.
XHOT, YOLY and ZID.
This approach, called XYZ, represents an effec-
tive strategy that combines the advantages of these
three individual methods. XYZ results show signif-
icant improvements in image quality metrics, such as
SSIM and PSNR, supporting its effectiveness in haze
removal. In other words, both approaches presented
in this study address the challenge of haze removal
in an unsupervised manner. This is especially valu-
able as we overcome the limitation of the lack of real-
world paired images and eliminate the need to train
with an extensive dataset. Our methods address the
Figure 6: Qualitative comparisons on the SOTS Indoor dataset for different methods: (a) Haze, (b) XHOT, (c) YOLY, (d) ZID, (e) XYZ, (f) Ground Truth.
haze problem individually for each image, which rep-
resents a significant advance in unsupervised haze re-
moval.
REFERENCES
Berman, D., Treibitz, T., and Avidan, S. (2016). Non-Local
Image Dehazing. In IEEE Conference on Computer
Vision and Pattern Recognition, pages 1674–1682.
Cai, B., Xu, X., Jia, K., Qing, C., and Tao, D. (2016).
DehazeNet: An End-to-End System for Single Image
Haze Removal. IEEE Transactions on Image Process-
ing, 25(11):5187–5198.
Chen, X., Li, Y., Kong, C., and Dai, L. (2022). Unpaired
Image Dehazing With Physical-Guided Restoration
and Depth-Guided Refinement. IEEE Signal Process-
ing Letters, 29:587–591.
Dong, H., Pan, J., Xiang, L., Hu, Z., Zhang, X., Wang,
F., and Yang, M.-H. (2020). Multi-Scale Boosted
Dehazing Network With Dense Feature Fusion. In
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition, pages 2154–2164.
Engin, D., Genc, A., and Ekenel, H. (2018). Cycle-Dehaze:
Enhanced CycleGAN for Single Image Dehazing. In
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition Workshops, pages 938–9388, Los
Alamitos, CA, USA. IEEE Computer Society.
Golts, A., Freedman, D., and Elad, M. (2020). Unsuper-
vised Single Image Dehazing Using Dark Channel
Prior Loss. IEEE Transactions on Image Processing,
29.
He, K., Sun, J., and Tang, X. (2009). Single Image Haze
Removal Using Dark Channel Prior. In IEEE Con-
ference on Computer Vision and Pattern Recognition,
pages 1956–1963.
Kingma, D. and Welling, M. (2014). Auto-Encoding Vari-
ational Bayes. arXiv preprint arXiv:1312.6114, pages
1–14.
Li, B., Gou, Y., Gu, S., Liu, J. Z., Zhou, J. T., and Peng, X.
(2021). You Only Look Yourself: Unsupervised and
Untrained Single Image Dehazing Neural Network.
International Journal of Computer Vision, pages 1–
14.
Li, B., Gou, Y., Liu, J. Z., Zhu, H., Zhou, J. T., and Peng,
X. (2020). Zero-Shot Image Dehazing. IEEE Trans-
actions on Image Processing, 29:8457–8466.
Li, B., Ren, W., Fu, D., Tao, D., Feng, D., Zeng, W., and
Wang, Z. (2019). Benchmarking Single-Image De-
hazing and Beyond. IEEE Transactions on Image Pro-
cessing, 28(1):492–505.
Liu, X., Ma, Y., Shi, Z., and Chen, J. (2019). GridDe-
hazeNet: Attention-Based Multi-Scale Network for
Image Dehazing. In International Conference on
Computer Vision.
McCartney, E. (1978). Optics of the Atmosphere Scat-
tering by Molecules and Particles. IEEE Journal of
Quantum Electronics, 14(9):698–699.
Qin, X., Wang, Z., Bai, Y., Xie, X., and Jia, H. (2020).
FFA-Net: Feature Fusion Attention Network for Sin-
gle Image Dehazing. AAAI Conference on Artificial
Intelligence, 34(7):11908–11915.
Tan, R. T. (2008). Visibility in Bad Weather from a Single
Image. In IEEE Conference on Computer Vision and
Pattern Recognition, pages 1–8.
Wang, E., Shu, S., and Fan, C. (2022a). CNN-based Single
Image Dehazing via Attention Module. In IEEE 5th
International Conference on Automation, Electronics
and Electrical Engineering, pages 683–687.
Wang, Y., Yan, X., Guan, D., Wei, M., Chen, Y., Zhang,
X.-P., and Li, J. (2022b). Cycle-SNSPGAN: To-
wards Real-World Image Dehazing via Cycle Spectral
Normalized Soft Likelihood Estimation Patch GAN.
IEEE Transactions on Intelligent Transportation Sys-
tems, 23(11):20368–20382.
Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P.
(2004). Image Quality Assessment: From Error Vis-
ibility to Structural Similarity. IEEE Transactions on
Image Processing, 13(4):600–612.
Zheng, Z., Ren, W., Cao, X., Hu, X., Wang, T., Song, F., and
Jia, X. (2021). Ultra-High-Definition Image Dehazing
via Multi-Guided Bilateral Learning. In IEEE/CVF
Conference on Computer Vision and Pattern Recogni-
tion, pages 16180–16189.
Zhu, Q., Mai, J., and Shao, L. (2015). A Fast Single Im-
age Haze Removal Algorithm Using Color Attenua-
tion Prior. IEEE Transactions on Image Processing,
24(11):3522–3533.