Restoration of Temporal Image Sequence from a Single Image Captured by a Correlation Image Sensor

Kohei Kawade^1, Akihiro Wakita^1, Tatsuya Yokota^1, Hidekata Hontani^1 and Shigeru Ando^2

^1 Computer Science and Engineering, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya-shi, Aichi 466-8555, Japan
^2 Ricoh Elemex Corporation, 3-69, Ida-cho, Okazaki-shi, Aichi 444-8586, Japan
Keywords: Optical Flow, Correlation Image Sensor.
Abstract:
We propose a method that restores a temporal image sequence, which describes how a scene temporally
changed during the exposure period, from a given still image captured by a correlation image sensor (CIS).
The restored images have higher spatial resolutions than the original still image, and the restored temporal
sequence would be useful for motion analysis in applications such as landmark tracking and video labeling.
The CIS is different from conventional image sensors because each pixel of the CIS can directly measure the
Fourier coefficients of the temporal change of the light intensity observed during the exposure period. Hence, given a single image captured by the CIS, one can restore the temporal image sequence by computing the Fourier series of the temporal change of the light strength at each pixel. Through this temporal sequence
restoration, one can also reduce motion blur. The proposed method improves the performance of motion blur
reduction by estimating the Fourier coefficients of the frequencies higher than the measured ones. In this
work, we show that the Fourier coefficients of the higher frequencies can be estimated based on the optical
flow constraint. Experimental results with images captured by the CIS are presented.
1 INTRODUCTION
Each pixel in a traditional image sensor measures the
temporal integration of light strength over an expo-
sure time. Because of the temporal integration, the
measured pixel value does not contain information
of the temporal change of the light strength gener-
ated during the exposure time. Moving objects or
moving cameras generate the temporal change of the
light strength, but a traditional camera fails to record
this temporal change and hence restoring the tempo-
ral change from a given still image is difficult. In this
article, we propose a method that restores the tempo-
ral image sequence, which describes how the image
temporally changed during the exposure period, from
a given still image captured by a correlation image
sensor (CIS)(Ando et al., 1997)(Ando and Kimachi,
2003)(Wei et al., 2009)(Hontani et al., 2014). Several methods have been proposed for motion estimation or for temporal change restoration from a single image; such restoration is inherently difficult because a single still image alone does not contain enough information about temporal changes.
As described above, the proposed method cap-
tures an image by using a CIS. Each pixel of a CIS
measures not only a temporal integration of the light
strength but also a temporal correlation between the
light strength and a reference temporal signal supplied
from the outside of the CIS to each pixel during an
exposure period. By using sinusoidal functions as the
references, one can measure a set of the Fourier coef-
ficients of the light strength at each pixel. These measured Fourier coefficients encode information on the temporal change of the light strength during the exposure period, and it has been reported that a CIS makes the problem of optical flow computation well-posed (Wei et al., 2009). The proposed method
restores the temporal change of the image intensity at
each pixel by using the measured Fourier coefficients:
Just by computing the Fourier series of the tempo-
ral change at each pixel, one can approximately re-
store the temporal change of the image. The method
improves the accuracy of the approximation by using additional information obtained through optical flow computation. The temporal change restored by the Fourier series is necessarily periodic, but the true temporal change is not always periodic. In other words, if one restores the
temporal change of the light intensity only by using
the Fourier series, the restored light intensity at the
beginning of the exposure period is always the same
as the intensity at the end of the period. The proposed
method thus improves the accuracy by explicitly esti-
mating the difference between the light intensity at the
beginning and that at the end of the exposure period
through optical flow computation as will be described
later. The proposed method can restore an instanta-
neous image that describes the spatial distribution of
the light intensity on the image sensor corresponding
to an arbitrary point of time during the exposure time.
The restored instantaneous image should have motion
blur less than the given original image. Our proposed
method has strong relationships with methods used
for motion blur reduction. The authors are interested in comparing the motion blur reduction performance of the proposed method with that of existing methods, but such a comparison is outside the scope of this article. One of the strong points of the proposed method is that it restores a series of images with higher spatial resolution, representing the temporal change of the images while the shutter is open. In the next section, we describe methods of
temporal change restoration and of blind motion de-
blurring, which can be interpreted as the restoration of
the instantaneous image that describes the light inten-
sity at some specific point of time during the exposure
period.
2 RELATED WORK
Several methods that can restore the temporal change
of the light strength during the exposure time have
been proposed. To the best of our knowledge, all
those methods employ an active sensing strategy:
A coded light, which is synchronized with the tim-
ing of the camera shutter, is used when a still im-
age is captured and the temporal change of the light
strength is decoded by a temporal high-frequency pat-
tern of the shutter (Kadambi et al., 2013)(Velten et al.,
2013)(Heide et al., 2013). Such a camera can mea-
sure the depth in realtime and can restore the tempo-
ral change of the light strength during the exposure
time. In contrast to these methods, our method em-
ploys a passive sensing strategy: No encoded light
that is specific to the camera used is required.
From the view point of the passive sensing strat-
egy, the proposed method has strong relationships
with methods for blind motion deblurring. One can
divide the methods of nonuniform blind deblurring into two classes: In one class, a given image is
captured by a traditional image sensor. In the other
class, a nontraditional sensor is used for capturing
the image. The majority of the methods in the for-
mer class employ a Bayesian framework for remov-
ing motion blurs. These methods combine natural
image priors(Levin, 2006)(Fergus et al., 2006)(Shan
et al., 2008)(Cho and Lee, 2009)(Cai et al., 2012)(Xu
et al., 2013)(Deshpande and Patnaik, 2014), blurring
kernel priors(Jia, 2007)(Shan et al., 2008)(Cho and
Lee, 2009)(Xu and Jia, 2010)(Cai et al., 2012)(Xu
et al., 2013), or models of motions(Jia, 2007)(Shan
et al., 2007)(Whyte et al., 2012), and optimization
techniques to recover both the blurring kernel and the
deblurred latent image simultaneously.
The natural image priors include distribution mod-
els of spatial gradient magnitudes(Field, 1994)(Fer-
gus et al., 2006)(Shan et al., 2008) and sparse rep-
resentations based on local appearances(Deshpande
and Patnaik, 2014) or on the gradients(Shan et al.,
2008)(Xu and Jia, 2010)(Cai et al., 2012). The
blurring kernel priors play important roles especially
when the motion blurs result from camera shakes.
For example, a motion density function(Gupta et al.,
2010), spatially sparse local filter(Fergus et al., 2006),
camera geometry model(Whyte et al., 2012), and
framelet-based sparse representation(Cai et al., 2012)
are introduced as the kernel priors. Object motion
estimation is useful for removing motion blurs(Jia,
2007), and a parametric motion descriptor(Shan et al.,
2007) is used for the estimation from a single given
image. The proposed method also estimates optical
flow from a single given image and uses it for remov-
ing motion blurs.
In the latter class, nontraditional image sensors
are used for obtaining the additional information use-
ful for motion deblurring. Hybrid cameras(Nayar and
Ben-Ezra, 2004)(Tai et al., 2010), which detect cam-
era motion using data from a video camera attached
to a still camera, are used for motion deblurring. In-
ertial measurement sensors are also used for the esti-
mation of camera motion(Joshi et al., 2010). Among
the methods in this class, the proposed method is most
similar to the motion deblurring methods using coded
exposure photography(Raskar et al., 2006)(Agrawal
and Raskar, 2009)(McCloskey et al., 2012). Control-
ling the camera’s shutter open and close during the ex-
posure period with a binary pseudo-random sequence,
one obtains a broad-band motion blurring kernel that
preserves high-frequency spatial details. Combined with the estimation of object motion, one can estimate the blurring kernel and compute the latent nonblurred image by deconvolving the given motion-blurred image with the estimated kernel.
In coded exposure photography, the camera en-
codes the temporal change of incident light strength
at each pixel during the exposure period with the bi-
nary pseudo-random shutter pattern. A CIS used in
the proposed method, on the other hand, encodes the
temporal change with analog sinusoidal reference sig-
nals. As will be described later, a CIS has three chan-
nels each of which encodes the temporal change with
a different reference signal and makes the problem of
optical flow computation well-posed. The proposed
method first computes the optical flow of a given sin-
gle image and then uses the optical flow for restoring
higher frequency components of the temporal change
of the light strength during the exposure period.
The contributions of this study are as follows: (1)
A completely passive method is proposed for restor-
ing the temporal change of the light strength during
the exposure period from a single still image captured
by a CIS, (2) To improve the quality of the restored
images, a method is proposed for computing the tem-
poral frequency components that are higher than those
of the reference signals, and (3) It is shown that the
restoration can reduce motion blur.
3 CORRELATION IMAGE SENSOR (CIS)
Let the exposure time be denoted by $T$ and let the time during which the shutter is open be denoted by $t$, where $0 \le t \le T$. Let the coordinates on the image sensor be denoted by $\mathbf{x} = (x, y)^\top$ and let $f(t,\mathbf{x})$ denote the strength of the incident light that comes from the lens system to the pixel at location $\mathbf{x}$ at time $t$. The location variable $\mathbf{x}$ is often omitted, as in $f(t) = f(t,\mathbf{x})$, when the location is clear from the context.

The CIS used in this study has three channels, and a pixel value of the $s$-th channel ($s \in \{1,2,3\}$) denotes the temporal correlation between the light strength, $f(t,\mathbf{x})$, and a reference signal, $r_s(t)$, which is supplied from the outside of the CIS to all pixels, as
$$I_s(\mathbf{x}) = \int_0^T f(t,\mathbf{x})\, r_s(t)\, dt.$$
Let $r_s(t) = \cos(n\omega_0 t + 2(s-1)\pi/3) + 1/3$, where $\omega_0 = 2\pi/T$ and $n \in \mathbb{Z}$. Then, we can measure the following three values at each pixel simultaneously (Ando et al., 1997)(Ando and Kimachi, 2003)(Wei et al., 2009):
$$g_0(\mathbf{x}) = \frac{1}{T}\int_0^T f(t,\mathbf{x})\, dt, \qquad (1)$$
$$g_{n,R}(\mathbf{x}) = \frac{1}{T}\int_0^T f(t,\mathbf{x})\cos(n\omega_0 t)\, dt, \qquad (2)$$
and
$$g_{n,I}(\mathbf{x}) = -\frac{1}{T}\int_0^T f(t,\mathbf{x})\sin(n\omega_0 t)\, dt, \qquad (3)$$
where $g_{n,R}(\mathbf{x})$ and $g_{n,I}(\mathbf{x})$ are the real and imaginary parts of a complex Fourier coefficient $g_n$ that corresponds to the frequency $n/T$. It should be noted that no image sensor can measure the light strength $f(t,\mathbf{x})$ directly and that traditional image sensors can measure only $g_0(\mathbf{x})$. In this manuscript, we set $n = 1$, and a CIS measures $g_0(\mathbf{x})$ and $g_1(\mathbf{x}) = g_{1,R}(\mathbf{x}) + j\, g_{1,I}(\mathbf{x})$, where $j$ is the imaginary unit.
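To make the measurement model concrete, the following minimal NumPy sketch (ours, not the authors' code; the function name and the frame-stack discretization are assumptions) simulates the three CIS outputs from a stack of $M$ frames that discretizes $f(t,\mathbf{x})$ over one exposure, in the same way the simulated data of Section 5 are generated:

```python
import numpy as np

def simulate_cis(frames, n=1):
    """Simulate the CIS outputs of Eqs. (1)-(3) from M frames.

    frames : array of shape (M, H, W); frames[k] plays the role of
             f(t = kT/M, x), with the exposure time T normalized to 1.
    """
    M = frames.shape[0]
    phase = 2.0 * np.pi * n * np.arange(M) / M    # n * omega_0 * t with T = 1
    g0 = frames.mean(axis=0)                      # Eq. (1): temporal average
    # Eq. (6): g_n = (1/T) * integral of f(t,x) * exp(-j n omega_0 t) dt
    gn = np.tensordot(np.exp(-1j * phase), frames, axes=(0, 0)) / M
    return g0, gn.real, gn.imag                   # g_{n,R}, g_{n,I} of Eqs. (2)-(3)
```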
3.1 Optical Flow Computation with a CIS
We can compute an optical flow $\mathbf{v}(\mathbf{x}) = (v_x(\mathbf{x}), v_y(\mathbf{x}))^\top$ from a single image measured by the CIS. In the computation, we assume that the light strength arriving from each point of an object is invariant with respect to time and that the following equation is satisfied while the shutter is open, $t \in [0, T]$:
$$\frac{d}{dt} f(t,\mathbf{x}) = \left( v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} + \frac{\partial}{\partial t} \right) f(t,\mathbf{x}) = 0. \qquad (4)$$
Integrating (4) over the exposure time with the weight $e^{-jn\omega_0 t}/T$, where $n \in \mathbb{Z}$, we obtain the following equation:
$$
\begin{aligned}
0 &= \frac{1}{T}\int_0^T \left( v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} + \frac{\partial}{\partial t} \right) f(t,\mathbf{x})\, e^{-jn\omega_0 t}\, dt \\
  &= \frac{1}{T}\int_0^T \left( v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} \right) f(t,\mathbf{x})\, e^{-jn\omega_0 t}\, dt + \frac{1}{T}\int_0^T \frac{\partial f(t,\mathbf{x})}{\partial t}\, e^{-jn\omega_0 t}\, dt \\
  &= \left( v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} \right) g_n(\mathbf{x}) + \frac{1}{T}\left[ f(t,\mathbf{x})\, e^{-jn\omega_0 t} \right]_0^T + jn\omega_0 \cdot \frac{1}{T}\int_0^T f(t,\mathbf{x})\, e^{-jn\omega_0 t}\, dt \\
  &= \left( v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} \right) g_n(\mathbf{x}) + F_0(\mathbf{x}) + jn\omega_0\, g_n(\mathbf{x}),
\end{aligned}
\qquad (5)
$$
where $g_n(\mathbf{x})$ is an image of the Fourier coefficients of the temporal signals $f(t,\mathbf{x})$ such that
$$g_n(\mathbf{x}) = \frac{1}{T}\int_0^T f(t,\mathbf{x})\, e^{-jn\omega_0 t}\, dt \qquad (6)$$
and $F_0$ denotes the difference of the boundary values such that $F_0(\mathbf{x}) = [f(t,\mathbf{x})]_0^T/T = \{f(T,\mathbf{x}) - f(0,\mathbf{x})\}/T$; note that the boundary term in (5) reduces to $F_0$ because $e^{-jn\omega_0 T} = e^{-j2\pi n} = 1$.
As described above, a CIS can measure both $g_0(\mathbf{x})$ and $g_1(\mathbf{x})$ at each pixel. Substituting $n = 0$ and $n = 1$ into (5), we obtain two different equations. Eliminating the variable $F_0$ from these two equations, we can derive a linear complex equation as follows:
$$\left( v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} \right)\left\{ g_1(\mathbf{x}) - g_0(\mathbf{x}) \right\} + j\omega_0\, g_1(\mathbf{x}) = 0. \qquad (7)$$
Equation (7) consists of two real-coefficient equations, from which we can derive the following linear equation in $\mathbf{v}$:
$$A\mathbf{v} = \mathbf{d}, \qquad (8)$$
where $\mathbf{d} = [\omega_0 g_{1,I},\, -\omega_0 g_{1,R}]^\top$ and
$$A = \begin{bmatrix} \partial_x(g_{1,R} - g_0) & \partial_y(g_{1,R} - g_0) \\ \partial_x g_{1,I} & \partial_y g_{1,I} \end{bmatrix}. \qquad (9)$$
One can estimate $\mathbf{v}(\mathbf{x})$ by solving the linear equation shown in (8). It should be noted that the problem of optical flow computation is well-posed when one captures images with a CIS, though the problem is ill-posed when one captures images with a traditional image sensor. The added measurements, $g_n(\mathbf{x})$, introduce extra information that constrains the solution.
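As a concrete illustration, the following sketch (ours; it assumes images indexed as [y, x], central-difference gradients, and a small damping term to keep the system invertible in textureless regions) solves the 2-by-2 system (8) independently at every pixel:

```python
import numpy as np

def flow_from_cis(g0, g1, w0, eps=1e-6):
    """Per-pixel solve of Eq. (8), A v = d, with A and d built as in (9).

    g0 : real image; g1 : complex image g_{1,R} + j*g_{1,I}; w0 = 2*pi/T.
    Returns v of shape (H, W, 2) holding (v_x, v_y) at every pixel.
    """
    dy_r, dx_r = np.gradient(g1.real - g0)   # gradient of g_{1,R} - g_0
    dy_i, dx_i = np.gradient(g1.imag)        # gradient of g_{1,I}
    A = np.stack([np.stack([dx_r, dy_r], axis=-1),
                  np.stack([dx_i, dy_i], axis=-1)], axis=-2)   # (H, W, 2, 2)
    d = np.stack([w0 * g1.imag, -w0 * g1.real], axis=-1)       # (H, W, 2)
    A = A + eps * np.eye(2)                  # damping for flat regions
    return np.linalg.solve(A, d[..., None])[..., 0]
```

In practice one would aggregate the equations over a small window or add spatial regularization, as is common in optical flow estimation; the pixelwise solve above only illustrates the well-posedness of (8).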
Substituting $n = 0$ and $n = 1$ into (5), we obtain the following two equations:
$$\left( v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} \right) g_0(\mathbf{x}) + F_0(\mathbf{x}) = 0, \qquad (10)$$
$$\left( v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} \right) g_{1,R}(\mathbf{x}) + F_0(\mathbf{x}) - \omega_0\, g_{1,I}(\mathbf{x}) = 0. \qquad (11)$$
Solving the system of linear equations (10) and (11), we obtain the following equation:
$$F_0(\mathbf{x}) = -\frac{1}{2}\left( v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} \right)\left( g_{1,R}(\mathbf{x}) + g_0(\mathbf{x}) \right) + \frac{\omega_0}{2}\, g_{1,I}(\mathbf{x}). \qquad (12)$$
Once we compute the flow $\mathbf{v}(\mathbf{x})$, the difference of the boundary values, $F_0(\mathbf{x})$, can be estimated by evaluating (12). The values $v_x$, $v_y$, and $F_0$ computed at each pixel are used in the restoration process described in the next section.
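Continuing the sketch above, $F_0(\mathbf{x})$ is then a direct evaluation of (12); a possible implementation (ours, with the same indexing assumptions):

```python
import numpy as np

def boundary_difference(g0, g1, v, w0):
    """Evaluate Eq. (12):
    F_0 = -(1/2)(v . grad)(g_{1,R} + g_0) + (w0/2) g_{1,I}."""
    dy, dx = np.gradient(g1.real + g0)
    advection = v[..., 0] * dx + v[..., 1] * dy   # (v_x d/dx + v_y d/dy)(...)
    return -0.5 * advection + 0.5 * w0 * g1.imag
```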
4 RESTORATION OF TEMPORAL CHANGE OF LIGHT STRENGTH DURING EXPOSURE PERIOD
The objective is to restore the temporal change of the light strength, $f(t,\mathbf{x})$ ($0 \le t \le T$), at each pixel from the measurements $g_0(\mathbf{x})$ and $g_1(\mathbf{x})$ obtained by a CIS. It is not difficult to approximately restore $f(t,\mathbf{x})$ by using the $N$-th order Fourier series expansion of $f(t) = f(t,\mathbf{x})$ ($N \in \mathbb{N}$) such that
$$\tilde{f}_N(t) = \sum_{n=-N}^{N} g_n\, e^{jn\omega_0 t}, \qquad (13)$$
where $g_n$ ($n \in \mathbb{Z}$) are the Fourier coefficients defined in (6). As $g_{-n} = \overline{g_n}$ holds when $f(t)$ is a real function, the original signal $f(t)$ can be approximated by using the Fourier coefficients $g_0$ and $g_1$ measured by a CIS as follows:
$$\tilde{f}_1(t) = \sum_{n=-1}^{1} g_n\, e^{jn\omega_0 t} = g_0 + 2\,\mathrm{Re}\!\left[ g_1 e^{j\omega_0 t} \right], \qquad (14)$$
where $\mathrm{Re}[c]$ denotes the real part of a complex number $c$. The proposed method restores $f(t)$ by improving the approximation $\tilde{f}_1(t)$, restoring (i) the difference of the boundary values, $f(T) - f(0)$, and (ii) the higher frequency components $g_n$ ($n > 1$) that are not measured by the CIS.
4.1 Restoration of Boundary Values
In general, $\tilde{f}_N(0)$ and $\tilde{f}_N(T)$ approximate $f(0)$ and $f(T)$ inaccurately: $\tilde{f}_N(0) = \tilde{f}_N(T)$ always holds because $\tilde{f}_N(t)$ in (13) is a linear combination of the functions $e^{jn\omega_0 t}$ ($n = -N, \ldots, N$), which are all periodic with period $T$. This leads to $\tilde{F}_0(\mathbf{x}) = \{\tilde{f}_N(T) - \tilde{f}_N(0)\}/T = 0$, which is not always consistent with the true value of $F_0$. As a result, the value of $\tilde{f}_N(t)$ changes extremely rapidly near the boundaries $t = 0$ and $t = T$. The left panel of Fig. 1 shows examples of the signals restored from $g_n$ ($n = 1, 2, \ldots$). A true and unobservable signal $f(t)$, where $f(T) \ne f(0)$, is shown at the top of Fig. 1. As $N$ increases, the resultant $\tilde{f}_N(t)$ converges to the original signal, but $\tilde{f}_N(T) = \tilde{f}_N(0)$ is always satisfied and a rapid change appears near the measurement boundaries $t = 0$ and $t = T$.
The proposed method compensates for the rapid changes by adding a new function $s(t)$ to $\tilde{f}_N(t)$ so that the resultant function $\hat{f}_N(t) = \tilde{f}_N(t) + s(t)$ is consistent with the value of $F_0$ estimated by using (12) and with the Fourier coefficients $g_n(\mathbf{x})$ that are measured (or would be computed) from $f(t,\mathbf{x})$. The consistency with respect to the Fourier coefficients requires $s(t)$ to satisfy the following equation:
$$\int_0^T \tilde{f}_N(t)\, e^{-jn\omega_0 t}\, dt = \int_0^T \hat{f}_N(t)\, e^{-jn\omega_0 t}\, dt. \qquad (15)$$
This leads to
$$\int_0^T s(t)\, e^{-jn\omega_0 t}\, dt = 0. \qquad (16)$$
Substituting $n = 0$, we notice that $s(t)$ must be anti-symmetric with respect to the reflection at the center time of the exposure period, $t = T/2$. Among the anti-symmetric functions, we select the following one because it is consistent with the value of $F_0$:
$$s(t) = s_N(t) = F_0\, \frac{(t - T/2) + \sum_{n=1}^{N} a_n \sin(n\omega_0 t)}{T}, \qquad (17)$$
which always satisfies $s_N(T) - s_N(0) = F_0$. The coefficients $a_n$ in (17) are scalars, and their values should be determined so that $\hat{f}_N(t)$ is consistent
Figure 1: Examples of the approximations. An original signal $f(t)$ (top), the Fourier approximations $\tilde{f}_N(t)$ shown in (13) (panels (a) $\tilde{f}_0(t)$ and (c) $\tilde{f}_3(t)$), and the modified approximations $\hat{f}_N(t)$ in (19) (panels (b) $\hat{f}_0(t)$ and (d) $\hat{f}_3(t)$) are indicated.
Figure 2: (A) A graph of $H_0$ along the direction of $\mathbf{v}$. (B) Graphs of $H_0$ (black), $H_1$ (red), and $H_{-1}$ (red). $H_{\pm 1}$ improve the bandwidth of the resultant filter (blue).
with $g_n$ ($|n| \le N$). Solving (16) with respect to $a_n$, we get
$$a_n = \frac{2}{n\omega_0}, \qquad n = 1, 2, \ldots, N. \qquad (18)$$
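A short way to verify (18), added here for clarity: with $\omega_0 = 2\pi/T$, the sawtooth has the well-known sine series
$$t - \frac{T}{2} = -\sum_{n=1}^{\infty} \frac{2}{n\omega_0}\sin(n\omega_0 t), \qquad 0 < t < T,$$
so choosing $a_n = 2/(n\omega_0)$ in (17) cancels its first $N$ terms:
$$(t - T/2) + \sum_{n=1}^{N} \frac{2}{n\omega_0}\sin(n\omega_0 t) = -\sum_{n=N+1}^{\infty} \frac{2}{n\omega_0}\sin(n\omega_0 t),$$
whose Fourier coefficients of order $|n| \le N$ all vanish, as (16) requires.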
As a result, the proposed method restores the latent image by using the following equation:
$$\hat{f}_N(t,\mathbf{x}) = g_0(\mathbf{x}) + 2\sum_{n=1}^{N} \mathrm{Re}\!\left[ g_n(\mathbf{x})\, e^{jn\omega_0 t} \right] + \frac{F_0}{T}\left\{ (t - T/2) + \sum_{n=1}^{N} \frac{2}{n\omega_0}\sin(n\omega_0 t) \right\}. \qquad (19)$$
Panels (b) and (d) of Fig. 1 show examples of $\hat{f}_N(t)$ that approximate $f(t)$, which is a step function in this example. Comparing them with panels (a) and (c) of Fig. 1, respectively, we observe that the rapid changes near the measurement boundaries are suppressed.
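For reference, evaluating the boundary-corrected restoration (19) is straightforward; a minimal sketch (ours; the function name and the list-of-coefficients interface are assumptions, and the coefficient images $g_n$ for $n \ge 2$ may be measured or estimated as described in Section 4.2):

```python
import numpy as np

def restore_fhat(g0, g_list, F0, t, T):
    """Evaluate Eq. (19) at time t, 0 <= t <= T.

    g_list : [g_1, ..., g_N], complex Fourier-coefficient images;
    F0     : per-pixel boundary difference from Eq. (12).
    """
    w0 = 2.0 * np.pi / T
    f = g0.astype(float).copy()
    s = t - T / 2.0                         # linear term of s_N(t), Eq. (17)
    for n, gn in enumerate(g_list, start=1):
        f += 2.0 * np.real(gn * np.exp(1j * n * w0 * t))  # truncated series
        s += (2.0 / (n * w0)) * np.sin(n * w0 * t)        # a_n of Eq. (18)
    return f + (F0 / T) * s                 # add the compensation s_N(t)
```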
4.2 Restoration of Higher Frequency Components

The number of channels of a CIS is limited to three, and we use them for measuring the lower temporal frequency components of $f(t)$. By restoring the higher temporal frequency components of $f(t,\mathbf{x})$, one can restore higher spatial frequency components of $f(t,\mathbf{x})$ and deblur the motion-blurred gray-scale image $g_0(\mathbf{x})$ more crisply. As will be described later, one can restore the higher temporal frequency components of $f(t,\mathbf{x})$ by using (5). Before the restoration algorithm is described, the relationships between the temporal frequencies of $f(t) = f(t,\mathbf{x})$ and the spatial frequencies of $g_n(\mathbf{x})$ are discussed.
4.2.1 Spatial Motion Blur in Fourier Coefficient Image

Let the velocity of a moving target in an image be denoted by $\mathbf{v} = (v_x, v_y)^\top = v(\cos\theta, \sin\theta)^\top$, where $v$ denotes the speed and $\theta$ denotes the angle between $\mathbf{v}$ and the $x$-axis. The motion blur generated on the gray-scale image $g_0(\mathbf{x})$ by the target motion can be represented by a spatial convolution with a one-dimensional box filter $h_0(\mathbf{x})$ that averages the intensities along a line segment of length $vT$, the trajectory of the moving target:
$$g_0(\mathbf{x}) = f(\mathbf{x}) * h_0(\mathbf{x}), \qquad (20)$$
where $*$ denotes a spatial convolution,
$$h_0(\mathbf{x}) = \frac{1}{vT}\,\mathrm{rect}\!\left( \frac{x\cos\theta + y\sin\theta}{vT} \right)\delta(\mathbf{v}_\perp^\top \mathbf{x}), \qquad (21)$$
$\mathbf{v}_\perp$ denotes a unit vector perpendicular to $\mathbf{v}$,
$$\mathrm{rect}(x) = \begin{cases} 1, & \text{if } |x| \le 1/2, \\ 0, & \text{otherwise}, \end{cases} \qquad (22)$$
and $f(\mathbf{x}) = f(t = T/2, \mathbf{x})$. We fix the time to the center of the exposure period, $t = T/2$, to avoid the effects of the temporal boundaries of the measurements.
Figure 3: Top: Examples of the still images $\bar{f}(k,\mathbf{x})$, for $k = 1, 15, 30$. Bottom: Generated images $g_0(\mathbf{x})$, $g_{1,R}(\mathbf{x})$, and $g_{1,I}(\mathbf{x})$.
A motion blur generated on the complex Fourier coefficient image $g_n(\mathbf{x})$ by the target motion can be represented by a spatial convolution with a different one-dimensional filter, $h_n(\mathbf{x})$. Let $\mathbf{x} = \mathbf{v}t$ denote the location of the moving target in the image, where the origin of the coordinate system is temporarily set to the target's location at $t = 0$. Then, multiplying by $\mathbf{v}^\top$ from the left, we obtain $t = \mathbf{v}^\top\mathbf{x}/v^2 = (x\cos\theta + y\sin\theta)/v$. Substituting this equation and $\omega_0 = 2\pi/T$ into the representation of the reference temporal signal, $e^{-jn\omega_0 t}$, we obtain the corresponding spatial filter, $e^{-2\pi jn(x\cos\theta + y\sin\theta)/vT}$. The profile of this spatial filter along the trajectory of the moving target is a complex sinusoid with frequency $n/vT$. Multiplying this spatial filter with $h_0$, we get $h_n(\mathbf{x})$, where
$$h_n(\mathbf{x}) = h_0(\mathbf{x})\, e^{-2\pi jn(x\cos\theta + y\sin\theta)/vT}, \qquad (23)$$
and this spatial filter generates the Fourier coefficient image $g_n(\mathbf{x})$ as given by the following equation:
$$g_n(\mathbf{x}) = f(\mathbf{x}) * h_n(\mathbf{x}). \qquad (24)$$

Let the spatial Fourier transformation of $f(\mathbf{x})$ be denoted by $F(\mathbf{u})$, where $\mathbf{u} = (u_x, u_y)^\top$ denotes the two-dimensional spatial frequency, and let the Fourier transformation of $\tilde{f}_1(\mathbf{x})$ in (14) be denoted by $\tilde{F}_1(\mathbf{u})$. We can now derive the following equation from (14):
$$\tilde{F}_1(\mathbf{u}) = F(\mathbf{u})\left\{ H_0(\mathbf{u}) + H_1(\mathbf{u}) + H_{-1}(\mathbf{u}) \right\}, \qquad (25)$$
where $H_n(\mathbf{u})$ denotes the Fourier transformation of $h_n(\mathbf{x})$ and is given as
$$H_n(\mathbf{u}) = H_0(\mathbf{u}) * \delta(u_x\cos\theta + u_y\sin\theta + n/vT), \qquad (26)$$
where $*$ is now a convolution with respect to the frequencies. The Fourier transformation of the one-dimensional box filter $h_0(\mathbf{x})$ is given as a sinc function:
$$H_0(\mathbf{u}) = \frac{\sin(\pi(Tv_x u_x + Tv_y u_y))}{\pi(Tv_x u_x + Tv_y u_y)}. \qquad (27)$$
Finally, we obtain
$$H_n(\mathbf{u}) = \frac{\sin(\pi(Tv_x u_x + Tv_y u_y + n))}{\pi(Tv_x u_x + Tv_y u_y + n)}. \qquad (28)$$
Figure 2 shows the profiles of $H_0$ and $H_{\pm 1}$ along a line parallel to the motion direction. As shown by the profile of $H_0$, $h_0$ is a low-pass filter and generates the spatial motion blur of a moving target in $g_0(\mathbf{x})$. The graph of $H_n$ is obtained by shifting that of $H_0$ by $n/vT$ along the motion direction, and $h_n$ is a band-pass filter whose spatial center frequency is $n/vT$. The approximation $\hat{f}_1(\mathbf{x})$ shown in (19) is computed with $g_0$ and $g_1$, and the latter measurement, $g_1$, improves the spatial bandwidth by adding the two band-pass filters $H_1$ and $H_{-1}$ to the low-pass filter $H_0$, as shown in (25).
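The band-pass behavior described above is easy to check numerically; in the small sketch below (ours), the frequency axis is expressed in units of $1/vT$, as in Fig. 2, and numpy's normalized sinc(x) = sin(pi x)/(pi x) gives (27)-(28) directly:

```python
import numpy as np

def Hn_profile(u, n):
    """1-D profile of H_n along the motion direction, Eq. (28),
    with the frequency u expressed in units of 1/(vT)."""
    return np.sinc(u + n)    # np.sinc(x) = sin(pi*x)/(pi*x)

u = np.linspace(-5.0, 5.0, 1001)
# Eq. (25): effective filter of the first-order restoration (blue curve, Fig. 2B)
combined = Hn_profile(u, 0) + Hn_profile(u, 1) + Hn_profile(u, -1)
```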
It should be noted that one can increase the bandwidth of the restoration and obtain crisper images if one can estimate the higher frequency components $g_n(\mathbf{x})$ ($n > 1$) and approximate $f$ as follows:
$$\tilde{f}_N(t,\mathbf{x}) = \sum_{n=-N}^{N} g_n(\mathbf{x})\, e^{jn\omega_0 t}, \qquad (29)$$
where $N > 1$. The bandwidth of $\tilde{f}_N(t)$ is wider than that of $\tilde{f}_1(t)$ when $N > 1$, and the restoration accuracy is improved by the addition of $g_n$ ($n > 1$). The method for estimating $g_n(\mathbf{x})$ is described next.
4.2.2 Estimation with Sparsity Regularization
The proposed method estimates $g_n(\mathbf{x})$ ($n > 1$) by using (5).
Figure 4: Top row: $g_0(\mathbf{x})$ artificially generated from the $M = 30$ still images shown in Fig. 3, and $\hat{f}_3(t,\mathbf{x})$ for $t = 0$, $7/30$, $15/30$, and $22/30$. Second to bottom rows: $g_0(\mathbf{x})$ measured by a CIS and the results, $\mathbf{v}(\mathbf{x})$, $F_0(\mathbf{x})$, and $\hat{f}_3(t,\mathbf{x})$ for $t = 0$, $T/3$, and $2T/3$, computed from the images captured by the CIS.
As mentioned above, the values of $v_x(\mathbf{x})$, $v_y(\mathbf{x})$, and $F_0(\mathbf{x})$ can be estimated by using the measured values of $g_0(\mathbf{x})$ and $g_1(\mathbf{x})$; no other measurements are needed for the estimation. Once these values are estimated, $g_n(\mathbf{x})$ is the only unknown variable in the linear complex equation (5), and one can estimate its value by solving the equation.
These estimated values, though, can be inaccurate, especially where the differential equation (5) does not hold because of irregular events such as occlusions or specular reflections. We need a robust estimation method that automatically detects and excludes data that do not obey the employed model. Assuming that the regions in which equation (5) does not hold are sparse in a given image, the proposed method uses the regularization technique proposed in (Ayvaci et al., 2012) to make the estimation robust against such irregularities.

Let $e(\mathbf{x})$ denote a residual of the right-hand side of (5), defined as
$$e(\mathbf{x}) \doteq \left( v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} \right) g_n(\mathbf{x}) + F_0(\mathbf{x}) + jn\omega_0\, g_n(\mathbf{x}). \qquad (30)$$
Let $D$ denote the entire image domain and let $\Omega$ denote the subregions of $D$ in which (5) does not hold. We assume that the residual $e(\mathbf{x})$ obeys a normal distribution with zero mean and small variance if $\mathbf{x} \in D \setminus \Omega$, where (5) is satisfied. If $\mathbf{x} \in \Omega$, on the other hand, (5) is not satisfied and the residual $e(\mathbf{x})$ can take an arbitrary value, $\rho(\mathbf{x})$.

Let $e(\mathbf{x}) = e_1(\mathbf{x}) + e_2(\mathbf{x})$ such that
$$e_1(\mathbf{x}) = \begin{cases} \rho(\mathbf{x}), & \mathbf{x} \in \Omega, \\ 0, & \mathbf{x} \in D \setminus \Omega, \end{cases} \qquad (31)$$
and
$$e_2(\mathbf{x}) = \begin{cases} 0, & \mathbf{x} \in \Omega, \\ \mathcal{N}(\mathbf{x}), & \mathbf{x} \in D \setminus \Omega, \end{cases} \qquad (32)$$
where $\mathcal{N}$ denotes a variable that obeys the normal distribution. Then, $e_1$ is large and sparse, and $e_2$ is small and dense. Based on this discussion, the proposed method minimizes the following cost function:
$$J_n(g_n, e_1) = \| e_2 \|_{L^2(D)}^2 + \alpha \| e_1 \|_{L^1} + \beta \| g_n \|_{L^2}^2 = \left\| (\mathbf{v}^\top\nabla)\, g_n + F_0 + jn\omega_0\, g_n - e_1 \right\|_{L^2(D)}^2 + \alpha \| e_1 \|_{L^1} + \beta \| g_n \|_{L^2}^2, \qquad (33)$$
Figure 5: Left panel: the locations of the three pixels, indicated by three dots (yellow: $\mathbf{x}_{\mathrm{left}}$, red: $\mathbf{x}_{\mathrm{middle}}$, and $\mathbf{x}_{\mathrm{right}}$). Right panel: examples of $f(t = (k-1)/M)$ (left column), $\hat{f}_{N=1}(t)$ (middle column), and $\hat{f}_{N=3}(t)$ (right column) at the three pixels.
where $\alpha$ and $\beta$ are positive scalar coefficients for the regularization terms, and their values are experimentally determined. $J_n(g_n, e_1)$ is convex with respect to $g_n$ and $e_1$ when $\mathbf{v}$ and $F_0$ are fixed, and one can then obtain the unique solution of $g_n(\mathbf{x})$ and $e_1(\mathbf{x})$.
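The authors do not spell out their solver, so the following is only a rough proximal-gradient (ISTA-style) sketch of minimizing (33); the step size, the iteration count, the central-difference approximation of the operator and its adjoint, and all names are our assumptions:

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 for a complex-valued field."""
    mag = np.maximum(np.abs(z), 1e-12)
    return np.where(mag > tau, (1.0 - tau / mag) * z, 0.0)

def estimate_gn(v, F0, n, w0, alpha=0.1, beta=0.01, lr=0.05, iters=500):
    """Alternating proximal-gradient steps on g_n and e_1 for Eq. (33)."""
    vx, vy = v[..., 0], v[..., 1]
    g = np.zeros(F0.shape, dtype=complex)
    e1 = np.zeros_like(g)

    def L(u):                  # L(g) = (v . grad)g + j*n*w0*g, cf. Eq. (30)
        dy, dx = np.gradient(u)
        return vx * dx + vy * dy + 1j * n * w0 * u

    def Lt(r):                 # approximate adjoint: -div(v*r) - j*n*w0*r
        _, dx_x = np.gradient(vx * r)
        dy_y, _ = np.gradient(vy * r)
        return -(dx_x + dy_y) - 1j * n * w0 * r

    for _ in range(iters):
        r = L(g) + F0 - e1                                  # residual e_2
        g = g - lr * (2.0 * Lt(r) + 2.0 * beta * g)         # smooth part in g_n
        e1 = soft_threshold(e1 + 2.0 * lr * r, lr * alpha)  # prox step in e_1
    return g, e1
```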
5 EXPERIMENTAL RESULTS
The performance was evaluated with sets of simulated
data and with images captured by a CIS.
5.1 Experiments with Simulated Data
Sets of $M$ still images of a stationary scene, $\{\bar{f}(k,\mathbf{x}) \mid k = 1, 2, \ldots, M\}$, were captured by a slowly translating traditional camera to simulate $f(t,\mathbf{x})$ ($0 \le t < 1$), and $g_0(\mathbf{x})$ and $g_1(\mathbf{x})$ were artificially generated from each of the sets as follows:
$$g_0(\mathbf{x}) = \frac{1}{M}\sum_{k=1}^{M} \bar{f}(k,\mathbf{x}), \qquad (34)$$
$$g_{1,R}(\mathbf{x}) = \frac{1}{M}\sum_{k=1}^{M} \bar{f}(k,\mathbf{x})\cos(2\pi(k-1)/M), \qquad (35)$$
$$g_{1,I}(\mathbf{x}) = -\frac{1}{M}\sum_{k=1}^{M} \bar{f}(k,\mathbf{x})\sin(2\pi(k-1)/M). \qquad (36)$$
Figure 3 shows examples of $\bar{f}(k,\mathbf{x})$ ($M = 30$) and the corresponding $g_0(\mathbf{x})$ and $g_1(\mathbf{x})$. One can see the motion blurs in $g_0(\mathbf{x})$. The top row in Fig. 4 shows $g_0(\mathbf{x})$ again, together with some restored latent images, $\hat{f}_3(t,\mathbf{x})$ at $t = 0$, $7/30$, $15/30$, and $22/30$. Comparing with $g_0(\mathbf{x})$ shown on the left, we can see that the proposed method suppressed the motion blur in $g_0(\mathbf{x})$ and restored the temporal change of the light strength during the exposure time by computing $\hat{f}_3(t,\mathbf{x})$. Examples of the true signal $f(t = (k-1)/M, \mathbf{x})$ and the restored signals $\hat{f}_N(t,\mathbf{x})$ at three neighboring pixels are shown in Fig. 5. The locations of the three pixels are shown in the left panel of the figure. They were located near the right boundary of a tree, which moved toward the right. The left column in the right panel shows the true profiles of $f(t,\mathbf{x})$ at the three points, $\mathbf{x}_{\mathrm{left}}$, $\mathbf{x}_{\mathrm{middle}}$, and $\mathbf{x}_{\mathrm{right}}$. The value of $f$ decreased rapidly when the tree, which moved from left to right in the image, reached each pixel. The graphs in the middle and right columns show the temporal change of the light strength restored at each pixel with $N = 1$ (middle column) and $N = 3$ (right column) in (19), respectively. The latent true signals $f(t,\mathbf{x})$ in the left column were smooth enough that no higher frequency components were required to describe them. Hence, $\hat{f}_1(t,\mathbf{x})$, which consists of only lower frequency components, approximated the latent signal more accurately than $\hat{f}_3(t,\mathbf{x})$, in which one can see some aliasing-like artifacts. Adaptively determining the appropriate value of $N$ in (19) for each pixel is left as future work.
Figure 6: Enlarged parts of the images shown in Fig. 4: $g_0(\mathbf{x})$, $\hat{f}_3(t = 0, \mathbf{x})$, $\hat{f}_3(T/3, \mathbf{x})$, and $\hat{f}_3(2T/3, \mathbf{x})$.
5.2 Experiments with Images Captured by a CIS

A set of images of size $512 \times 704$ was captured by the CIS with the exposure time set to $T = 1/30$ s. Examples of $g_0(\mathbf{x})$ are shown in the leftmost panels of the second to bottom rows of Fig. 4. First, solving the linear equation (8), we computed the flow $\mathbf{v}$ and then, via (12), the difference of the boundary values, $F_0(\mathbf{x})$, for each image. Examples of the obtained results are shown in Fig. 4. As shown, $F_0(\mathbf{x})$ had nonzero values around the regions corresponding to the motion blurs. Then, minimizing $J_n(g_n, e_1)$ in (33), the method computed $g_n(\mathbf{x})$ ($n = 2, 3$) and $e_1$. Using the estimated values of $g_n$ ($n = 2, 3$) together with the measured values, $g_0$ and $g_1$, the method restored the images $\hat{f}_3(t,\mathbf{x})$ shown in the middle and bottom rows of Fig. 4. Enlarged parts of the images $g_0(\mathbf{x})$ and $\hat{f}_3(t,\mathbf{x})$ are shown in Fig. 6. Comparing with $g_0(\mathbf{x})$ in the leftmost panel, we can see that the restored images contain less motion blur.
6 SUMMARY AND FUTURE WORK

We proposed a CIS-based method that removes motion blur from a single image and restores the latent temporal images, which represent the temporal change of the light strength during the exposure period. We believe that the proposed method could substantially improve the stability and accuracy of motion analysis, including landmark tracking and optical flow computation, which are crucial in medical image analysis.
One advantage of the proposed method lies especially in the restoration of the latent temporal images. Restoring the temporal change of the light strength during the exposure time from a single given image is very difficult when the image is captured with a traditional image sensor. A CIS, on the other hand, modulates the temporal change of the light strength at each pixel with the sinusoidal reference signals and records the temporal change in its Fourier coefficients. Using these coefficients, one can compute the optical flow, $\mathbf{v}(\mathbf{x})$, and the difference of the boundary values, $F_0(\mathbf{x})$, from a single image captured by a CIS. Once one obtains $\mathbf{v}(\mathbf{x})$ and $F_0(\mathbf{x})$, one can restore the higher temporal frequency components, $g_n(\mathbf{x})$ ($n \ge 2$), based mainly on the optical flow constraint, which represents the temporal invariance of the strength of the incident light coming from an object point.
The biggest limitation of the proposed method is that the optical flow constraint (5) assumes that the flow observed at each pixel, $\mathbf{v}(\mathbf{x})$, is constant with respect to time during the exposure period. This does not hold especially when the motion blurs are generated by high-frequency motion such as camera shake. Many blind motion deblurring methods can estimate spatial blurring kernels from a single blurred image by introducing prior knowledge on natural images and/or on kernels. Future work includes employing the strategies of those blind motion deblurring methods to estimate spatial blurring kernels that are consistent not only with the blurred image $g_0(\mathbf{x})$ but also with the Fourier coefficient image $g_1(\mathbf{x})$, so that one can restore more accurate and crisp images representing the temporal change during the exposure time.
REFERENCES
Agrawal, A. and Raskar, R. (2009). Optimal single image
capture for motion deblurring. In Computer Vision
and Pattern Recognition, 2009. CVPR 2009. IEEE
Conference on , pages 2560–2567. IEEE.
Ando, S. and Kimachi, A. (2003). Correlation im-
age sensor: Two-dimensional matched detection of
amplitude-modulated light. Electron Devices, IEEE
Transactions on, 50(10):2059–2066.
Ando, S., Nakamura, T., and Sakaguchi, T. (1997). Ul-
trafast correlation image sensor: concept, design,
and applications. In Solid State Sensors and Actu-
ators, 1997. TRANSDUCERS’97 Chicago., 1997 In-
ternational Conference on, volume 1, pages 307–310.
IEEE.
Ayvaci, A., Raptis, M., and Soatto, S. (2012). Sparse occlu-
sion detection with optical flow. International Journal
of Computer Vision, 97(3):322–338.
Cai, J.-F., Ji, H., Liu, C., and Shen, Z. (2012). Framelet-
based blind motion deblurring from a single image.
Image Processing, IEEE Transactions on, 21(2):562–
572.
Cho, S. and Lee, S. (2009). Fast motion deblurring. In
ACM Transactions on Graphics (TOG), volume 28,
page 145. ACM.
Deshpande, A. M. and Patnaik, S. (2014). Uniform and non-
uniform single image deblurring based on sparse rep-
resentation and adaptive dictionary learning. The In-
ternational Journal of Multimedia & Its Applications
(IJMA), 6(01):47–60.
Fergus, R., Singh, B., Hertzmann, A., Roweis, S. T., and
Freeman, W. T. (2006). Removing camera shake from
a single photograph. ACM Transactions on Graphics
(TOG), 25(3):787–794.
Field, D. J. (1994). What is the goal of sensory coding?
Neural computation, 6(4):559–601.
Gupta, A., Joshi, N., Zitnick, C. L., Cohen, M., and Cur-
less, B. (2010). Single image deblurring using motion
density functions. In Computer Vision–ECCV 2010,
pages 171–184. Springer.
Heide, F., Hullin, M. B., Gregson, J., and Heidrich, W.
(2013). Low-budget transient imaging using photonic
mixer devices. ACM Transactions on Graphics (ToG),
32(4):45.
Hontani, H., Oishi, G., and Kitagawa, T. (2014). Local es-
timation of high velocity optical flow with correlation
image sensor. In Computer Vision–ECCV 2014, pages
235–249. Springer.
Jia, J. (2007). Single image motion deblurring using trans-
parency. In Computer Vision and Pattern Recogni-
tion, 2007. CVPR’07. IEEE Conference on, pages 1–
8. IEEE.
Joshi, N., Kang, S. B., Zitnick, C. L., and Szeliski, R.
(2010). Image deblurring using inertial measurement
sensors. In ACM Transactions on Graphics (TOG),
volume 29, page 30. ACM.
Kadambi, A., Whyte, R., Bhandari, A., Streeter, L., Barsi,
C., Dorrington, A., and Raskar, R. (2013). Coded
time of flight cameras: sparse deconvolution to ad-
dress multipath interference and recover time profiles.
ACM Transactions on Graphics (TOG), 32(6):167.
Levin, A. (2006). Blind motion deblurring using image
statistics. In Advances in Neural Information Process-
ing Systems, pages 841–848.
McCloskey, S., Ding, Y., and Yu, J. (2012). Design and esti-
mation of coded exposure point spread functions. Pat-
tern Analysis and Machine Intelligence, IEEE Trans-
actions on, 34(10):2071–2077.
Nayar, S. K. and Ben-Ezra, M. (2004). Motion-based mo-
tion deblurring. Pattern Analysis and Machine Intelli-
gence, IEEE Transactions on, 26(6):689–698.
Raskar, R., Agrawal, A., and Tumblin, J. (2006). Coded
exposure photography: motion deblurring using flut-
tered shutter. ACM Transactions on Graphics (TOG),
25(3):795–804.
Shan, Q., Jia, J., and Agarwala, A. (2008). High-quality mo-
tion deblurring from a single image. In ACM Transac-
tions on Graphics (TOG), volume 27, page 73. ACM.
Shan, Q., Xiong, W., and Jia, J. (2007). Rotational motion
deblurring of a rigid object from a single image. In
Computer Vision, 2007. ICCV 2007. IEEE 11th Inter-
national Conference on, pages 1–8. IEEE.
Tai, Y.-W., Du, H., Brown, M. S., and Lin, S. (2010).
Correction of spatially varying image and video mo-
tion blur using a hybrid camera. Pattern Analy-
sis and Machine Intelligence, IEEE Transactions on,
32(6):1012–1028.
Velten, A., Wu, D., Jarabo, A., Masia, B., Barsi, C., Joshi,
C., Lawson, E., Bawendi, M., Gutierrez, D., and
Raskar, R. (2013). Femto-photography: capturing and
visualizing the propagation of light. ACM Transac-
tions on Graphics (TOG), 32(4):44.
Wei, D., Masurel, P., Kurihara, T., and Ando, S. (2009).
Optical flow determination with complex-sinusoidally
modulated imaging. relation, 7(8):9.
Whyte, O., Sivic, J., Zisserman, A., and Ponce, J. (2012).
Non-uniform deblurring for shaken images. Interna-
tional journal of computer vision, 98(2):168–186.
Xu, L. and Jia, J. (2010). Two-phase kernel estimation for
robust motion deblurring. In Computer Vision–ECCV
2010, pages 157–170. Springer.
Xu, L., Zheng, S., and Jia, J. (2013). Unnatural l0 sparse
representation for natural image deblurring. In Com-
puter Vision and Pattern Recognition (CVPR), 2013
IEEE Conference on, pages 1107–1114. IEEE.