Focus-aid Signal for Ultra High Deﬁnition Cameras

Seiichi Gohshi

and Hidetoshi Ito

Kogakuin University, 1-24-2, Nishi-Shinjuku, Shinjuku-ku, Tokyo, Japan

Leader Corporation, 2-6-33, Tsunasima Higashi, Kouhoku-Ku, Yokohama-Shi, Kanagawa, Japan

Keywords:

Focus, Super Resolution, Nonlinear Signal Processing, Edge Detection.

Abstract:

4K and 8K systems are very promising media and offer highly realistic images. Such high-resolution video

systems provide completely different impressions than HDTVs. However, it is difﬁcult, even for a professional

cameraman, to adjust the 4K/8K camera focus using only the small viewﬁnder on a camera. Indeed, it is

sometimes difﬁcult even to focus an HDTV camera with such a small viewﬁnder, and since 4K has four times

higher resolution than HDTV, it is almost impossible to focus a 4K camera by eye alone using a viewfinder of the same size as that of

an HDTV camera. Therefore, in content-creating fields, large monitors are generally

used to adjust the focus; however, large monitors are bulky and do not ﬁt practical requirements, which means

that technical assistance is required. A possible solution to this problem is to detect the sharp edges created by

high-frequency elements in ﬁne-focus images and superimpose those edges on the image; the cameraman can

then adjust the focus with additional information gained from maximizing the superimposed edges. However,

conventional edge detection technologies are vulnerable to noise, which means that practical use of

this technique is limited to environments with good lighting conditions. This paper introduces a novel

signal processing method that enables cameramen to adjust the focus of a 4K camera using their eyes.

1 INTRODUCTION

4K and 8K video systems provide highly realistic ex-

periences and are said to be the ultimate in TV sys-

tems. 4K TVs and 4K video cameras are currently

sold in stores, and experimental 4K broadcasting has

begun in Japan; moreover, an 8K (SHV) broadcast-

ing service is planned for 2016. However, 4K/8K

content is still not sufﬁcient and current professional

4K/8K equipment is bulky and heavy. To create more

4K/8K content, the size of the equipment needs to

be reduced. However, even if their size were re-

duced, there would still be the problem of focusing

the 4K/8K camera. 4K and 8K have four and sixteen

times higher resolution than full HDTV, respectively.

However, the viewfinders on these cameras are

small. Professional cameras do not have auto-focus

functions because professional camera persons are ca-

pable of adjusting the ﬁne focus and complex focus

controls. Producers sometimes use blurry scenes and

gradual temporal focus in and out of scenes. There

are many focus techniques in content production.

Professional camera persons were able to cope

with these difficult requirements up to HDTV content

production. However, it is very difficult to manually

adjust the focus using only the viewﬁnder found on

4K/8K cameras, and if the focus is off, 4K/8K cannot

live up to their full potential.

To solve this problem, large 4K liquid crystal dis-

plays (LCDs) are used in the ﬁeld to adjust the fo-

cus. These displays are bulky and sometimes impos-

sible to use in small places. Although small moni-

tors are handy, they are insufﬁcient for adjusting high-

resolution 4K/8K. Thus, a new technology that can

help human eyes adjust the focus would be much ap-

preciated. Edges in a frame can be an indicator of

the focus. Edges have their highest frequency elements

when the image is in focus. It is not difficult

to detect edges with a digital high-pass ﬁlter (HPF).

Thus, one possible solution would be superimposing

the edges of the image on the viewﬁnder. Adjusting

the focus on small viewﬁnders would be easier if the

edges were pronounced, and adjusting the focus to

maximize edges is not difﬁcult with small viewﬁnd-

ers; therefore, this method can provide a better fo-

cus point than just concentrating on the viewﬁnder’s

regular image. However, this method has limitations.

Video always has noise, which creates false edges;

these false edges interfere with adjusting the focus.

Furthermore, the edges detected by a conventional digital

HPF are thick and have low levels that are

comparable to noise. The thick, low edges appear

regardless of whether the image is in focus. Sharp, high

edges are required to adjust the focus, and these appear

at the fine focus position. Thin edges can be generated

if the edges have high-frequency elements. Super

resolution (SR) is one method for creating high-frequency

elements. However, most SR technologies

cannot work in real time (Farsiu et al., 2004) (Park

et al., 2003) (Katsaggelos et al., 2010) (van Eekeren

et al., 2010) (Freeman et al., 2000) (Freeman and Liu,

2011) (Zhu et al., 2014).

Gohshi, S. and Ito, H.

Focus-aid Signal for Ultra High Definition Cameras.

DOI: 10.5220/0005750901740179

In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016) - Volume 3: VISAPP, pages 176-181

ISBN: 978-989-758-175-5

SR with nonlinear signal processing (NLSP) has

been recently proposed. NLSP can work in real time

and can create higher-frequency elements than the

original image. These high-frequency elements can

create thin edges, and their quantity decreases rapidly

away from the fine focus point. NLSP also has high

tolerance against noise. This paper discusses focus

adjustment for 4K using SR with NLSP.

2 FOCUS AND EDGE

A fine focus produces a crisp video frame. The difference

between a crisp frame and a blurry frame is

determined by how fine the focus is. A crisp

frame has high-frequency elements; a blurry frame

does not. Therefore, the amount of high-frequency

elements in a frame can be a barometer for the state of

the focus. High-frequency elements are the edges in

a frame. A focused frame has more edges than an

unfocused frame. Edges in a frame appear when a camera

shoots a scene that has detailed textures. When

the scene is in focus, the number of edges is at its

maximum. A focus aid system of this type has been

developed (Funatsu et al., 2013). However, it is difficult

to judge whether the focus is obtained on the basis of

the edges detected with the conventional HPF method

because the edge levels are low and noise is present.

Figures 1, 2, and 3 provide an example. Figure 1

is the original video frame; Figures 2 and 3 show the

frame with the absolute value of the edges superim-

posed on it. The luminance level is modiﬁed to check

the edges clearly, and HPF is used to detect the edges.

Figure 2 shows an out-of-focus result, and Figure 3

shows a fine focus result. Although the edges in Figure 3

are stronger than those in Figure 2, both look

similar; it is difficult to see the difference. Moreover,

noise is visible in both, which makes it more difﬁcult

to adjust the focus with the edges. This is the limita-

tion of using conventional edge detection technology:

noise in a frame has high-frequency elements just like

the edges. Thus, even if the characteristics of HPF

change, the result will be similar because of the pres-

ence of noise.

Figure 1: Original image.

Figure 2: Out of focus image.

Figure 3: Fine focus image.
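The idea in this section, that the amount of high-frequency content tracks the state of the focus, can be illustrated on a single scan line. The following is a minimal sketch and not the authors' implementation: the 3-tap kernel, the box blur used to imitate defocus, and all function names are assumptions made for illustration.

```python
# Sketch: high-frequency energy of one scan line as a focus indicator.
# The [-1, 2, -1] kernel and the box blur are illustrative assumptions.

def high_pass(line):
    """Apply a 3-tap [-1, 2, -1] high-pass filter (border samples left at 0)."""
    out = [0.0] * len(line)
    for i in range(1, len(line) - 1):
        out[i] = -line[i - 1] + 2 * line[i] - line[i + 1]
    return out

def box_blur(line, radius):
    """Imitate defocus with a moving average of width 2*radius + 1."""
    n = len(line)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(line[lo:hi]) / (hi - lo))
    return out

def edge_energy(line):
    """Sum of absolute high-pass responses; larger means sharper."""
    return sum(abs(v) for v in high_pass(line))

# A scan line with two sharp steps (in focus) and a defocused copy of it.
sharp = [50.0] * 20 + [200.0] * 20 + [50.0] * 20
blurry = box_blur(sharp, 3)

print(edge_energy(sharp), edge_energy(blurry))
```

On this noiseless pattern the in-focus line gives a clearly larger value. As the section notes, the gap becomes harder to see once noise contributes its own high-frequency content.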

3 NLSP

The only practical method for improving resolution in

real time has been Enhancer (unsharp mask), which is

discussed here to clarify the real-time edge detection

method (Schreiber, 1970) (Lee, 1980) (Pratt, 2001).

Figure 4 shows the block diagram of a typical en-

hancer. It uses HPF to detect edges, which are then

processed with a limiter (LMT) until achieving ap-

propriate levels for the image. The edges are added

to the image with the adder block (ADD). Enhancer

merely ampliﬁes the high-frequency elements in an

image, which means that noise is also ampliﬁed and

added to the image; this is exactly what happens in

Figures 2 and 3. Edges and noise are both amplified

and added. This issue is discussed again in

Section 4. Noise always appears in an image unless

the lighting is sufficient, such as in sunny places on

a fine day. The current technology (Funatsu

et al., 2013) is usable only under good lighting

conditions. A new technology is necessary to

distinguish valuable edges from noise.
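The enhancer described above (HPF, then LMT, then ADD) can be sketched on one scan line to show why noise in flat areas is amplified together with the edges. This is an illustrative sketch only: the kernel, the gain, the limit, and the fixed pseudo-noise pattern are assumptions, not values from the paper.

```python
# Sketch of the enhancer in Figure 4: HPF -> LMT -> ADD.
# Kernel, gain, and limit values are illustrative assumptions.

def high_pass(line):
    """3-tap [-1, 2, -1] high-pass filter (border samples left at 0)."""
    out = [0.0] * len(line)
    for i in range(1, len(line) - 1):
        out[i] = -line[i - 1] + 2 * line[i] - line[i + 1]
    return out

def limiter(values, limit):
    """LMT block: saturate every value to the range [-limit, limit]."""
    return [max(-limit, min(limit, v)) for v in values]

def enhancer(line, gain=2.0, limit=64.0):
    """Add the limited, amplified high-pass signal back to the input (ADD block)."""
    edges = limiter([gain * v for v in high_pass(line)], limit)
    return [max(0.0, min(255.0, x + e)) for x, e in zip(line, edges)]

def mean_abs_deviation(line, level):
    """Average distance of the samples from the nominal flat level."""
    return sum(abs(v - level) for v in line) / len(line)

# Flat gray area (level 100) carrying a small, fixed pseudo-noise pattern.
noise = [1, -2, 0, 2, -1, 1, -2, 2, 0, -1]
flat = [100.0 + noise[i % len(noise)] for i in range(40)]
enhanced = enhancer(flat)

before = mean_abs_deviation(flat, 100.0)
after = mean_abs_deviation(enhanced, 100.0)
print(before, after)
```

Because the enhancer is linear up to the limiter, the fluctuation in the flat area grows by the same mechanism that sharpens the edges.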

Figure 4: Enhancer.

Figure 5: NLSP.

NLSP has been recently proposed to improve the

resolution of videos in real time. NLSP can distinguish

edges from noise. Figure 5 shows the signal

flow of NLSP. The upper path comprises an

HPF, a nonlinear function (NLF), and the LMT. The

edges in the video are detected with the HPF and

then processed with the NLF, which generates har-

monic waves from the edges. These harmonic waves

have higher-frequency elements than what the origi-

nal video has and are generated only from the edges

detected with the HPF. There are no harmonic waves

in flat areas, because there are no edges in flat areas.

An example of an NLF is a cubic function. The

range of the input of the NLF is from $-255$ to $255$ if

the depth of the video is 8 bits. The output of the

NLF becomes very large, because the cubic function

generates pixel values from $-255^3$ to $255^3$. The LMT

saturates these large values to ﬁt the harmonic waves

to the video. The lower path is from the input and

is directly connected to the ADD. The ADD adds the

LMT-processed harmonic waves to the original video.

Thus, the output of the ADD has high-frequency el-

ements that the original video does not have. This

video processing method can improve resolution and

create high-frequency elements that exceed even the

Nyquist frequency.
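The signal flow of Figure 5 can be sketched on one scan line as follows. This is a hedged illustration rather than the authors' implementation: the halved kernel (chosen so the HPF output stays within $\pm 255$ for 8-bit input), the scaling parameter `ref` of the cubic function, and the `limit` value are all hypothetical choices.

```python
# Sketch of the NLSP flow in Figure 5: HPF -> NLF (cubic) -> LMT -> ADD.
# The kernel scaling, `ref`, and `limit` are illustrative assumptions.

def high_pass(line):
    """3-tap [-0.5, 1, -0.5] high-pass; output stays within +/-255 for 8-bit input."""
    out = [0.0] * len(line)
    for i in range(1, len(line) - 1):
        out[i] = -0.5 * line[i - 1] + line[i] - 0.5 * line[i + 1]
    return out

def cubic_nlf(v, ref):
    """Cubic nonlinearity, scaled so that an input equal to `ref` maps to itself."""
    return v ** 3 / ref ** 2

def limiter(v, limit):
    """LMT block: saturate one value to the range [-limit, limit]."""
    return max(-limit, min(limit, v))

def nlsp(line, ref=32.0, limit=64.0):
    """Add the limited cubic harmonics to the original line (ADD block)."""
    harmonics = [limiter(cubic_nlf(v, ref), limit) for v in high_pass(line)]
    return [max(0.0, min(255.0, x + h)) for x, h in zip(line, harmonics)]

# A step edge between levels 80 and 200 with a small fixed pseudo-noise pattern.
noise = [1, -2, 0, 2, -1, 1, -2, 2, 0, -1]
line = [(80.0 if i < 10 else 200.0) + noise[i % 10] for i in range(20)]
out = nlsp(line)

flat_change = abs(out[3] - line[3])   # sample inside the flat, noisy area
edge_change = abs(out[9] - line[9])   # sample next to the step edge
print(flat_change, edge_change)
```

With these assumed parameters the sample next to the edge is changed by the full limiter value, whereas the sample in the flat area is almost untouched, because the cube of a small high-pass value is very small.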

Figure 6 shows a simulation result of using NLSP.

Figure 6(a) is a still image that has been enlarged hor-

izontally and vertically by a factor of two relative to

the original. Figure 6(c) shows the two-dimensional

fast Fourier transform (FFT) of (a). It does not have

high-frequency elements, because it is an enlarged

image. Figure 6(b) shows the output of the image after

it has been processed with NLSP according to the

ﬂow shown in Figure 5. Figure 6(d) shows the two-

dimensional FFT of Figure 6(b). Comparing Figure

6(d) with Figure 6(c), we can see that (d) has high-

frequency elements that (c) does not have. Because

Figure 6(a) is horizontally and vertically enlarged by

a factor of two, the horizontal Nyquist frequency of

the original image is $\pi/2$ in Figure 6(c). There are no

spectra beyond $\pi/2$ in Figure 6(c), because the enlarged

image has the same spectra as the original image. In

contrast, the spectra in Figure 6(d) have elements that

Figure 6(c) does not have; they exceed the Nyquist

frequency ($\pi/2$) of the original image. This shows that

the proposed method can create higher frequency ele-

ments than the Nyquist frequency of the original im-

age. It is also important to note that Figure 6(b) does

not have visible noise. Noise in NLSP is discussed in

the next section.
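The appearance of spectra beyond $\pi/2$ can be checked on a one-dimensional example using the identity $\sin^3\theta = (3\sin\theta - \sin 3\theta)/4$: cubing a sinusoid at $0.3\pi$ produces a third harmonic at $0.9\pi$. The sketch below is an illustration under assumptions (a single sinusoid standing in for the HPF output of an enlarged image, a hypothetical length of 200 samples, and a direct DFT), not a reproduction of Figure 6.

```python
import cmath
import math

# Sketch: a cubic nonlinearity creates components above pi/2.
# N and K0 are illustrative assumptions; omega0 = 2*pi*K0/N = 0.3*pi.
N = 200
K0 = 30

def dft_magnitudes(x):
    """Magnitudes of DFT bins 0..n/2 (bin n/2 corresponds to frequency pi)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def energy_above(mags, k_cut):
    """Sum of squared magnitudes strictly above bin k_cut."""
    return sum(m * m for m in mags[k_cut + 1:])

signal = [math.sin(2 * math.pi * K0 * t / N) for t in range(N)]
cubed = [v ** 3 for v in signal]   # cubic NLF without scaling or limiter

K_HALF_PI = N // 4                 # bin 50 corresponds to pi/2
mags_cubed = dft_magnitudes(cubed)
e_in = energy_above(dft_magnitudes(signal), K_HALF_PI)
e_out = energy_above(mags_cubed, K_HALF_PI)
peak_bin = max(range(K_HALF_PI + 1, N // 2 + 1), key=lambda k: mags_cubed[k])
print(e_in, e_out, peak_bin)
```

The input has essentially no energy above bin 50, while the cubed signal has a clear component at bin 90, which is three times the input frequency.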

4 EDGE DETECTION

Figure 7(a) shows the graphical image of edges cre-

ated by conventional HPF and Figure 7(b) shows the

edges created by NLSP. The edges created by NLSP

are stronger and sharper than those created using con-

ventional HPF. Moreover, the edges are also thin-

ner and higher because harmonic high-frequency ele-

ments with NLF are added to the original edges. The

edges created with NLSP show the focus point with

strong edges, which help in focus adjustment.

Here we discuss the edge detection method with

conventional HPF (Funatsu et al., 2013) and with

NLSP. In general, noise is smaller than the edges

in images and videos. Figure 8(a) shows an example

of edges detected from video by the conventional

method and Figure 8(b) shows an example of edges

detected from video by NLSP. A threshold level is

selected to separate the true edges in a video from the

edges caused by noise. However, in Figure 8(a) the true edges

detected by HPF are not sufficiently larger than the

edges caused by noise. The edges detected by conventional

HPF have levels similar to the noise. If we define a

threshold level to discriminate the edges in an image

from noise, the allowance for the level is narrow, as

shown in Figure 8(a). The detected edges are very

small and it is very difficult to separate the true edges

in the image from those caused by noise. To detect

the true edges in the video, we have to lower the

threshold level.

In this case, as shown in Figure 8(b), an appropriate

threshold level does not exist and the edges cannot be

separated from the noise. The edges caused by noise

are also detected in the video and it is impossible to adjust

the focus. This is exactly what happened in Figures 2

and 3. Conversely, if we raise the threshold level to

remove the edges caused by noise, the true edges in

the video are also removed.

VISAPP 2016 - International Conference on Computer Vision Theory and Applications

Figure 6: Image processed with NLSP.

Figure 7: Created edges.

In contrast, the edges detected by NLSP are larger

than those detected with HPF, and noise can be suppressed

by the nonlinear function, as shown in Figure

9(a). Edges in the image are amplified by NLSP and

become the high-level edges (HLE) in Figure 9(a).

In contrast, the edges caused by noise (EN) are small.

NLSP widens the level difference between

HLE and EN. As shown in Figure 9, the levels

of HLE and EN can be separated with an appropriate

threshold level. In Figure 8, it is very difficult to

find a threshold level that separates the true edges from

noise because their levels are similar. In Figure 9, it is

easier to select the threshold level than in Figure 8.

By controlling the threshold level, noise can be suppressed

so that only edges are detected, as shown in

Figure 9(b). The edge shown in Figure 9(b) is thinner

and larger than that shown in Figure 8(b). The

edges amplified by the nonlinear function are much

more visible than the edges detected by conventional

HPF, even when the HPF characteristics are carefully

designed. The edges detected by NLSP are sufficiently

large to adjust the focus.

Figure 8: Threshold process for HPF.

Figure 9: Threshold process for NLSP.
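The widening of the margin between true edges and noise-induced edges can be illustrated numerically. In the sketch below, the sample values, the scaling parameter `ref` of the cubic function, and the threshold are hypothetical and chosen only to show the effect; they are not taken from Figures 8 and 9.

```python
# Sketch: a cubic NLF widens the gap between edge levels and noise levels.
# Sample values, `ref`, and the threshold are illustrative assumptions.

def cubic_nlf(v, ref):
    """Cubic nonlinearity, scaled so that an input equal to `ref` maps to itself."""
    return v ** 3 / ref ** 2

def edge_to_noise_ratio(values, edge_idx):
    """Smallest true-edge magnitude divided by the largest noise magnitude."""
    smallest_edge = min(abs(values[i]) for i in edge_idx)
    largest_noise = max(abs(v) for i, v in enumerate(values) if i not in edge_idx)
    return smallest_edge / largest_noise

def apply_threshold(values, threshold):
    """Keep values whose magnitude reaches the threshold; zero the rest."""
    return [v if abs(v) >= threshold else 0.0 for v in values]

# Hypothetical HPF output: true edges at indices 2 and 5, noise elsewhere.
hpf_out = [3.0, -8.0, 12.0, -2.0, 5.0, -15.0, 4.0]
edge_idx = {2, 5}
nlf_out = [cubic_nlf(v, ref=10.0) for v in hpf_out]

ratio_hpf = edge_to_noise_ratio(hpf_out, edge_idx)
ratio_nlsp = edge_to_noise_ratio(nlf_out, edge_idx)
kept = apply_threshold(nlf_out, threshold=10.0)
print(ratio_hpf, ratio_nlsp, kept)
```

Because the function is cubic, the edge-to-noise ratio is raised to the third power, so a threshold that was hard to place on the HPF output has a much wider allowance after the nonlinear function.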

When we decide the parameters of enhancers,

noise is always a problem. If we try to amplify the

edges very strongly, noise becomes visible in flat areas

in the image; Figure 10 is an example of such noise.

Comparing Figure 10 with Figure 6(b), noise in Figure

10 is visible in flat areas such as the forehead

and cheeks. This is what happens in Figures 2 and 3. NLSP

creates harmonics to make edges thinner and higher.

Although noise is also processed by NLSP,

the energy of noise is small. By tuning the parameters

of NLSP, it is possible to keep NLSP from making small

edges unnecessarily higher; Figure 6(b) is an example.

Noise is not visible in the forehead and cheeks.

Real-time functionality is a fundamental require-

ment for video equipment, and adjusting the lens fo-

cus is no different: it must be able to be done in real

time. Furthermore, the focus adjusting system should

be installed in a small device. If the system is in

a bulky device, it will be impractical for daily use.

The proposed NLSP algorithm meets both of these requirements

because NLSP has been successfully implemented

in a field-programmable gate array (FPGA) and

has worked as SR equipment for both 4K and 8K.

Figure 10: Enhancer processed image.

5 EXPERIMENT WITH REAL-TIME HARDWARE

Figure 11: Real time hardware.

The real-time hardware shown in Figure 11 was developed

to prove the validity and practicality of the

proposed method. Stuffed animals are set and shot

with a 4K camera. The 4K camera is connected to the

hardware and an LCD shows the stuffed animals and

the edges. The LCD displays a stuffed bear with white

dots appearing only on its face and hat. These dots are

the image’s edges and appear at very limited distances

from the 4K camera; this means that the focus is ad-

justed on the bear’s face. These white dots appear

only on focused areas and can assist in adjusting the

focus. Although the LCD is 9 inches, the edges are

sufﬁciently visible. A cameraman adjusts the focus

to maximize the edges he wants to focus on. Bulky,

55-inch monitors are not necessary to adjust the focus

for 4K/8K.
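The display step described above, in which white dots mark the detected edges on the viewfinder image, can be sketched as a simple overlay. This is an assumption-laden illustration: the function names, the marker value of 255, and the tiny example frame are hypothetical, and the real hardware works on 4K video rather than nested lists.

```python
# Sketch: superimpose white dots where the edge signal reaches a threshold.
# Function names, marker value, and sample data are illustrative assumptions.

def superimpose_edges(frame, edge_map, threshold, marker=255):
    """Return a copy of `frame` with `marker` written where |edge| >= threshold."""
    return [
        [marker if abs(e) >= threshold else p for p, e in zip(frame_row, edge_row)]
        for frame_row, edge_row in zip(frame, edge_map)
    ]

def count_markers(edge_map, threshold):
    """Number of dots that would be drawn; the operator tries to maximize this."""
    return sum(1 for row in edge_map for e in row if abs(e) >= threshold)

# A tiny luminance frame and a matching edge map (for example, NLSP output).
frame = [[10, 20, 30, 40], [50, 60, 70, 80], [90, 100, 110, 120]]
edge_map = [[0, 2, 35, 1], [0, -40, 3, 0], [1, 0, 0, -2]]

overlay = superimpose_edges(frame, edge_map, threshold=30)
dots = count_markers(edge_map, threshold=30)
print(overlay, dots)
```

Turning the focus ring while watching the number of dots in the region of interest corresponds to the adjustment the cameraman performs by eye.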

As discussed in Section 4, NLSP is robust

against noise; the conventional HPF edge detection

method does not have such noise tolerance.

Noise is especially a problem for dark, low-luminance

scenes. Here we discuss this issue with the image

shown in Figure 12. Figure 12 was shot with a 4K camera,

and Figures 13 and 14 show the same scene shot

in a dark room. They show the difference between

the NLSP and conventional edge detection methods

in low-luminance conditions. In these figures, boxes

with ribbons are set in the frame and shot with a 4K

camera. A white rectangle in Figure 13 is drawn to indicate

the area where the white dots exist; they appear

within the rectangle only at limited distances from the

4K camera, which means that the 4K camera is focused

on the rectangle area. Figure 14 was produced with the

conventional edge detection method. In that figure, the

white dots appear all over the image because of noise,

and the strong edges are less visible than those in

Figure 13. In Figure 14, it is impossible to determine

what the camera is focused on.

Figure 12: Original image.

Figure 13: Edge with the proposed method.

Figure 14: Edge with the conventional method.

6 CONCLUSION

A real-time focus adjustment algorithm for 4K, which

will aid human visual systems, is proposed. It pro-

duces edges only in small, on-focus areas, and these

edges are easily detected by human eyes even when

small LCDs are used. It also has high tolerance

against noise under low-light conditions. The system

is suitable for practical use in creating 4K content, un-

like large monitors.

REFERENCES

Farsiu, S., Robinson, M. D., Elad, M., and Milanfar, P. (2004). Fast and robust multi frame super resolution. IEEE Transactions on Image Processing, 13(10):1327–1344.

Freeman, W. T. and Liu, C. (2011). Markov random fields for super-resolution and texture synthesis. In Advances in Markov Random Fields for Vision and Image Processing. MIT Press.

Freeman, W. T., Pasztor, E. C., and Carmichael, O. T. (2000). Learning low-level vision. International Journal of Computer Vision, 40(1):25–47.

Funatsu, R., Yamashita, Y., Mitani, K., and Nojiri, Y. (2013). Focus-aid signal for super hi-vision cameras. Technical Report 53.

Katsaggelos, A., Molina, R., and Mateos, J. (2010). Super Resolution of Images and Video: Synthesis Lectures on Images, Video and Multimedia Processing. Morgan and Claypool Publishers, La Vergne TN USA.

Lee, J. S. (1980). Digital image enhancement and noise filtering by use of local statistics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2:165–168.

Park, S. C., Park, M. K., and Kang, M. G. (2003). Super-resolution image reconstruction: A technical overview. IEEE Signal Processing Magazine, 20(3):21–36.

Pratt, W. K. (2001). Digital Image Processing (3rd Ed). John Wiley and Sons, New York.

Schreiber, W. F. (1970). Wirephoto quality improvement by unsharp masking. J. Pattern Recognition, 2:111–121.

van Eekeren, A. W. M., Schutte, K., and van Vliet, L. J. (2010). Multiframe super-resolution reconstruction of small moving objects. IEEE Transactions on Image Processing, 19(11):2901–2912.

Zhu, Z., Guo, F., Yu, H., and Chen, C. (2014). Fast single image super-resolution via self-example learning and sparse representation. IEEE Transactions on Multimedia, 16(8).
