3D Thermal Monitoring and Measurement using Smart-phone
and IR Thermal Sensor
Arindam Saha, Keshaw Dewangan and Ranjan Dasgupta
Innovation Lab, Tata Consultancy Services, Kolkata, India
Keywords: Thermal Inspection, Smart-phone Application, Sensor Fusion, 3D Measurement, Camera Calibration,
Multi-view Geometry, FLIR Thermal Attachment.
Abstract: Continuous, on-the-fly heat monitoring in industries such as manufacturing and chemical processing
is an active research topic. Recent advances in IR thermal sensors make it possible to fuse thermal
information with other low-cost sensors (such as an optical camera) to measure the area or volume of
any heated object. The recent release by FLIR of an affordable handheld thermal sensor as a smart-phone
attachment has encouraged researchers to develop thermal monitoring systems as smart-phone applications.
In pursuit of this goal we present a lightweight system that combines optical and thermal sensors to
create a dense thermal 3D model, along with area/volume measurement of the heated zones, using a
smart-phone. Our proposed pipeline captures RGB and thermal images simultaneously using the FLIR
thermal attachment. It estimates the camera poses for the RGB and thermal images, and 3D models are
generated by tracking features across the RGB images. Back-projection is used to colour the 3D points so
that the model can be rendered both in RGB and as estimated surface temperature. The final output of the
system is the detected hot regions with their area/volumetric measurements. Experimental results
demonstrate that this cost-effective system can measure hot areas accurately and is usable in everyday life.
1 INTRODUCTION
Unobtrusive heat measurement and monitoring is well accepted in
the manufacturing, chemical, automobile and construction
industries. Conventional industrial thermal cameras are still
not affordable for everyday use. Conventional thermography for
energy measurement and non-invasive assessment relies on 2D
thermal images, which have significant limitations, such as the
lack of information on the shape, geometry or location of the
object of interest in the scene. There is therefore growing
interest in representing the environment in 3D with the
temperature information integrated. The combined information
helps to detect the object of interest and to perform volumetric
measurement precisely. Autonomous solutions are in high demand,
especially in industries such as manufacturing and chemical
processing, as are systems that are usable in everyday life.
FLIR launched an affordable thermal sensor (FLIR, 2014) as a
smart-phone attachment, which greatly increases the possibility
of monitoring and verifying heated regions using such handheld,
mobile, low-cost sensors.
We present a cost-effective 3D thermal mapping system, as an
alternative to a conventional thermal camera, without
compromising much on the quality of measurement. The system is
capable of area or volumetric measurement of heat in a
continuous and non-invasive way. The proposed system consists of
a handheld smart-phone and a FLIR thermal attachment. The FLIR
thermal attachment for smart-phones is a feature-rich product at
an affordable price compared to costly conventional thermal
sensors. It has a thermal resolution of 160x120, which is scaled
up to VGA resolution using FLIR MSX technology (FLIR, 2014), and
it can detect temperatures between -20°C and 120°C with a
resolution of 0.1°C. Conventional thermal sensors focus on
measuring heat accurately in 2D space. The proposed system is
capable of generating a dense 3D model with thermal annotation
for further processing and for on-the-fly volumetric heat
measurement in everyday use. The system can be considered a
trade-off against a more accurate thermal camera for cases where
volumetric heat measurement is more important than the accuracy
of the thermal measurement, for example sludge-heel formation
inside an oil tank. The proposed system is a subset of a much
bigger concept presented by P. Deshpande et al. (2015).
The major contributions are:
• Designing a handheld, lightweight system for continuous
  monitoring and measurement of heated zones in everyday use,
  which can be extended to various industries such as oil
  refineries and the automobile industry.
• Creating a smart-phone application for thermal measurement
  using the FLIR thermal attachment.
• Finding the heated regions automatically and measuring the
  heated area or volume accurately in 3 dimensions.
Our entire framework exploits several state-of-the-art
algorithms for generating a dense 3D environment using IMU
sensors. We present experimental results which indicate that the
proposed system can be used in a wide range of scenarios. We
also evaluate the accuracy of the reconstruction by comparing
with the ground truth.
Saha, A., Dewangan, K. and Dasgupta, R.
3D Thermal Monitoring and Measurement using Smart-phone and IR Thermal Sensor.
DOI: 10.5220/0005786106940700
In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016) - Volume 3: VISAPP, pages 696-702
ISBN: 978-989-758-175-5
Copyright © 2016 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
2 STATE OF THE ART
Several studies have explored the potential of 3D thermal
mapping and volumetric inspection. These studies mostly focus on
monitoring the power consumption of buildings.
ThermalMapper (Borrmann et al., 2012) is a well-known project
which uses a terrestrial laser scanner and a thermal infrared
camera mounted on a wheeled robot. The thermal data is projected
onto the 3D model as soon as the model is generated by the laser
scanner. The final result from ThermalMapper is a dense 3D point
cloud which can be visualized in both RGB and thermal.
Volumetric heat measurement and analysis is not part of that
system. There is also a significant difference in cost and
mobility between that system and our proposed system, owing to
our use of a lightweight (approximately 78 grams), low-cost FLIR
attachment with a smart-phone.
In a recent work, Vidas et al. (2013) present 3D thermal mapping
for monitoring building interiors using Microsoft Kinect
(Microsoft, 2010) and a thermal camera. In computer vision and
robotics, the use of RGB-D cameras like Microsoft Kinect
facilitates the development of techniques for highly detailed
and spatially extended reconstructions (Meilland and Comport,
2013; Whelan et al., 2013). Such costly and bulky coupled
sensors are capable of reconstructing in real time (Newcombe et
al., 2011), but the use of a structured light pattern limits the
product to indoor environments and short-range measurements. The
system of Vidas et al. is limited to presenting 3D environments
with surface temperature annotation; automatic heat measurement
and analysis is out of its scope. The dimensions, weight and
operating environment are the main drawbacks of the Kinect as a
handheld low-cost system. Though active depth sensors have many
advantages, there are certain scenarios where passive RGB
cameras are preferred due to their low power consumption,
outdoor capability and form factor. This has motivated many
researchers to investigate methods for 3D reconstruction using
only passive cameras, so stereo approaches remain very popular.
There are approaches where binocular vision is used for 3D
reconstruction in indoor environments for navigation (Krishnan
and Kollipara, 2014). The growing interest in dense
reconstruction has drawn attention to multi-view stereo
techniques (Seitz et al., 2006; Furukawa et al., 2010), whose
computational complexity prevents them from being used in a
low-cost, lightweight system.
In a recent work, Pradeep et al. (2013) describe a methodology
for markerless tracking and 3D reconstruction of small scenes
using an RGB camera sensor. It tracks and re-localizes the
camera pose and allows high-quality 3D model reconstruction
using a webcam. Pizzoli et al. (2014) propose a probabilistic
approach in which the depth map is computed by combining
Bayesian estimation and convex optimization techniques. These
implementations are limited to small-scene reconstruction.
Pairing this kind of system with a separate thermal camera would
produce a clumsy setup and restrict the mobility of the entire
system.
In another work, Saha et al. (2014) present a system where a
smart-phone is used as the capture device and the entire
reconstruction is performed on a backend system. The limited
mobility of that system is its main drawback for everyday use.
Industrial thermal cameras are capable of measuring temperature
accurately from a specified distance, and a few costly cameras
provide the dimensions of the heated regions in 2D in a
user-guided way. The FLIR smart-phone thermal attachment also
provides information only in 2D, so volumetric measurements are
limited. The cost of thermal cameras is another factor which
restricts these products to the industrial segment. The FLIR
smart-phone thermal attachment can be used as a household
product in everyday life because of its enormous cost reduction,
its increased mobility due to small dimensions and weight, and
its user-friendliness compared to expensive or bulky thermal
systems. Automatic volumetric measurement requires heat analysis
in 3D, so the state of the art lacks an autonomous, affordable
system capable of area or volumetric measurement of heated
regions in everyday life. We present a smart-phone based
framework, along with an implemented application, to fill this
gap.
3 FRAMEWORK DESCRIPTION
The block diagram of the entire system is illustrated
in figure 1.
3.1 Data Acquisition
We use a handheld smart-phone with the FLIR thermal attachment
for the entire data collection and processing. Data collection
starts from a chosen position (according to the requirement and
the camera field of view), and we take this as the origin of the
world coordinate system for the entire data collection and
calculation. To collect data, we run two apps in parallel. One
app, called OptoThermalFLIR, handles the entire image capture,
and another app, called i-POSE, records the IMU data with
timestamps.
The OptoThermalFLIR app captures optical and thermal images
simultaneously at each point and saves them with a timestamp.
The i-POSE app runs in the background and records the IMU data
(accelerometer, magnetometer and gyroscope) with timestamps at
200Hz. The timestamp information lets us map each image frame to
the position from which the image was taken. The automatic
capture of images is controlled using the accelerometer. The
noisy accelerometer data alone is sufficient to determine the
motion status of the phone coarsely. Image capture is triggered
only when the smart-phone is detected to be stationary, to avoid
motion blur in the images. The captured images and pose
information are then used for 3D thermal mapping.
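The capture trigger and the timestamp matching described above can be sketched as follows. This is only an illustration under stated assumptions: the paper does not give the stationarity test, so the window of accelerometer samples, the standard-deviation threshold max_std and the function names is_stationary and nearest_imu_index are all hypothetical.

```python
import bisect
import math
from statistics import pstdev


def is_stationary(accel_samples, max_std=0.15):
    """Coarse motion test on a window of accelerometer samples.

    accel_samples: list of (ax, ay, az) readings in m/s^2.
    max_std: hypothetical threshold on the spread of the magnitude.
    The phone is treated as stationary when the magnitude of the
    acceleration barely changes over the window.
    """
    if len(accel_samples) < 2:
        return False
    magnitudes = [math.sqrt(ax * ax + ay * ay + az * az)
                  for ax, ay, az in accel_samples]
    return pstdev(magnitudes) < max_std


def nearest_imu_index(imu_timestamps, image_timestamp):
    """Index of the IMU sample closest in time to an image.

    imu_timestamps must be sorted in ascending order.
    """
    pos = bisect.bisect_left(imu_timestamps, image_timestamp)
    if pos == 0:
        return 0
    if pos == len(imu_timestamps):
        return len(imu_timestamps) - 1
    before = imu_timestamps[pos - 1]
    after = imu_timestamps[pos]
    if after - image_timestamp < image_timestamp - before:
        return pos
    return pos - 1
```

At the 200Hz IMU rate mentioned above, consecutive IMU timestamps are 5 ms apart, so the nearest sample is at most 2.5 ms away from the image timestamp.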
3.2 Camera Calibration
The cameras mounted on the FLIR thermal attachment are used in
our experiments. The pin-hole camera model (Hartley et al.,
2003) is used. The internal calibration is performed offline
using the well-known checkerboard method described by Zhang
(Zhang, 2000).
The external calibration is derived from the IMU sensors as
described by Bhowmick et al. (2014).
The optical and thermal cameras are separated by a fixed
distance and are parallel, so there is a fixed translation
between the cameras without any rotation. The external
calibration for the thermal image is therefore derived by adding
the fixed translation vector to the external calibration of the
optical camera.
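A minimal sketch of this step is given below, assuming the pin-hole convention in which a world point X maps to camera coordinates as R X + t. The names thermal_extrinsics, projection_matrix and t_offset are hypothetical, and the sign of the offset depends on how the lens displacement is measured, which the text does not specify.

```python
import numpy as np


def thermal_extrinsics(R_optical, t_optical, t_offset):
    """Extrinsics of the thermal camera from those of the optical camera.

    The two cameras are assumed parallel, so the rotation is shared and
    only the translation changes by a fixed offset (hypothetical value,
    expressed in the optical camera frame).
    """
    R = np.asarray(R_optical, dtype=float).copy()
    t = (np.asarray(t_optical, dtype=float).reshape(3)
         + np.asarray(t_offset, dtype=float).reshape(3))
    return R, t


def projection_matrix(K, R, t):
    """3x4 pin-hole projection matrix K [R | t]."""
    return np.asarray(K, dtype=float) @ np.hstack([R, t.reshape(3, 1)])
```

With an identity rotation and an offset of 2 cm along the x axis, a point on the optical axis is shifted horizontally in the thermal image by the focal length times the offset divided by the depth.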
Figure 1: System Workflow. Blocks: simultaneous RGB and thermal
data capture; 3D reconstruction using RGB images; superimpose
thermal information using back projection; generate camera
calibration matrix using IMU sensors; segment the heated region
from thermal 3D cloud; generate the contour of all heated
regions; calculate the dimensions from all heated regions'
contours.
3.3 Dense Correspondence Estimation
The literature on dense stereo matching is vast, and we refer to
Hirschmuller and Scharstein (2009) for a comparison of existing
methods. However, there are few relevant works on real-time
dense reconstruction using a monocular moving camera.
Motion estimation by means of optical flow is a well-accepted
and established methodology for providing dense sampling in
time. The predominant way of estimating dense optical flow in
today's computer vision literature is to integrate rich
descriptors into the variational optical flow setting, as
described by Brox and Malik (2011). The main advantage of the
selected approach is its ability to produce better results in a
wide range of cases, including large displacements.
Large displacement optical flow is a variational optimization
technique which integrates discrete point matches with a
continuous energy formulation. The goal is to find the global
minimum of the energy, and for that the initial guess of the
solution has to be very close to the global minimum. The entire
energy is minimized globally, and the details of the
minimization procedure are given by Brox and Malik (2011).
The n-view point correspondence generation is carried out using
the GPU implementation described by Sundaram et al. (2010). The
point trajectories are generated between consecutive frames of
the captured images. Optical flow accumulates errors in the flow
vectors, so a long trajectory suffers from this error, leading
to significant drift. Short trajectories are almost free from
drift error, but the triangulation process then suffers from the
small baseline. Hence, we choose consecutive frames that have a
baseline of more than 5 cm and limit the trajectory length to
less than 8.
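The frame selection rule can be illustrated with the sketch below. It assumes that camera positions are available in metres from the IMU-derived poses, and that the trajectory subsets do not overlap; the latter is an assumption, since the text does not say how subsets share frames. The function names and the default max_length of 8 are illustrative choices.

```python
import math


def select_keyframes(positions, min_baseline=0.05):
    """Keep frames whose distance to the previously kept frame
    exceeds min_baseline (metres). positions: list of (x, y, z)."""
    if not positions:
        return []
    kept = [0]
    for idx in range(1, len(positions)):
        if math.dist(positions[idx], positions[kept[-1]]) > min_baseline:
            kept.append(idx)
    return kept


def split_into_trajectories(frame_indices, max_length=8):
    """Split the kept frames into consecutive subsets so that no
    point trajectory spans more than max_length images."""
    return [frame_indices[start:start + max_length]
            for start in range(0, len(frame_indices), max_length)]
```

For a phone moved in 3 cm steps, every second frame is kept, because only then does the baseline exceed 5 cm.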
3.4 Outlier Estimation
Detecting outliers is a basic step that precedes any further
processing of the available information. The outlier detection
process is straightforward and follows the epipolar constraint
(Hartley et al., 2003) given in (4). Accurate camera calibration
parameters allow us to generate an accurate fundamental matrix
using equation (1) (Hartley et al., 2003).
$$F = K'^{-T} [t]_{\times} R K^{-1} \qquad (1)$$
where F is the fundamental matrix between an image pair, K and
K′ are the internal calibration matrices of the image pair, R
and t are the rotation matrix and translation vector of the
second image with respect to the first image, and $[t]_{\times}$
is the skew-symmetric matrix of t.
The rotation matrix $R_{ij}$ between a camera pair i and j can
be obtained using equation (2) (Hartley et al., 2003), where
$R_i$ and $R_j$ denote the rotation matrices of cameras i and j
respectively with respect to the global coordinate system.
$$R_{ij} = R_j R_i^{T} \qquad (2)$$
The translation vector $t_{ij}$ between a camera pair i and j is
calculated using equation (3), where $C_i$ and $C_j$ denote the
absolute positions of cameras i and j respectively.
$$t_{ij} = R_j (C_i - C_j) \qquad (3)$$
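Equations (1) to (3) can be checked numerically with the sketch below. It assumes the Hartley and Zisserman convention in which a camera with rotation R and centre C maps a world point X to R (X - C), so that the relative rotation is R_j R_i^T, the relative translation is R_j (C_i - C_j) and F = K'^{-T} [t]x R K^{-1}. The function names are hypothetical.

```python
import numpy as np


def skew(t):
    """Skew-symmetric matrix [t]x such that skew(t) @ v == cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])


def relative_pose(R_i, C_i, R_j, C_j):
    """Rotation and translation of camera j with respect to camera i."""
    R_ij = R_j @ R_i.T
    t_ij = R_j @ (C_i - C_j)
    return R_ij, t_ij


def fundamental_matrix(K_i, K_j, R_ij, t_ij):
    """Fundamental matrix with x_j^T F x_i = 0 for matching pixels."""
    return np.linalg.inv(K_j).T @ skew(t_ij) @ R_ij @ np.linalg.inv(K_i)
```

Projecting one 3D point into two synthetic cameras and evaluating x_j^T F x_i gives a value that is zero up to floating-point error, which is a quick sanity check on the sign and index conventions.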
The corresponding points x, x
ideally should
follow the epipolar constraints as given in equation
(4) (Hartley et al., 2003) where F denotes the
Fundamental matrix. In reality, the value never
becomes zero rather it goes very close to zero. So,
any corresponding points in order to be considered
as inlier, the value should be below to a threshold.
Now, a particular threshold is not suitable for all
cases, so there is a requirement of defining a
dynamic threshold which can automatically be
adjusted depending upon the captured scene. The
threshold value is defined as dynamic and it gets
calculated based on percentage of rejection.
x′
Fx 0
(4)
Implemented optical flow algorithm is more error
prone towards the image boundaries. Centre of the
image is given higher weightage due to the sated
reason and pixels are placed near to the image
boundaries are removed as outlier.
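One possible reading of the dynamic threshold is a percentile of the epipolar residuals, so that a fixed fraction of the matches is rejected whatever the scene. The sketch below follows that reading; the rejection percentage, the border width and the function names are hypothetical, and the centre weighting mentioned above is reduced here to a plain border cut.

```python
import numpy as np


def epipolar_residuals(F, pts_i, pts_j):
    """Absolute value of x_j^T F x_i for each correspondence.

    pts_i, pts_j: (N, 2) arrays of pixel coordinates.
    """
    ones = np.ones((len(pts_i), 1))
    x_i = np.hstack([pts_i, ones])
    x_j = np.hstack([pts_j, ones])
    return np.abs(np.einsum('nk,kl,nl->n', x_j, F, x_i))


def inlier_mask(F, pts_i, pts_j, image_size,
                reject_percent=10.0, border=20):
    """Boolean mask of correspondences kept as inliers.

    The threshold is the (100 - reject_percent) percentile of the
    residuals, and points within `border` pixels of the image edge
    in either image are dropped.
    """
    residuals = epipolar_residuals(F, pts_i, pts_j)
    threshold = np.percentile(residuals, 100.0 - reject_percent)
    width, height = image_size
    inside = np.ones(len(pts_i), dtype=bool)
    for pts in (pts_i, pts_j):
        inside &= (pts[:, 0] >= border) & (pts[:, 0] < width - border)
        inside &= (pts[:, 1] >= border) & (pts[:, 1] < height - border)
    return (residuals <= threshold) & inside
```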
3.5 3D Model Reconstruction
The 3D model is generated in the form of a point cloud. The 3D
point cloud is created using triangulation. The approach is
similar to that described by Dewangan et al. (2015). Each point
is back-projected onto the image plane to calculate the
back-projection error. Any 3D point with a back-projection error
of more than 2 pixels is considered an outlier. The camera
calibration parameters from the IMU sensors and the dense point
correspondences from optical flow estimation generate an
accurate point cloud which does not require any further
optimization.
The whole scene is reconstructed incrementally. Images are
divided into small sets, as mentioned above, such that the
trajectory length is not more than 8 images. The subsets are
merged after triangulation to obtain the final reconstruction.
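The triangulation and the 2-pixel back-projection test can be sketched as below. The paper does not state which triangulation method is used, so the linear (DLT) method here is a stand-in chosen for brevity, and the function names are hypothetical.

```python
import numpy as np


def triangulate(P_list, x_list):
    """Linear (DLT) triangulation of one point from two or more views.

    P_list: 3x4 projection matrices, x_list: matching (u, v) pixels.
    """
    rows = []
    for P, (u, v) in zip(P_list, x_list):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    X = Vt[-1]
    return X[:3] / X[3]


def reprojection_error(P, X, x):
    """Pixel distance between observation x and the projection of X."""
    proj = P @ np.append(X, 1.0)
    return float(np.linalg.norm(proj[:2] / proj[2] - np.asarray(x)))


def is_inlier(P_list, x_list, X, max_error_px=2.0):
    """True if X reprojects within max_error_px in every view."""
    return all(reprojection_error(P, X, x) <= max_error_px
               for P, x in zip(P_list, x_list))
```

A point whose observations disagree between views cannot reproject within 2 pixels in all of them, so it is discarded before the subsets are merged.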
3.6 Opto-thermal Mapping
Thermal cameras normally need to recalibrate their intensity at
regular intervals to correct for a gradual decrease in measured
signal accuracy during operation. These operations are known as
NUCs (Non-Uniformity Corrections). The FLIR thermal attachment
is of this type and requires regular calibration.
Figure 2: Temperature assignment in 3D model.
One of the great advantages of the FLIR attachment is the
placement of the optical and thermal cameras; the two sensors
are very close and produce almost the same view. If a 3D point
is visible in an RGB image, in almost all cases it is also
visible in the corresponding thermal image. We measure the
displacement between the two sensors and incorporate the
measurement into the calibration. The thermal annotation of
every point is almost error free due to the very short baseline
between the optical and thermal cameras.
Thermal information is assigned to the optical 3D point cloud by
back-projection. Each 3D point from the point cloud is
back-projected using equation (5) (Hartley et al., 2003) onto
the corresponding 2D thermal image plane, and the temperature is
read from the thermal image at the resulting 2D coordinate. The
entire process is illustrated in figure 2.
$$x = K (R X + t) \qquad (5)$$
where X and x represent the 3D and 2D points respectively, and
K, R and t are the calibration matrix, rotation matrix and
translation vector of the thermal camera.
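The temperature assignment can be sketched as follows, assuming the projection x = K (R X + t) of equation (5) followed by division by the third coordinate, and assuming the thermal image is available as a 2D array of temperatures in °C. The nearest-pixel lookup, the NaN marker for points that cannot be annotated and the function name are assumptions; occlusion is not handled in this sketch.

```python
import numpy as np


def annotate_temperature(points, K, R, t, thermal_image):
    """Temperature for each 3D point, read from one thermal image.

    points: (N, 3) array in world coordinates.
    thermal_image: (height, width) array of temperatures.
    Returns an (N,) array; NaN where the point is behind the camera
    or projects outside the image.
    """
    points = np.asarray(points, dtype=float)
    temps = np.full(len(points), np.nan)
    X_cam = points @ R.T + t
    idx = np.flatnonzero(X_cam[:, 2] > 0)      # points in front of camera
    proj = X_cam[idx] @ K.T
    u = np.rint(proj[:, 0] / proj[:, 2]).astype(int)
    v = np.rint(proj[:, 1] / proj[:, 2]).astype(int)
    height, width = thermal_image.shape
    ok = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    temps[idx[ok]] = thermal_image[v[ok], u[ok]]
    return temps
```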
3.7 Heat Measurement
The thermal 3D point cloud represents the thermal profile in 3D
space. The visualization simply annotates temperature in the
form of colour, as shown in figure 3. The temperature values are
the main differentiator in the 3D model. Heated regions are
segmented using the temperature profile. Each segmented point
cloud is stored and analysed separately. The Point Cloud Library
(PCL), an open-source tool for point cloud processing, is used
extensively in our implementation. The contour of each segmented
heat cloud is determined, and the area or volume is calculated
from the contour.
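As a simplified illustration of this step, the sketch below segments points by a temperature threshold, takes the convex hull of the hot points projected onto a plane as the contour, and computes the enclosed area with the shoelace formula. This is not the PCL-based implementation used in the paper; the threshold, the planar projection and the function names are assumptions made for the example.

```python
def hot_points(points, temperatures, min_temp):
    """Points whose temperature is at least min_temp."""
    return [p for p, temp in zip(points, temperatures) if temp >= min_temp]


def convex_hull(points_2d):
    """Convex hull (counter-clockwise) by Andrew's monotone chain."""
    pts = sorted(set(points_2d))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area enclosed by a closed contour (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0
```

For a hot patch of 0.5 m by 0.2 m sampled on a planar grid, the hull contour has four corners and the computed area is 0.1 square metres. A convex hull overestimates the area of concave regions, so a real contour extraction would need a tighter boundary.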
4 RESULTS
4.1 Test Environment
Our test environment is an iPhone 5s with the FLIR thermal
attachment. The entire framework is implemented and tested as an
application.
The accuracy of the proposed system depends entirely on the
reconstruction accuracy, which is in turn dependent on the
output of the optical flow. We have incorporated several
precautionary procedures to make the system robust enough for
daily use. The error of the flow vectors is less than a pixel,
so the estimated error is in the range of a few millimetres for
a bigger object like a car, as shown in the example of figure 4.
To the best of our knowledge, the presented system is the first
mobile application capable of measuring the area or volume of
heat automatically, so we are unable to benchmark it against a
similar application. Costly thermal cameras used in industry are
capable of measuring heated area or volume with greater
accuracy.
We tested the application under normal lighting conditions in
both indoor and outdoor situations and found the results
satisfactory.
4.2 Outputs
We present two sample heat measurements in different
environments to demonstrate the usability.
Figure 3 shows a sample 3D thermal point cloud of a mug
containing hot water. The presence of water is only detectable
through the thermal image. The test is performed using 11 images
with two iterations. The 3D thermal cloud shows the structure of
the mug along with the hot region. The idea is to measure the
volume of hot water present inside. The mug is segmented using
the knowledge of its cylindrical shape. The segmentation is then
used to detect the dimensions of the mug. The most heated region
is extracted from the temperature, and its dimensions are
calculated from the 3D structure.
Figure 3: Volumetric Measurement: Top to bottom shows
the captured RGB and thermal image, segmented 3D
thermal model, detected dimension of the heated region
with maximum temperature.
Figure 4 shows the thermal profile of a car just after parking
in an indoor parking lot. The registration number plates in the
images are obscured intentionally. The thermal profile in the
images shows that the tires and the engine under the bonnet are
the hottest zones, as expected. We have measured the surface
areas of the heated zones. Any abnormality found in the area of
a heated region could help automobile companies to predict or
understand a fault in the engine or any other part of a car.
The presented samples show the capability to perform heated area
or volumetric analysis in a non-invasive way. Heat measurement
typically depends on the heat flow inside the container and the
type of material used for the container. We observed these
characteristics in our experiments. A thermos flask is well
known for retaining heat inside the container for a long time,
so the temperature gradient is not prominent or distinct when we
perform experiments with hot water inside a thermos flask.
Figure 4: Area Measurement: Top to bottom: RGB and
thermal image, detected hot zones with covered area and
maximum temperature in each heated region on the
thermal 3D model.
4.3 Execution Time
Execution time is highly dependent on the 3D reconstruction time
and the number of images used for testing. We were able to
process a single image in roughly 2 to 2.5 seconds. The
execution times for the samples presented in figures 3 and 4 are
23 seconds and 18 seconds respectively.
5 CONCLUSIONS
We presented an approach for dense 3D thermal mapping for heat
monitoring, along with area/volumetric heat measurement, using a
smart-phone and a FLIR thermal attachment. 3D reconstruction is
performed with RGB images, and thermal information is overlaid
on the 3D model to create a 3D thermal structure. Heated regions
are segmented, and the structure of the heated regions is
analysed to calculate the contours. The area or volume of each
heated region is calculated from the corresponding contour. Our
results show the capability of such a solution, which can be
applied in other domains for specific purposes. The main
advantage of such a system is that it uses only passive sensors
for measurement, so it can be deployed in outdoor environments.
We also analysed the computation time, which shows that the
solution runs in near real time. Heated area or volume
measurements with closed containers show different heat
profiles. The heat flow also has a great effect on the heat
profile. Handling these cases is left as a future improvement of
the system.
REFERENCES
FLIR, (2014). http://www.flir.com.hk/flirone/
P. Deshpande, V. R. Reddy, A. Saha, K. Vaiapury, K.
Dewangan, R. Dasgupta, (2015). A Next Generation
Mobile Robot with Multi-mode Sense of 3D
Perception. IEEE International Conference on
Advanced Robotics (ICAR), pp. 382 - 387
DOI:10.1109/ICAR.2015.7251484
D. Borrmann, A. Nuchter, M. Dakulovic, I. Maurovic, I.
Petrovic, D. Osmankovic, and J. Velagic, (2012). The
project ThermalMapper thermal 3D mapping of indoor
environments for saving energy. In Proc. of the 10th
International IFAC Symposium on Robot Control
(SYROCO), vol. 10.
S. Vidas, P. Moghadam and M. Bosse, (2013). 3D thermal
mapping of building interiors using an RGB-D and
thermal camera. In Proc. of IEEE International
Conference on Robotics and Automation.
Microsoft, (2010). http://www.microsoft.com/en-us/kinectforwindows/
M. Meilland and A. Comport, (2013). Super-resolution 3D
Tracking and Mapping. In IEEE Intl. Conf. on
Robotics and Automation.
T. Whelan, H. Johannsson, M. Kaess, J. Leonard, and J. B.
McDonald, (2013). Robust real-time visual odometry for
dense RGB-D mapping. In IEEE Intl. Conf. on
Robotics and Automation (ICRA).
R. A. Newcombe, A. J. Davison, S. Izadi, P. Kohli, O.
Hilliges, J. Shotton, D. Molyneaux, S. Hodges, D.
Kim, and A. Fitzgibbon, (2011). KinectFusion: Real-
time dense surface mapping and tracking. In Proc.
ISMAR.
A. B. Krishnan and J. Kollipara, (2014). Intelligent indoor
mobile robot navigation using stereo vision. Signal &
Image Processing: An International Journal (SIPIJ),
Vol.5, No.4.
A. Saha, B. Bhowmick and A. Sinha, (2014). A system for
near real-time 3D reconstruction from multi-view
using 4G enabled mobile. IEEE International
Conference on Mobile Services (MS), pp. 1-7,
doi:10.1109/MobServ.2014.10.
S. Seitz, B. Curless, J. Diebel, D. Scharstein, and R.
Szeliski, (2006). A comparison and evaluation of
multi-view stereo reconstruction algorithms. In Proc.
IEEE Conf. on Computer Vision and Pattern
Recognition.
Y. Furukawa and J. Ponce, (2010). Accurate, dense, and
robust multiview stereopsis. IEEE Trans. Pattern Anal.
Machine Intell., vol. 32, no. 8, pp. 1362–1376.
V. Pradeep, C. Rhemann, S. Izadi, C. Zach, M. Bleyer and
S. Bathiche, (2013). MonoFusion: Real-time 3D
reconstruction of small scenes with a single web
camera. The 13th IEEE International Symposium on
Mixed and Augmented Reality.
M. Pizzoli, C. Forster and D. Scaramuzza, (2014).
REMODE: Probabilistic, monocular dense
reconstruction in real time. In Proc. IEEE International
Conference on Robotics and Automation (ICRA).
R. Hartley and A. Zisserman, (2003). Multiple view
geometry in computer vision. ISBN 0-521-54051-8,
Cambridge University Press.
Z. Zhang., (2000). A flexible new technique for camera
calibration. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 22(11):1330-1334.
B. Bhowmick, A. Mallik and A. Saha, (2014).
Mobiscan3D: A low cost framework for real time
dense 3d reconstruction on mobile devices. In Proc. of
The 11th IEEE International Conference on
Ubiquitous Intelligence and Computing.
H. Hirschmuller and D. Scharstein, (2009). Evaluation of
stereo matching costs on images with radiometric
differences. IEEE Trans. Pattern Anal. Machine Intell.,
vol. 31, no. 9.
T. Brox and J. Malik, (2011). Large displacement optical
flow: descriptor matching in variational motion
estimation. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 33(3):500-513.
N. Sundaram, T. Brox and K. Keutzer, (2010). Dense
point trajectories by GPU-accelerated large
displacement optical flow. European Conference on
Computer Vision (ECCV), Crete, Greece, Springer,
LNCS.
K. Dewangan, A. Saha, K. Vaiapury, R. Dasgupta, (2015).
3D Environment Reconstruction using Mobile Robot
Platform & Monocular Vision. 9th International
Conference on Advanced Computing &
Communication Technologies (ICACCT).