Crack Identification Via User Feedback, Convolutional Neural Networks
and Laser Scanners for Tunnel Infrastructures
Eftychios Protopapadakis¹, Konstantinos Makantasis¹, George Kopsiaftis², Nikolaos Doulamis² and Angelos Amditis³
¹ Technical University of Crete, Kounoupidiana, 73100, Chania, Greece
² National Technical University of Athens, 9 Iroon Polytechneiou, 15780, Athens, Greece
³ Institute of Communication and Computer Systems (ICCS), Zografou 157 80, Athens, Greece
Keywords:
Convolutional Neural Networks, User Feedback, Crack Detection, Laser Scanners.
Abstract:
In this paper, a deep learning approach is employed in synergy with a laser scanning process for the visual
detection and accurate description of concrete defects in tunnels. Analysis is performed over raw RGB images;
a Convolutional Neural Network serves as the crack detector during the inspection. In case of a positive
detection, the tunnel’s cross-section morphology is assessed via 3D point clouds, created by a laser scanner,
allowing the identification of deformations in the compartment. The proposed approach, in contrast to
existing ones, emphasizes applicability (easy initialization, no preprocessing of the input data) and provides
a holistic assessment of the structure; the reconstructed 3D model allows the fast identification of structural
divergence from the original design, alerting the engineers to possible dangers.
1 INTRODUCTION
Nowadays, tunnel inspection, for structural evalua-
tion, is mainly performed through tunnel-wide visual
observations by inspectors; a human has to identify
structural defects, rate them and then, based on their
severity, categorize the liner. Generally, the empirical
evaluation can be incomplete due to fatigue, lack of
experience, adverse working conditions, or other
reasons; it is, therefore, unreliable.
Automated approaches for the visual inspection
(VI) could deal with most of the aforementioned is-
sues, assuming that they have adequate detection abil-
ities. These methods exploit, mainly, image process-
ing and machine learning techniques. Automated ap-
proaches have been applied in various cases including
roads, bridges, fatigues, and sewer-pipes (Pynn et al.,
1999; Kim and Haas, 2000; Tung et al., 2002; Sinha
and Fieguth, 2006).
Current research, in robotics and relevant sec-
tors, presents a variety of sophisticated and reliable
components, needed in automated systems. Such
components can perform quick and robust inspec-
tion/assessment, in general transportation and tun-
nel infrastructures. However, what is still missing is
a holistic approach integrating all these components
into a system.
Related work on VI can be divided into two categories:
(a) the conventional paradigm and (b) the deep
learning approach. In the conventional paradigm,
we have to construct complex handcrafted features
and, then, train the classifier(s). However, the great
variety in defect types makes the feature
construction/selection task difficult (Halfawy and
Hengmeechai, 2014). Deep learning models (Hinton
and Salakhutdinov, 2006; Hinton et al., 2006) are a
class of machines that can learn a hierarchy of fea-
tures by building complex, high-level features from
low-level ones, automating the process of feature con-
struction for the problem at hand.
The work presented in this paper, involves a VI
mechanism using both Convolutional Neural Net-
works (CNNs) and laser scanners. Such a computer
vision scheme is easy to integrate with any robotic
part, facilitating the creation of an actual and
independent robotic inspector. The CNN is applied for
crack identification, given an RGB image. In case
of defect detection, the laser scanner is activated,
providing a detailed 3D model of the investigated
cross section.
Protopapadakis, E., Makantasis, K., Kopsiaftis, G., Doulamis, N. and Amditis, A.
Crack Identification Via User Feedback, Convolutional Neural Networks and Laser Scanners for Tunnel Infrastructures.
DOI: 10.5220/0005853007250734
In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016) - Volume 4: VISAPP, pages 725-734
ISBN: 978-989-758-175-5
Copyright © 2016 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
1.1 Related Work
Intensity features and Support Vector Machines
(SVMs) were used for crack detection on tunnel
surfaces in (Liu et al., 2002). Color properties,
different non-RGB color spaces and various learning
algorithms are also investigated in (Son et al., 2012).
Edge detection techniques are applied in (Abdel-
Qader et al., 2003) for detecting concrete defects.
Edge detection (i.e. Sobel and Laplacian operators)
and graph based search algorithms are also utilized in
(Yu et al., 2007) to extract crack information.
An image mosaic technology for detecting tunnel
surface defects was further extended in (Mohanty
and Wang, 2012). A pothole detection system
(Koch and Brilakis, 2011), based on histogram
shape-based thresholding and low level texture fea-
tures, has been used in asphalt pavement images.
A concrete spalling measurement system for post-
earthquake safety assessments, using template match-
ing techniques and morphological operations, has
been proposed by (German et al., 2012).
Histograms of Oriented Gradient features and
SVMs are utilized in the work of (Halfawy and Heng-
meechai, 2014), to support automated detection and
classification of pipe defects. Shape-based filtering is
exploited in the work of (Jahanshahi et al., 2013) for
crack detection and quantification. The constructed
features are fed as input to ANN or SVM classifiers in
order to discriminate crack from non-crack patterns.
The work of (Makantasis et al., 2015) exploits a
CNN to hierarchically construct high-level features
describing the defects, and a Multi-Layer Perceptron
(MLP) that carries out the defect detection task. Such
an approach offers automated feature extraction and
adaptability to the defect type(s), and needs no
special set-up for the image acquisition. Nevertheless,
there is a major drawback regarding applicability
in real-life scenarios: the resources spent on data
annotation. Data annotation is a time-consuming job
that requires a human expert and is itself prone to
segmentation errors.
The approach of (Makantasis et al., 2015)
has been further enriched by (Protopapadakis and
Doulamis, 2015); they incorporated a prior, image
processing, detection mechanism, facilitating the ini-
tialization phase. Such mechanism stands as a sim-
ple detector and is only used at the beginning of the
inspection. Possible defects are annotated and then
validated by an expert; after validating few samples,
the required training dataset for the deep learning ap-
proach has been formed. From this point onwards, a
CNN is trained and, then, utilized for the rest of the
inspection process.
Regarding laser scanning technology, interest has
increased over the last few years, due to the very
high density of the acquired data. Monitoring tunnel
deformations is a crucial and recent laser scanning
application, since it is related to tunnel stability
and safety. A number of studies concern tunnel
geometry inspection.
The work of (Han et al., 2013) proposed the light
detection and ranging (LiDAR) technique and the
minimum-distance projection (MDP) to collect de-
tailed 3D spatial data in a fast and automatic manner,
while avoiding the 3D to 2D profile projection step to
reduce the associated uncertainties. (Pejić, 2013)
proposed a methodology to optimize the scanning
parameters, scans registration, the georeferencing
approach and the survey control network design.
(Argüelles-Fraga et al., 2013) tried to incorporate
several scanning factors, such as tunnel dimensions,
scan density, footprint size, incidence angle and scan-
ner location in scanning circular cross-section tunnels
in order to achieve the pre-determined accuracy spec-
ifications while minimizing the working time. (Nut-
tens et al., 2014) used a laser scanner to observe dif-
ferences of the average radius values during the sta-
bilization phase of a newly built circular train tun-
nel in Belgium. The results of a systematic monitor-
ing could be valuable to the construction engineers
to test and validate the theoretical models. (Monser-
rat and Crosetto, 2008) utilized laser scanner datasets
and least square method to match 3D surfaces.
1.2 Our Contribution
Existing approaches support the suitability of CNNs
for VI processes in tunnels. However, there is no
consensus regarding the CNN topology and the
input data format. Additionally, a rational approach
would be to combine different VI approaches in
order to mitigate their individual drawbacks.
Our work utilizes the same detection mechanism,
as in (Protopapadakis and Doulamis, 2015). How-
ever, the CNN detector is directly utilized over raw
data. Rather than changing the image space, using a
low level feature extraction (Makantasis et al., 2015),
we directly utilize raw image patches. Doing so avoids
the adverse effects of wrong feature selection
and simplifies the process, both conceptually and
practically.
Also, the usage of a laser scanner results in a
holistic approach for the VI. Rather than detecting a
crack on a plane, we reconstruct the entire cross sec-
tion. Such an approach provides further information
regarding the status of the infrastructure. The time re-
quired for the 3D reconstruction is minimized, since
RGB-SpectralImaging 2016 - Special Session on RBG and Spectral Imaging for Civil/Survey Engineering, Cultural, Environmental,
Industrial Applications
we reconstruct only the areas with possible defects,
rather than the entire infrastructure.
The created 3D point clouds support the engineers’
effort for a detailed assessment. Firstly, recreated
cross sections are compared to the original ones,
as defined in the construction blueprints; divergence
can thus be identified early, helping to prevent
deterioration or collapse and to save human lives.
Secondly, the 3D models can be
stored for future reference, creating a history log for
the infrastructure. This kind of information is cru-
cial to any structural monitoring system and is easily
accessible, since the point clouds are using standard
formats.
2 PROCESSING CHALLENGES
Ideally, the VI approach should have abilities
equivalent to human eye inspection. The process should
be affected neither by the angle and distance from the
tunnel surfaces nor by luminosity conditions. Proximity
sensors, advanced navigation, and lighting equipment
facilitate the acquisition process but can neither
guarantee ideal conditions nor bypass man-made
occlusions (e.g. wires).
Yet, even if we achieve ideal conditions, the defect
types make the problem increasingly difficult for the
detection mechanism. The term ”defect” can be inter-
preted in many ways; deformations, cracks, surface
disintegration, and other defects are widely known
and commonly appear.
Cracks are a common defect; there are structural
failure cracks, random cracks, crazing cracks,
shrinkage cracks and map-cracking, among many others.
Cracks can be described as curved lines of various
lengths and widths. Changes in texture, width variance
and discontinuities are also expected. They appear
primarily as a result of surface disintegration; in more
severe cases, they result from serious errors in
construction.
Disintegration of the surface is generally caused
by three types of distress: (a) dusting, due to carbon-
ation of the surface by unventilated heaters or by ap-
plying water during finishing, (b) ravelling or spalling
at joints, when pieces of concrete from the joint edges
are dislodged and, (c) breaking of pieces from the sur-
face of the concrete, generally caused by delamina-
tions and blistering.
Other defects include discolouration of the con-
crete, small voids (bugholes) in the surface of vertical
concrete placements, and honeycombing, which is the
presence of large voids in concrete. Figure 1 illus-
trates some of the described defects. Such a variety in
defect types hinders the feature extraction process; it
is difficult to construct appropriate descriptors.
Fortunately, aforementioned defects have some-
thing in common; crack appearance. Cracks appear
in concrete usually as secondary symptoms of other
defects. As such, the identification of a crack should
be the first step, prior to an extensive analysis in the
surrounding area, using laser scanners.
3 IMAGE ANALYSIS
CNNs can be used as hierarchical computer vision
(CV) schemes, among many other algorithms(e.g.
(Doulamis, 2014; Doulamis and Matsatsinis, 2012)),
in order to make the recognition process just-in-
time, and thus significantly reduce the time and effort
needed for visual inspections. Nevertheless, CNN ini-
tialization requires a lot of resources and time. A sec-
ondary image processing mechanism could save valu-
able resources (Protopapadakis and Doulamis, 2015).
The detection of defects can be seen as an image
segmentation problem, which entails the classification
of each pixel in the image into one of two classes:
the defects (cracks) class and the non-defects
class. Such a task requires the description of pixels by
a set of highly discriminative features that fuse visual
and spatial information. The CNN is responsible for the
creation of such features.
At first, grayscale image patches are created over the
RGB images of the tunnel surfaces. These patches constitute
the CNN’s input. Through a hierarchical construction
process, complex, high-level features are created for
each patch. These features are fed to an MLP that
conducts the classification task. As such, visual and
spatial information about a specific pixel, located in the
center of each patch, is related to its neighbouring pixels.
Concretely, in order to classify a pixel $p_{xy}$, located at point $(x, y)$ on the image plane, we use a square patch of size $s \times s$ centered at pixel $p_{xy}$. If we denote as $l_{xy}$ the class label of the pixel at location $(x, y)$ and as $b_{xy}$ the patch centered at pixel $p_{xy}$, then we can form a dataset $D = \{(b_{xy}, l_{xy})\}$ for $x = 1, 2, \cdots, w$ and $y = 1, 2, \cdots, h$.
These matrices are fed as input into the CNN. Then, the CNN hierarchically builds complex, high-level features that encode visual and spatial characteristics of pixel $p_{xy}$. The output of the CNN is sequentially connected with the MLP. Therefore, the obtained features are used as input by the MLP classifier, which is responsible for detecting the defects.
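The construction of the dataset D can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `extract_patch` and `build_dataset` are hypothetical, plain Python lists are used to stay dependency-free, and border pixels are simply skipped, a choice the text does not specify.

```python
from typing import List, Tuple

Image = List[List[float]]  # row-major grayscale image: image[y][x]


def extract_patch(image: Image, x: int, y: int, s: int) -> Image:
    """Return the s x s patch b_xy centered at column x, row y (s is assumed odd)."""
    r = s // 2
    return [row[x - r: x + r + 1] for row in image[y - r: y + r + 1]]


def build_dataset(image: Image, labels: List[List[int]], s: int) -> List[Tuple[Image, int]]:
    """Pair the patch b_xy of every interior pixel with its class label l_xy.

    Assumption: pixels closer than s // 2 to the border are skipped;
    padding the image would be an equally valid choice.
    """
    h, w = len(image), len(image[0])
    r = s // 2
    return [
        (extract_patch(image, x, y, s), labels[y][x])
        for y in range(r, h - r)
        for x in range(r, w - r)
    ]


# Toy 5 x 5 image whose only "defect" pixel is the central one.
image = [[float(5 * y + x) for x in range(5)] for y in range(5)]
labels = [[1 if (x, y) == (2, 2) else 0 for x in range(5)] for y in range(5)]
dataset = build_dataset(image, labels, s=3)
```

With s = 3, the toy image yields one sample per interior pixel; in the experiments reported later, s = 9 is used.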
Figure 1: Illustration of various concrete defect types in tunnel investigation: (a) map cracking, (b) shrinkage crack, (c)
flanking and (d) honeycomb.
Figure 2: Illustration of various acquired images that require further investigation.
3.1 Model Initialization
The crack detection is based solely on the CNN. Yet,
in order to facilitate the data set creation, we employ
an image processing technique. This technique exploits
the intensity of pixels and their spatial relations,
morphological operations and filtering schemes.
Additionally, we perform simple shape analysis on the
detection results. Such an approach does not require
annotated data. Yet, it has low generalization ability and
has to be fine-tuned for a specific dataset; there are
many parameters in the operators.
The data set creation mechanism needs only a few
images, which are easily obtained (usually within the first
few meters after the tunnel’s entrance). If photographic
material for the specific infrastructure is available from
previous examinations, a few pictures are selected
at random. After the image gathering is complete, the
image annotation process is performed.
Shape properties describe the length-to-width ratio
of the detected edges. Further, we are looking for
curves. Regarding intensity, pixels corresponding to
cracks are expected to be darker than their neighbouring
pixels. Thus, based on crack characteristics, our
approach consists of the following steps: (a) line
enhancement, (b) noise removal, (c) straight line
removal, (d) shape filtering, and (e) morphological
reconstruction.
Line enhancement occurs by comparing the intensity
of each pixel to that of its neighbours. Such an
approach results in "salt and pepper" noise. The next
step focuses on noise removal, exploiting a traditional
median filter. Straight lines are common and
correspond to man-made objects (e.g. wiring); they are
located with the Hough transform, by thresholding
the detected outputs.
Shape filtering using appropriate moments is an-
other crucial step. By locating minimum enclosing
circles we are able to exclude symmetrical areas. Fi-
nally, we perform a classical morphological operation
called ”opening by reconstruction”. Reconstruction
starts from a set of starting pixels and then grows in
flood-fill fashion to include complete connected com-
ponents. A step by step illustration of the image pro-
cessing approach is shown in fig. 4.
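The first two steps (line enhancement and noise removal) can be sketched as follows. This is an illustrative simplification under stated assumptions, not the authors' code: the helper names, the 3 × 3 window (`r = 1`) and the darkness ratio of 0.9 are hypothetical values chosen for a toy image.

```python
from statistics import median
from typing import List

Gray = List[List[float]]
Mask = List[List[int]]


def window(img, x: int, y: int, r: int) -> list:
    """Values in the (2r+1) x (2r+1) window around (x, y), clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]


def enhance_lines(img: Gray, r: int = 1, ratio: float = 0.9) -> Mask:
    """Mark pixels darker than `ratio` times the mean of their neighbourhood."""
    mask = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            vals = window(img, x, y, r)
            row.append(1 if img[y][x] < ratio * sum(vals) / len(vals) else 0)
        mask.append(row)
    return mask


def median_filter(mask: Mask, r: int = 1) -> Mask:
    """Replace every pixel by the median of its window, removing isolated detections."""
    return [[int(median(window(mask, x, y, r))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


# Toy 9 x 9 image: bright background, a dark band two pixels wide, one dark speck.
image = [[200.0] * 9 for _ in range(9)]
for y in range(9):
    image[y][2] = image[y][3] = 50.0
image[4][6] = 50.0

enhanced = enhance_lines(image)    # band and speck are both marked
cleaned = median_filter(enhanced)  # the isolated speck is removed, the band survives
```

Note that a 3 × 3 median filter also erases structures that are only one pixel wide, which is one reason the window sizes have to be tuned per dataset.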
3.2 Deep Learning Defect Recognition
In this section, for the sake of completeness, we
briefly describe the notion of CNNs. CNNs apply
trainable filters and pooling operations on their input
resulting in a hierarchy of increasingly complex fea-
tures. Convolutional layers consist of a rectangular
grid of neurons (filters), each of which takes inputs
from rectangular sections of the previous layer.
Each convolution layer is followed by a pooling
layer that subsamples block-wise the output of the
precedent convolutional layer and produces a scalar
output for each block. Formally, if we denote the k-th
output of a given convolutional layer as $h^k$, whose
filters are determined by the weights $W^k$ and bias $b_k$,
then $h^k$ is obtained as:
$$h^k_{ij} = g\left((W^k * x)_{ij} + b_k\right) \qquad (1)$$
where x stands for the input of the convolutional layer
and indices i and j correspond to the location of the
input where the filter is applied. Star symbol (*)
stands for the convolution operator and g(·) is a non-
linear function. Max pooling layers simply take some
k × k region and output the maximum value in that
region.
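Eq. (1) and the max pooling operation can be made concrete with a short, dependency-free sketch. The function names are hypothetical, and the choice of tanh for g is an assumption, since the text does not name the nonlinearity.

```python
import math
from typing import Callable, List

Matrix = List[List[float]]


def conv2d_valid(x: Matrix, w: Matrix, b: float,
                 g: Callable[[float], float] = math.tanh) -> Matrix:
    """Compute h_ij = g((W * x)_ij + b) only where the filter fits inside x."""
    kh, kw = len(w), len(w[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    out = []
    for i in range(oh):
        row = []
        for j in range(ow):
            # True convolution flips the kernel; cross-correlation (no flip)
            # is also common in CNN code and differs only in kernel orientation.
            acc = sum(w[kh - 1 - a][kw - 1 - c] * x[i + a][j + c]
                      for a in range(kh) for c in range(kw))
            row.append(g(acc + b))
        out.append(row)
    return out


def max_pool(x: Matrix, k: int) -> Matrix:
    """Output the maximum of every non-overlapping k x k block."""
    return [[max(x[i + a][j + c] for a in range(k) for c in range(k))
             for j in range(0, len(x[0]) - k + 1, k)]
            for i in range(0, len(x) - k + 1, k)]


# Identity nonlinearity makes the arithmetic easy to follow by hand.
feature_map = conv2d_valid([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                           [[1, 0], [0, 0]], b=0.0, g=lambda v: v)
pooled = max_pool([[1, 2, 3, 4], [5, 6, 7, 8],
                   [9, 10, 11, 12], [13, 14, 15, 16]], k=2)
```

A filter of size k applied in this "valid" manner shrinks each spatial dimension by k − 1, which is the effect referred to later when the border of the patch is not taken into consideration.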
Figure 3: Proposed CNN illustration. Every image patch of size 9 × 9 is convolved with 15 kernels of size 5 × 5 at first, then
30 convolution kernels of size 3 × 3 are used. The final convolution layer uses, also, 45 kernels of size 3 ×3.
Figure 4: Step by step illustration of the proposed image processing approach; (a) original image, (b) grayscale (c) enhanced
lines, (d) binary image , (e) noise removal, (f) area filtered, (g) straight lines removal and (h) final, annotated image.
Max pooling layers introduce scale invariance to
the constructed features, which is a very important
property for object detection/recognition tasks, where
scale variability problems may occur.
However, as pointed out in (Makantasis et al., 2015),
for the problem of tunnel defect detection, we employ
CNNs to construct features that encode spatio-visual
information, which indicates the presence or
absence of a defect at a specific pixel. Thus, scale
invariance, which is addressed through the use of a
Gaussian pyramid, does not constitute a significant
property for our learning model. For this reason, we do
not include pooling layers in our CNN architecture.
4 GEOMETRY INSPECTION
The tunnel inspection method presented in this
paper is performed using not only photogrammetric
equipment but also a laser scanner. In general, laser
scanners provide a number of useful functionalities
for modelling and monitoring tunnels. However, several
aspects should be considered while planning the
scan of a tunnel, to ensure the required accuracy
and reliability of the measurements together with a
cost-effective approach to tunnel surveying:
Laser scanner specifications: quality and resolu-
tion, connectivity options, even portability, in case
the device is attached on a moving vehicle (e.g.
robotic system).
Laser scanner accuracy: it is a feature that should
be considered separately, since it defines the
distinctive ability of the scanner and the objects and
features of the tunnel surface which can be detected.
Tunnel dimensions: the tunnel’s total length and width
should always be considered in the optimal design
of the survey control network.
Scan time requirements: the relation between
resolution/quality and scan duration should be examined,
since a time-consuming scan could considerably increase
the overall time requirements of the
tunnel inspection procedure.
Figure 5: Illustration of crack identification ground truth annotations based on image processing techniques. Such an approach
results in many false positive annotations; areas close to the actual cracks are also considered as cracks. Areas with significant
variation in intensity, texture, or entropy are also falsely classified as cracks.
Figure 6: Illustration of crack identification using CNN. In contrast to the image processing approach, outputs are much more
refined. The ground truth annotations are limited to the actual cracked regions. A few misclassified pixels may appear at the
crack borders.
In the current project the FARO® Laser Scanner
Focus3D X 130 was used, a mid-range instrument
with panoramic architecture, which uses phase
shift technology to measure distance. Table 1 contains
some significant technical features of the specific
laser scanner.
Table 1: FARO Focus3D specifications.
Distance accuracy    up to ±2 mm
Range                0.6 m up to 130 m
Noise reduction      up to 50%
Field of view        360° × 305°
It is obvious from the absolute values of the distance
accuracy that the Focus3D is not suitable for
the detection of small-sized features of the tunnel
surface (e.g. cracks). However, the device could be used
to detect features which exceed the minimum value
of 2 mm, or possible deformations of the tunnel cross
section. In the following section a method for the ex-
traction of the tunnel geometry from a point cloud is
presented, along with the results from several experi-
mental scans.
4.1 Tunnel Geometry and Point Clouds
In order to derive the geometry of the tunnel cross
section, a surface of known equation is presupposed.
Usually, the inner surface of a tunnel has a quadratic
form, e.g. circle, parabola, or an assembly of circular
arcs. In this paper, each scan, i.e. one half of the
tunnel cross section, is treated as a part of an elliptic
cylinder, which has a cross section of the following
form:
$$\frac{x^2}{\alpha^2} + \frac{y^2}{\beta^2} = 1 \qquad (2)$$
A nonlinear least-squares solver is utilized to solve
the surface fitting problem of the general form
$$\min_x \|f(x)\|_2^2 = \min_x \left( f_1(x)^2 + f_2(x)^2 + \dots + f_n(x)^2 \right) \qquad (3)$$
where x represents the vector of the unknown variables.
It should be noted that, since the laser scanner
is not located at the centre of the elliptic cylinder,
the translation and rotation parameters should also be
calculated. Therefore, a total of eight parameters
are calculated with the nonlinear least-squares
algorithm: three rotation parameters, three
translation parameters and the two parameters of the
ellipse. The trust-region-reflective method (Coleman
and Li, 1996; Coleman and Li, 1994), implemented in
the MATLAB environment, is used as the minimization
algorithm.
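One plausible way to write the residuals of Eq. (3) in terms of the eight parameters is given below. The text does not spell out the residual, so this formulation is an assumption: each measured point $p_i$ is moved into the frame of the cylinder (axis along $z'$) and the algebraic deviation from Eq. (2) is used; the symbols $R$, $t$, $\omega$, $\phi$, $\kappa$ and $\theta$ are introduced here for illustration only.

```latex
\begin{aligned}
(x_i', y_i', z_i')^{\top} &= R(\omega, \phi, \kappa)\,\bigl(p_i - t\bigr),
  \qquad t = (t_x, t_y, t_z)^{\top},\\
f_i(\theta) &= \frac{x_i'^{2}}{\alpha^{2}} + \frac{y_i'^{2}}{\beta^{2}} - 1,
  \qquad \theta = (\omega, \phi, \kappa, t_x, t_y, t_z, \alpha, \beta).
\end{aligned}
```

Under this formulation a shift of $t$ along the cylinder axis leaves every $f_i$ unchanged, so a solver may need bounds or a fixed value for that component in order to remain well conditioned.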
An initial surface estimation is performed to cal-
culate outliers. Specifically, measured points that are
not in close proximity to the surface, based on a
user specified threshold, are excluded from the cal-
culations. The parameter value estimation procedure
is applied once again, based on the corrected point
Figure 7: Aspect of the experimental tunnel (a) and a rough 3D model (b).
dataset. Features of the tunnel, of considerable size
compared to the distinctive ability of the laser scan-
ner, are detected as the discrepancy between the cal-
culated geometric surface and the measured points.
Temporal changes in the calculated surface parameters
could be used as an indication of tunnel deformations,
and provide structural engineers with measurements
regarding the extent of these deformations.
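The two-pass estimation (initial fit, outlier exclusion, refit) can be sketched on a deliberately simplified problem. The assumptions are significant: the sketch works in 2D on a centred, axis-aligned ellipse, so the fit becomes linear in $1/\alpha^2$ and $1/\beta^2$ and needs no nonlinear solver, and the threshold acts on the unitless algebraic residual rather than on a metric distance. It is not the eight-parameter MATLAB procedure described above, and all names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def fit_ellipse(points: List[Point]) -> Tuple[float, float]:
    """Fit x^2/a^2 + y^2/b^2 = 1 by linear least squares in A = 1/a^2, B = 1/b^2."""
    suu = suv = svv = su = sv = 0.0
    for x, y in points:
        u, v = x * x, y * y
        suu += u * u
        suv += u * v
        svv += v * v
        su += u
        sv += v
    det = suu * svv - suv * suv            # 2 x 2 normal equations, Cramer's rule
    A = (su * svv - sv * suv) / det
    B = (sv * suu - su * suv) / det
    return 1.0 / math.sqrt(A), 1.0 / math.sqrt(B)


def residual(p: Point, a: float, b: float) -> float:
    """Algebraic (unitless) deviation of a point from the fitted ellipse."""
    return abs(p[0] ** 2 / a ** 2 + p[1] ** 2 / b ** 2 - 1.0)


def robust_fit(points: List[Point], threshold: float):
    a, b = fit_ellipse(points)                                   # initial surface estimate
    kept = [p for p in points if residual(p, a, b) <= threshold]
    a, b = fit_ellipse(kept)                                     # refit on the corrected set
    return a, b, kept


# Synthetic half cross-section (a = 5, b = 4) plus one gross outlier.
scan = [(5.0 * math.cos(math.radians(t)), 4.0 * math.sin(math.radians(t)))
        for t in range(181)]
scan.append((0.0, 8.0))
a, b, kept = robust_fit(scan, threshold=0.5)
```

After the refit, points whose residual still exceeds the distinctive ability of the scanner would be the candidate surface features mentioned in the text.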
5 SYSTEM EVALUATION
The proposed system was developed on a conven-
tional laptop with quad-core CPU, 8GB RAM, us-
ing Theano library (Bastien et al., 2012) in Python.
The CNN is compared against the image processing
approach, described in sec. 5.1. Compatibility with
robotic parts has been also verified using YARP (Fitz-
patrick et al., 2008).
All the images originate from the Metsovo motorway
tunnel in Greece, which is a 3.5 km long twin tunnel.
The ventilation tunnel runs parallel to this bore, at a
distance of 20 m to the north. The main tunnel suffered
a significant deformation due to water inflow. Image
data were captured at this part of the tunnel, using a
hand-held DSLR camera. Figure 2 illustrates various
tunnel images from the data acquisition process.
Regions depicting defects were manually annotated by
experts in each of the captured images (about
100 images).
The proposed approach is applied on the raw image
data and its performance is evaluated against the
ground truth data. The unbalanced nature of the classes
would deteriorate the performance of the system, since
defects span very few areas. As such, we truncate the
non-defects class to contain the same number of samples
as the class that represents defects. The final
dataset that is used for training and testing is created
by concatenating the elements of the two classes.
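The balancing step can be sketched as below. How the non-defect samples are chosen is not stated in the text; random subsampling with a fixed seed is an assumption here, and the function name is hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[object, int]  # (patch, label) with label 1 = defect, 0 = non-defect


def balance_classes(samples: Sequence[Sample], seed: int = 0) -> List[Sample]:
    """Truncate the non-defect class to the size of the defect class, then concatenate."""
    rng = random.Random(seed)
    positives = [s for s in samples if s[1] == 1]
    negatives = [s for s in samples if s[1] == 0]
    # Assumption: negatives are drawn at random rather than taken in scan order.
    negatives = rng.sample(negatives, min(len(positives), len(negatives)))
    balanced = positives + negatives
    rng.shuffle(balanced)
    return balanced


# Toy example: 5 defect samples against 50 non-defect samples.
samples = [(f"patch_{i}", 1) for i in range(5)] + [(f"patch_{i}", 0) for i in range(5, 55)]
balanced = balance_classes(samples)
```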
5.1 Crack Detection using Image
Processing Techniques
The image processing technique is similar to the
work of (Protopapadakis and Doulamis, 2015). Line
enhancement is performed on 13 × 13 windows by
thresholding the 0.99% of the mean intensity value.
Then, areas spanning less than 550 pixels are consid-
ered noise and, thus, excluded. Hough transform dis-
tance and angle resolution were set to 5 pixels and 0
radians respectively. Finally, areas of defects should
span at least 30% of the minimum enclosing circle.
An indicative result of the annotated images can be
found in fig. 5.
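The area-based noise exclusion can be illustrated with a connected-component filter. This is a sketch under assumptions: 8-connectivity and the function name are choices made here, and the toy example uses a minimum area of 3 pixels instead of the 550 pixels reported above.

```python
from collections import deque
from typing import List

Mask = List[List[int]]


def remove_small_components(mask: Mask, min_area: int) -> Mask:
    """Keep only the 8-connected components that span at least `min_area` pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            component, queue = [], deque([(x, y)])
            seen[y][x] = True
            while queue:                       # breadth-first flood fill
                cx, cy = queue.popleft()
                component.append((cx, cy))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            if len(component) >= min_area:     # small areas are treated as noise
                for cx, cy in component:
                    out[cy][cx] = 1
    return out


noisy = [
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
filtered = remove_small_components(noisy, min_area=3)
```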
5.2 Crack Detection using CNN
The input of the CNN consists of patches of dimensions s × s.
The parameter s determines the number of neighbours
of each pixel that will be taken into consideration
during the classification task. During experimentation
we set the parameter s equal to 9, in order
to take into consideration the closest 80 neighbours of
each pixel (a 9 × 9 window around it).
The value of s can be increased. Yet, an increase
in s results in an increase of computational cost.
In our case, setting the parameter s to a value greater
than 9 resulted in no further performance improvement;
values of s over 13 resulted in worse classification
accuracy.
Having estimated the value of s, we can proceed with
the CNN architecture design. The first layer of the
proposed CNN is a convolutional layer with $C_1 = 15$
trainable filters of dimensions 5 × 5. This layer
delivers $C_1$ matrices of dimensions 5 × 5 (during
convolution we do not take into consideration the border
of the patch). Since we do not employ
a max pooling layer, the output of the first convolutional
layer is fed directly to the second convolutional layer
(30 kernels of size 3 × 3). Then, the third layer (45
Figure 8: A more detailed 3D model cross section of the inspected tunnel for left (a) and right (b) side respectively.
Table 2: Metrics for quantitative performance evaluation.
Metric Formula
Sensitivity - (TPR) TPR = TP / P
Specificity - (SPC) SPC = TN / N
Precision - (PPV) PPV = TP / (TP + FP)
Neg. predictive value - (NPV) NPV = TN / (TN + FN)
False pos. rate - (FPR) FPR = FP / N
False discovery rate - (FDR) FDR = 1 - PPV
Miss Rate - (FNR) FNR = FN / P
Accuracy - (ACC) ACC = (TP + TN) / (P + N)
F1 score - (F1) F1 = 2 TP / (2 TP + FP + FN)
kernels of size 3 × 3) creates the input vectors for the
MLP. An indicative result of the annotated images via
CNN can be found in fig. 6.
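The feature map sizes implied by this architecture can be checked with a few lines of arithmetic. The sketch assumes stride 1 and no padding in all three layers; the text confirms this only for the first layer (9 × 9 input, 5 × 5 output), so the resulting length of the MLP input vector is an inference rather than a reported figure.

```python
def valid_output_size(input_size: int, kernel_size: int) -> int:
    """Spatial size after a stride-1 convolution that ignores the border."""
    return input_size - kernel_size + 1


# (number of kernels, kernel size) for the three convolutional layers in the text.
layers = [(15, 5), (30, 3), (45, 3)]

size = 9  # patch size s
shapes = []
for n_kernels, kernel_size in layers:
    size = valid_output_size(size, kernel_size)
    shapes.append((n_kernels, size, size))

# Flattened output of the last layer, i.e. the assumed length of the MLP input.
mlp_input_length = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
```

Under these assumptions the spatial size shrinks from 9 to 5, 3 and finally 1, so each patch is summarised by one value per kernel of the last layer.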
5.3 Performance Evaluation
In this paper we have two possible classes, cracks
and non-cracks, named the positive (P) and negative (N)
class, respectively. Given the outputs, we form the
confusion table, which is a 2 × 2 matrix that reports
the number of false positives (FP), false negatives
(FN), true positives (TP), and true negatives (TN).
Given these values we are able to calculate various
metrics regarding the defect detection
performance.
The formulation of the calculated metrics is shown in table
2. Metrics of special interest are sensitivity
(proportional to TP) and miss rate (proportional to FN),
which are both strongly connected to crack detection.
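The formulas of table 2 translate directly into code. The function name and the confusion counts in the example are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import Dict


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute the metrics of table 2 from the entries of the confusion table."""
    p, n = tp + fn, tn + fp          # actual positives and actual negatives
    ppv = tp / (tp + fp)
    return {
        "TPR": tp / p,
        "SPC": tn / n,
        "PPV": ppv,
        "NPV": tn / (tn + fn),
        "FPR": fp / n,
        "FDR": 1 - ppv,
        "FNR": fn / p,
        "ACC": (tp + tn) / (p + n),
        "F1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts for a balanced test set of 100 crack and 100 non-crack pixels.
metrics = detection_metrics(tp=90, fp=20, tn=80, fn=10)
```

Since P = N in a balanced dataset, accuracy reduces to the mean of sensitivity and specificity; the values of Table 3 are consistent with this, e.g. (0.8981 + 0.8387) / 2 = 0.8684 for the CNN.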
5.4 Experimental Scans in Tunnels
The aspects of sec.4 are examined in a relatively
small-sized test tunnel, located at the campus of the
National and Kapodistrian University of Athens. The
tunnel has a total length of approximately 60 m, and
10 m width. The cross section shape of the tunnel ap-
proximates a semicircle, and the texture of the inner
surface is characterized by regular patterns, due to the
construction process. Several rough edges and sur-
faces which exceed 2 mm, are also present and could
be used as surface features to be detected with a laser
scanner.
Several scans were performed with different com-
binations of quality, resolution, vertical and horizon-
tal range. Figure 7 illustrates an aspect of the tunnel,
as well as a rough 3D depiction, created by the point
cloud. In both scans, in terms of the Focus3D
parameters, the resolution was 1/4 and the quality 4x (i.e.
122 kpt/s).
A number of more detailed scans with higher res-
olution and quality were performed, in order to col-
lect datasets of several tunnel cross sections. Figure 8
presents the results of a cross-sectional scan, with
resolution 1/1 and quality 3x (i.e. 244 kpt/s). The
vertical scan range was from 62.5° to 90°, while the
horizontal range was from 180° to 230°. It should be
noted that two independent scans are required to scan
a complete cross-section of the tunnel.
6 CONCLUSIONS
In this paper, we point out the suitability of deep learning
architectures for the tunnel defect inspection problem.
The proposed approach employs both a CNN and a
laser scanner in a holistic assessment scheme,
significantly extending the capabilities of a deep learning
approach.
Table 3: Performance evaluation score for both CNN and image processing approaches.
Average scores ACC TPR SPC NPV PPV FNR FPR F1
CNN 0.8684 0.8981 0.8387 0.8919 0.8481 0.1019 0.1613 0.8722
Image Processing 0.6337 0.5866 0.6808 0.6284 0.6495 0.4134 0.3192 0.6112
The system evaluates raw images through the
CNN detector. In case of a positive identification of a
crack the laser scanner is used to provide further de-
tails via a method to exploit point clouds, provided by
terrestrial scanners.
The proposed method is based on the parametriza-
tion of the tunnel surface, using a nonlinear minimiza-
tion solver and could be employed to detect possible
deformations or features of considerable size.
Future work may involve CNN topology opti-
mization schemes, for further performance improve-
ment, or hardware implementation to improve detec-
tion times. Additionally, different geometric surfaces
can be investigated, in order to achieve better approx-
imation of the inner tunnel surfaces.
ACKNOWLEDGEMENTS
The research leading to these results has received
funding from the EC FP7 project ROBO-SPECT
(Contract N.611145). The authors wish to thank all partners
within the ROBO-SPECT consortium.
REFERENCES
Abdel-Qader, I., Abudayyeh, O., and Kelly, M. E. (2003).
Analysis of edge-detection techniques for crack iden-
tification in bridges. Journal of Computing in Civil
Engineering, 17(4):255–263.
Argüelles-Fraga, R., Ordóñez, C., García-Cortés, S., and
Roca-Pardiñas, J. (2013). Measurement planning for
circular cross-section tunnels using terrestrial laser
scanning. Automation in Construction, 31:1–9.
Bastien, F., Lamblin, P., Pascanu, R., Bergstra, J., Good-
fellow, I., Bergeron, A., Bouchard, N., Warde-Farley,
D., and Bengio, Y. (2012). Theano: new features and
speed improvements. arXiv:1211.5590 [cs].
Coleman, T. and Li, Y. (1994). On the convergence of
reflective Newton methods for large-scale nonlinear
minimization subject to bounds, vol. 67. Ithaca, NY,
USA: Cornell University.
Coleman, T. F. and Li, Y. (1996). An interior trust region ap-
proach for nonlinear minimization subject to bounds.
SIAM Journal on Optimization, 6(2):418–445.
Doulamis, A. (2014). Event-driven video adaptation: A
powerful tool for industrial video supervision. Mul-
timedia Tools and Applications, 69(2):339–358.
Doulamis, A. and Matsatsinis, N. (2012). Visual under-
standing industrial workflows under uncertainty on
distributed service oriented architectures. Future Gen-
eration Computer Systems, 28(3):605–617.
Fitzpatrick, P., Metta, G., and Natale, L. (2008). Towards
long-lived robot genes. Robotics and Autonomous
Systems, 56(1):29–45.
German, S., Brilakis, I., and DesRoches, R. (2012). Rapid
entropy-based detection and properties measurement
of concrete spalling with machine vision for post-
earthquake safety assessments. Advanced Engineer-
ing Informatics, 26(4):846–858.
Halfawy, M. R. and Hengmeechai, J. (2014). Automated de-
fect detection in sewer closed circuit television images
using histograms of oriented gradients and support
vector machine. Automation in Construction, 38:1–
13.
Han, J.-Y., Guo, J., and Jiang, Y.-S. (2013). Monitoring tun-
nel deformations by means of multi-epoch dispersed
3D lidar point clouds: An improved approach. Tunnelling
and Underground Space Technology, 38:385–389.
Hinton, G. E., Osindero, S., and Teh, Y.-W. (2006). A
Fast Learning Algorithm for Deep Belief Nets. Neural
Computation, 18(7):1527–1554.
Hinton, G. E. and Salakhutdinov, R. R. (2006). Reduc-
ing the Dimensionality of Data with Neural Networks.
Science, 313(5786):504–507.
Jahanshahi, M. R., Masri, S. F., Padgett, C. W., and
Sukhatme, G. S. (2013). An innovative methodology
for detection and quantification of cracks through in-
corporation of depth perception. Machine Vision and
Applications, 24(2):227–241.
Kim, Y.-S. and Haas, C. T. (2000). A model for automation
of infrastructure maintenance using representational
forms. Automation in Construction, 10(1):57–68.
Koch, C. and Brilakis, I. (2011). Pothole detection in as-
phalt pavement images. Advanced Engineering Infor-
matics, 25(3):507–515.
Liu, Z., Suandi, S. A., Ohashi, T., and Ejima, T. (2002).
Tunnel crack detection and classification system based
on image processing. In Electronic Imaging 2002,
pages 145–152. International Society for Optics and
Photonics.
Makantasis, K., Protopapadakis, E., Doulamis, A. D.,
Doulamis, N. D., and Loupos, C. (2015). Deep Con-
volutional Neural Networks for Efficient Vision Based
Tunnel Inspection. Cluj-Napoca, Romania.
Mohanty, A. and Wang, T. T. (2012). Image mosaicking of
a section of a tunnel lining and the detection of cracks
through the frequency histogram of connected ele-
ments concept. volume 8335, pages 83351P–83351P–
9.
Monserrat, O. and Crosetto, M. (2008). Deformation mea-
surement using terrestrial laser scanning data and least
squares 3D surface matching. ISPRS Journal of Photogrammetry
and Remote Sensing, 63(1):142–154.
Nuttens, T., Stal, C., De Backer, H., Schotte, K., Van Bo-
gaert, P., and De Wulf, A. (2014). Methodology for
the ovalization monitoring of newly built circular train
tunnels based on laser scanning: Liefkenshoek rail
link (Belgium). Automation in Construction, 43:1–9.
Pejić, M. (2013). Design and optimisation of laser scanning
for tunnels geometry inspection. Tunnelling and
Underground Space Technology, 37:199–206.
Protopapadakis, E. and Doulamis, N. (2015). Image
Based Approaches for Tunnels Defects Recognition
via Robotic Inspectors. In Bebis, G., Boyle, R.,
Parvin, B., Koracin, D., Pavlidis, I., Feris, R., Mc-
Graw, T., Elendt, M., Kopper, R., Ragan, E., Ye, Z.,
and Weber, G., editors, Advances in Visual Computing,
number 9474 in Lecture Notes in Computer Science,
pages 706–716. Springer International Publishing.
DOI: 10.1007/978-3-319-27857-5_63.
Pynn, J., Wright, A., and Lodge, R. (1999). Automatic iden-
tification of cracks in road surfaces. In Image Process-
ing and Its Applications, 1999. Seventh International
Conference on (Conf. Publ. No. 465), volume 2, pages
671–675 vol.2.
Sinha, S. K. and Fieguth, P. W. (2006). Automated detection
of cracks in buried concrete pipe images. Automation
in Construction, 15(1):58–72.
Son, H., Kim, C., and Kim, C. (2012). Automated Color
Model-Based Concrete Detection in Construction-Site
Images by Using Machine Learning Algorithms.
Journal of Computing in Civil Engineering,
26(3):421–433.
Tung, P.-C., Hwang, Y.-R., and Wu, M.-C. (2002). The
development of a mobile manipulator imaging system
for bridge crack inspection. Automation in Construc-
tion, 11(6):717–729.
Yu, S.-N., Jang, J.-H., and Han, C.-S. (2007). Auto inspec-
tion system using a mobile robot for detecting con-
crete cracks in a tunnel. Automation in Construction,
16(3):255–261.