Recognition of Urban Transport Infrastructure Objects Via Hyperspectral Images

Oleg Saprykin, Alexander Fedoseev and Tatyana Mikheeva

Samara State Aerospace University, 34, Moskovskoye Shosse, Samara, Russia

Scientific and Production Centre «Intelligent Transportation Systems», Samara, Russia

Keywords: Convolutional Neural Networks, Deep Machine Learning, Hyperspectral Images, Transport Infrastructure, Image Recognition.

Abstract: Keeping vector maps of urban transport infrastructure, including the street and road network, up to date under constant change is a resource-intensive task that calls for automation. This article considers the problem of recognizing transport infrastructure objects in hyperspectral images with deep convolutional neural networks. Hyperspectral images from several sources are considered. We propose a new approach to forming the receptive fields of convolutional neural networks: the receptive field covers several pixels, but its depth along the colour channels is limited. In the proposed approach the receptive field moves in three dimensions: the two spatial dimensions and the spectral channel dimension. This makes it possible to recognize transport infrastructure objects by both spatial patterns and spectrum.

1 INTRODUCTION

The current pace of development of large cities entails constant change in their transport infrastructure. This is especially noticeable when a city is preparing to host a major sporting or cultural event. In general, changes in the transport infrastructure are driven by several factors:

• steady increase in the level of motorization in the cities;
• construction of new residential buildings;
• reconstruction and construction of engineering facilities;
• construction of new sociocultural and sports facilities;
• expansion of the city boundaries;
• growing demand of citizens for transport accessibility.

Changes in transport infrastructure are in most cases systematized, but at present there are no clear mechanisms for notifying all the organizations and services involved. Non-governmental organizations that distribute cartographic information or offer services based on it experience particular difficulties. Keeping vector maps of the city street and road network up to date under constant change becomes a task that requires substantial resources.

Timely updating of map data can be achieved by automating the process. One method is the recognition of satellite images of the area. However, ordinary photographs provide incomplete data and, as a consequence, poor recognition quality. When working with a city map it is advisable to use hyperspectral images, because they contain more information at each point of the image, which can substantially improve the quality of transport infrastructure recognition.

Hyperspectral measurements of physical and chemical properties help to assess the condition of road-transport infrastructure objects. This line of research is addressed in (Resende et al., 2014; Mei et al., 2014; Cavalli et al., 2008; Wei et al., 2009; Herold et al., 2004a; Gomez, 2002; Miraliakbari and Hahn, 2014).

Saprykin, O., Fedoseev, A. and Mikheeva, T.
Recognition of Urban Transport Infrastructure Objects Via Hyperspectral Images.
In Proceedings of the International Conference on Vehicle Technology and Intelligent Transport Systems (VEHITS 2016), pages 203-208
ISBN: 978-989-758-185-4

A hyperspectral image is a three-dimensional data array that combines spatial information about the scene with spectral information for each spatial coordinate. Each pixel of a hyperspectral image is characterized by its spectral signature. The information is recorded in tens to hundreds of neighboring narrow bands (each about 5-10 nm wide). Hyperspectral information is frequently represented as a “hypercube” (Figure 1).
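The hypercube layout can be illustrated with a small array. The following is a minimal sketch, assuming NumPy and a (rows, columns, bands) axis order; the cube size and the random values are purely illustrative and are not taken from the paper's data.

```python
import numpy as np

# Hypothetical cube: 4 x 5 pixels, 36 spectral bands (sizes are illustrative).
rows, cols, bands = 4, 5, 36
rng = np.random.default_rng(0)
cube = rng.random((rows, cols, bands))

# The spectral signature of one pixel is a vector of length `bands`.
spectrum = cube[2, 3, :]

# A single-band image is a 2-D slice of the cube.
band_image = cube[:, :, 10]

# Flattening the spatial axes gives one spectrum per row, a common
# layout for per-pixel classifiers.
pixels = cube.reshape(-1, bands)
```

With this layout, spatial processing works on slices along the last axis, while per-pixel spectral processing works on rows of `pixels`.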

To solve the problems mentioned above effectively, hyperspectral data must have high spatial resolution and must span the spectral region from 0.4 to 2.5 μm. An important aspect is the development of a spectral library of road pavements of different classes and conditions, and of materials typical of urban territory, on the basis of field data acquired with a hand-held spectroradiometer.

Figure 1: Schema of “hypercube” formation.

Monitoring transport infrastructure objects has a number of specific features, which also make preliminary processing necessary.

Firstly, in the three-dimensional structure of the land surface, road-transport infrastructure objects form the “bottom layer”, which can be covered or shadowed by surrounding objects such as trees, buildings or vehicles.

Secondly, hyperspectral data processing would be considerably easier if all image pixels were “pure”, i.e. if each pixel contained information about a single object only. However, natural surfaces rarely consist of a homogeneous material. Furthermore, the sensor registers the total radiation from all objects inside a spatial resolution element as a single image pixel. Therefore, in general the user deals with so-called “mixed pixels”. The mixture of two or more materials inside a single pixel can be described by linear and non-linear models (Keshava, 2003; Kukharenko, 2013).
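In the linear model a mixed pixel is treated as a weighted sum of the spectra of the pure materials (endmembers), with weights equal to their abundances. The sketch below is a minimal illustration under that assumption: the endmember spectra and abundances are made-up numbers, and plain least squares stands in for a full unmixing algorithm.

```python
import numpy as np

# Hypothetical endmember spectra (columns): 6 bands, 2 materials.
endmembers = np.array([
    [0.10, 0.60],
    [0.12, 0.58],
    [0.15, 0.55],
    [0.20, 0.50],
    [0.22, 0.45],
    [0.25, 0.40],
])

# A mixed pixel: 70 % of material 0 and 30 % of material 1 (noise-free).
true_abundances = np.array([0.7, 0.3])
mixed_pixel = endmembers @ true_abundances

# Unconstrained least squares; practical unmixing usually also enforces
# non-negativity and sum-to-one on the abundances.
estimated, *_ = np.linalg.lstsq(endmembers, mixed_pixel, rcond=None)
```

For noise-free data the estimate reproduces the abundances used to build the pixel; with real data the constraints mentioned in the comment become important.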

Thirdly, remotely sensed hyperspectral data contain information not only about the surface but also about the state of the atmosphere. Atmospheric correction is intended to remove this distorting factor and to convert the image from spectral radiance units to spectral reflectance units (Mikheeva and Fedoseev, 2014; Zhuravel and Fedoseev, 2013; Yuanliu et al., 2007; Schowengerdt, 2010; Schott, 2007).
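One simple way to perform such a conversion is the empirical line method, which fits a per-band linear relation between sensor radiance and the known reflectance of reference targets visible in the image. The sketch below assumes two reference targets (one dark, one bright); the function name and all numeric values are hypothetical.

```python
import numpy as np

def empirical_line(radiance, dark_radiance, bright_radiance,
                   dark_reflectance, bright_reflectance):
    """Per-band linear map from at-sensor radiance to surface reflectance.

    The reference arguments are 1-D arrays with one value per band; using
    exactly two reference targets is an assumption of this sketch.
    """
    gain = (bright_reflectance - dark_reflectance) / (bright_radiance - dark_radiance)
    offset = dark_reflectance - gain * dark_radiance
    # Broadcasting applies each band's gain and offset along the last axis.
    return gain * radiance + offset

# Hypothetical reference measurements for a 3-band image.
dark_L = np.array([10.0, 12.0, 15.0])
bright_L = np.array([90.0, 100.0, 110.0])
dark_R = np.array([0.05, 0.05, 0.05])
bright_R = np.array([0.85, 0.85, 0.85])

# A 2 x 2 image whose radiance lies midway between the two targets in every band.
cube = np.stack([np.full((2, 2), v) for v in (50.0, 56.0, 62.5)], axis=-1)
reflectance = empirical_line(cube, dark_L, bright_L, dark_R, bright_R)
```

Because the test radiances lie midway between the dark and bright targets, the resulting reflectance lies midway between 0.05 and 0.85 in every band.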

Finally, the spectral profiles of transport infrastructure objects are frequently similar to those of other typical man-made urban objects (roofs of buildings, engineering structures). This can negatively affect the results of hyperspectral data processing (Herold et al., 2004a).

To obtain satisfactory results when using high-resolution hyperspectral images for monitoring and assessing the condition of road-transport infrastructure objects, several processing stages must be applied, provided that the initial data have been prepared correctly (Cavalli et al., 2008; Chang, 2000; Gualtieri and Cromp, 1999; Ratle et al., 2010).

In general, thematic processing can be divided into two main stages (Resende et al., 2014):
• detection and extraction of objects of interest;
• classification of road-transport infrastructure objects.

To extract the road pavement, supervised classification algorithms are used. These algorithms require spectral samples (Herold et al., 2004b).

In this case the spectral samples are stored in a spectral library, which is filled with measurements from field and airborne hyperspectrometers. Supervised classification algorithms follow one of two approaches: deterministic or statistical.

The deterministic approach is used when object classes do not overlap in the feature space (Schowengerdt, 2010). However, natural and man-made objects are generally inhomogeneous, and the spectral characteristics of the objects under study are similar or partially overlap (for example, for different types of soils and road pavements).


Therefore, classification methods based on the statistical approach have become popular: they take the variation of features into account and accept that some pixels are assigned to the wrong class, provided this happens infrequently (Chandra, 2008).
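A common instance of the statistical approach is Gaussian maximum likelihood classification, in which every class is modelled by the mean and covariance of its training spectra. The sketch below is a minimal illustration with synthetic three-band samples; the two classes ("asphalt" and "soil"), their statistics and the function names are assumptions made for the example.

```python
import numpy as np

def fit_gaussian_classes(samples, labels):
    """Estimate a mean vector and covariance matrix for every class."""
    params = {}
    for c in np.unique(labels):
        x = samples[labels == c]
        mean = x.mean(axis=0)
        # A small ridge keeps the covariance invertible for tiny sample sets.
        cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])
        params[int(c)] = (mean, cov)
    return params

def classify(pixel, params):
    """Return the class with the highest Gaussian log-likelihood."""
    best_class, best_score = None, -np.inf
    for c, (mean, cov) in params.items():
        diff = pixel - mean
        _, logdet = np.linalg.slogdet(cov)
        score = -0.5 * (diff @ np.linalg.solve(cov, diff) + logdet)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Synthetic training spectra: class 0 ("asphalt") and class 1 ("soil").
rng = np.random.default_rng(0)
asphalt = rng.normal(loc=[0.10, 0.12, 0.15], scale=0.02, size=(50, 3))
soil = rng.normal(loc=[0.25, 0.30, 0.35], scale=0.05, size=(50, 3))
samples = np.vstack([asphalt, soil])
labels = np.array([0] * 50 + [1] * 50)
params = fit_gaussian_classes(samples, labels)
```

Because each class carries its own covariance, a pixel in a region where the classes overlap is assigned to the class under which it is more probable, so occasional misassignments are accepted as part of the model.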

Despite extensive research on the application of hyperspectral images, their use for recognizing transport infrastructure objects is associated with the difficulties described above. One way to mitigate these difficulties is to apply artificial intelligence techniques (Saprykin and Saprykina, 2015). In recent years convolutional neural networks have proved themselves in the field of image processing. Recognition of images consisting of three color channels is being actively researched (Krizhevsky et al., 2012; Simonyan and Zisserman, 2014). However, the processing of hyperspectral images has not been studied sufficiently, and more research is needed to find the optimal network architecture and training algorithms. This article considers the problem of recognizing transport infrastructure objects in hyperspectral images with deep convolutional neural networks.

2 CONVOLUTIONAL NEURAL NETWORKS

Recent research has shown the great success of convolutional neural networks in image recognition. The architecture and training algorithms of such networks are similar to those of ordinary feedforward networks, but they are optimized for handling large amounts of input data. The input layer of a convolutional neural network is represented as a 3-D data set. As the data pass through the layers of the network the size of the array changes, and eventually it is reduced to a one-dimensional array, which is easily handled by a conventional feedforward neural network (Figure 2). Such a transformation, while retaining high learning ability, requires a large number of layers, so it is reasonable to use a deep convolutional neural network (Simonyan and Zisserman, 2014).

A convolutional neural network consists of the following types of intermediate layers: convolutional layers, max pooling layers and fully connected layers. Convolutional layers identify the features of objects by matching them against previously learned patterns (filters). Max pooling layers select the strongest signal from the region under consideration and reduce the size of the data array. At the final stage of data processing a fully connected layer is used, which directly determines the class of the object described by the input data set (Krizhevsky et al., 2012).

Figure 2: Scheme of the reduction of the input data set in a convolutional neural network.

A convolutional neural network is not fully connected. Each neuron of an intermediate layer is connected only to a small number of neurons in the previous layer, which are grouped in a small local area called the receptive field. An important point, which accelerates both training and operation of the network, is that the same weights are used for all receptive fields of a layer (parameter sharing).

When designing a convolutional layer, parameters such as the depth of the output array, the stride (the step with which the receptive field moves across the input) and the zero-padding can be varied. The depth of the output array controls the number of features that the layer can recognize. Zero-padding is used when the original image size has to be preserved.
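The effect of these parameters on the output size follows the usual relation: output size = (input size - field size + 2 x padding) / stride + 1 along each spatial axis. A minimal sketch of this arithmetic follows; the function name and the example sizes are illustrative assumptions.

```python
def conv_output_size(input_size, field_size, stride=1, padding=0):
    """Spatial size of a convolutional layer output along one axis."""
    span = input_size - field_size + 2 * padding
    if span < 0 or span % stride != 0:
        raise ValueError("receptive field does not tile the padded input")
    return span // stride + 1

# A 3x3 field with stride 1 and zero-padding 1 keeps a 32-pixel side unchanged.
same = conv_output_size(32, 3, stride=1, padding=1)

# Without padding the side shrinks by field_size - 1.
valid = conv_output_size(32, 3)

# The depth of the output array equals the number of filters, here 16.
output_shape = (same, same, 16)
```

The example shows why zero-padding of (field size - 1) / 2 is the usual choice for preserving the image size when the stride is 1.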

Because the receptive fields are small, a convolutional layer may incorrectly detect a feature that does not belong to an object. To prevent such


mistakes, the area under consideration has to be enlarged (zoomed out); the max pooling layer is used for this purpose. Neurons in this layer have no parameters, so they require no training. Their work comes down to choosing the strongest signal from the processed area. After the array passes through the max pooling layer, only the most characteristic features of the object remain.
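The operation of a max pooling layer can be shown on a small feature map. This is a minimal sketch, assuming non-overlapping 2x2 windows and a map with even side lengths; the input values are arbitrary.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling of a (H, W) map with even H and W."""
    h, w = feature_map.shape
    # Split each axis into (block index, position inside block).
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
])
pooled = max_pool_2x2(feature_map)
# pooled is [[4, 2], [2, 8]]: the strongest response of each 2x2 block.
```

The layer has no weights: the output depends only on the input values, which is why no training is needed for it.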

3 CONVOLUTIONAL NEURAL NETWORK FOR HYPERSPECTRAL IMAGES

The initial data for the experiments are hyperspectral images of the Samara region acquired in 2013-2014 in 36, 48 and 72 spectral bands in the range 0.35–1.05 μm. Field quasi-synchronous measurements of typical Samara transport infrastructure objects, made with a FieldSpec-4 spectroradiometer, were used as reference patterns (Figure 3). In addition, to study methods of thematic processing of hyperspectral data we use data acquired by the AVIRIS and HYDICE sensors in 224 and 191 spectral bands, respectively. The spectral range is 0.36–2.5 μm for the AVIRIS data and 0.4–2.47 μm for the HYDICE data. Preliminary processing of the initial data consists of filtering out vacant channels and atmospheric correction. The FLAASH module of the ENVI software package was used for atmospheric correction. We also used another atmospheric correction method, the empirical line method. This method gave more accurate results, but it can be used only when spectral reference patterns are available in the image being processed. It is desirable to have artificial materials in the image as patterns, or the patterns could be artificial materials measured at the same time as the aerospace data are acquired. Dimension reduction was also applied at the preliminary processing stage. The most popular dimension reduction methods are principal component analysis (PCA) (Gorban et al., 2008; Rodarmel and Shan, 2002) and independent component analysis (Robila, 2005). PCA was used in this research.
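PCA-based reduction of the spectral dimension can be sketched as follows. This is a minimal illustration using NumPy's singular value decomposition, not the processing chain used in the experiments; the synthetic cube, its size and the number of retained components are assumptions.

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Project the spectra of a (H, W, B) cube onto the leading principal components."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b)
    centered = pixels - pixels.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return reduced.reshape(h, w, n_components), explained[:n_components]

rng = np.random.default_rng(0)
# Hypothetical 8 x 8 cube with 36 bands built from 2 latent spectra plus weak noise.
latent = rng.random((64, 2))
basis = rng.random((2, 36))
cube = (latent @ basis + 0.001 * rng.standard_normal((64, 36))).reshape(8, 8, 36)
reduced_cube, explained = pca_reduce(cube, n_components=3)
```

In this synthetic case almost all of the variance is captured by the first two components, which mirrors the strong correlation between neighboring bands that makes PCA useful for hyperspectral data.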

Convolutional neural networks are widely used in image classification because they provide good recognition quality with relatively little effort. However, with hyperspectral images this advantage can be substantially reduced by the high dimensionality of the data, since each point of the image is represented by a vector of a hundred or more values. One existing approach uses a single image point as the receptive field, with the full spectral vector (Hu et al., 2015). Its disadvantage is insensitivity to spatial patterns and, as a consequence, the inability to recognize objects by their spatial features.

Figure 3: Spectral characteristics of typical transport infrastructure objects in Samara region.

We propose a new approach to the formation of receptive fields that retains the advantages of convolutional networks and uses the information from all color channels of a hyperspectral image. In the proposed approach the receptive field covers several pixels, but the number of color channels it covers at once (its depth) is limited. During operation of the network the receptive field moves not only in the horizontal plane, as in current implementations, but also along the depth of the color channels, thus covering the whole available spectrum. The stride along the color channels must be smaller than the depth of the receptive field. As a result, neighboring positions of the field overlap in the color channels, which increases the number of processed images in different spectra and is thus expected to improve the quality of recognition.
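The overlap condition can be made concrete by listing the spectral positions of the receptive field. In the sketch below the number of bands (36) matches one of the data sets mentioned earlier, while the field depth, the stride and the function name are illustrative assumptions.

```python
def spectral_windows(n_bands, depth, stride):
    """Start/stop band indices of the receptive field along the spectral axis."""
    if not 0 < stride < depth <= n_bands:
        raise ValueError("expected 0 < stride < depth <= n_bands")
    return [(start, start + depth)
            for start in range(0, n_bands - depth + 1, stride)]

# 36 bands, field depth 8, spectral stride 4: neighboring windows share 4 bands.
windows = spectral_windows(36, depth=8, stride=4)
```

With these values the field takes eight spectral positions, from bands 0-7 to bands 28-35, and every band except those at the two ends of the spectrum is seen by two different positions.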

The described approach to receptive field formation requires changes to the standard structure and training algorithm of the convolutional neural network, since each depth position along the color channels requires its own set of weights (filters, in convolutional network terms). This requirement arises because the same spatial filters may respond to completely different features in different spectral channels. To meet it, an extra dimension is added to the array of trained filters. Movement along this dimension is synchronized with the movement of the receptive field to a new depth of spectral channels. With this organization, weights are shared only during horizontal movement of the receptive field; when the field moves deeper into the spectral channels, weights are not shared. Thus the structure of the neural network


differentiates the data streams for different spectral

channels.
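One possible reading of this layer is sketched below in plain NumPy. It is an illustration of the idea only, not the authors' TensorFlow implementation: the function name, the array layouts, the use of unpadded ("valid") positions and all sizes are assumptions. The first axis of `filters` plays the role of the extra dimension indexed by the spectral position of the receptive field.

```python
import numpy as np

def spectral_window_conv(cube, filters, spectral_stride):
    """Valid convolution with a separate filter bank per spectral position.

    cube:    (H, W, B) hyperspectral image
    filters: (n_windows, n_filters, k, k, depth); axis 0 is the extra
             dimension indexed by the spectral position of the field
    returns: (n_windows, H - k + 1, W - k + 1, n_filters)
    """
    n_windows, n_filters, k, _, depth = filters.shape
    h, w, b = cube.shape
    if (n_windows - 1) * spectral_stride + depth > b:
        raise ValueError("spectral windows exceed the number of bands")
    out = np.zeros((n_windows, h - k + 1, w - k + 1, n_filters))
    for s in range(n_windows):
        lo = s * spectral_stride
        # Weights of this spectral position only: no sharing across s.
        bank = filters[s].reshape(n_filters, -1)
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                # The same bank is reused at every (i, j): spatial weight sharing.
                patch = cube[i:i + k, j:j + k, lo:lo + depth].reshape(-1)
                out[s, i, j] = bank @ patch
    return out

rng = np.random.default_rng(0)
cube = rng.random((6, 6, 12))
# 3 spectral positions, 4 filters each, 3x3 spatial field covering 6 bands.
filters = rng.standard_normal((3, 4, 3, 3, 6))
response = spectral_window_conv(cube, filters, spectral_stride=3)
```

The explicit loops keep the weight-sharing pattern visible; a practical implementation would express the same computation with the vectorized convolution operations of a deep learning framework.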

To implement the described neural network we chose the TensorFlow framework, because it has a clear API and is flexible in transforming multidimensional data sets (Abadi et al., 2015). TensorFlow already includes an implementation of convolutional neural networks. This implementation is highly configurable, which makes it possible to implement the described differentiation of data streams by spectral channel. The framework can also use a graphics processing unit (GPU), which significantly reduces the time needed to train the network on hyperspectral images of the city's transport infrastructure.

4 CONCLUSIONS AND FUTURE RESEARCH

In this article we have reviewed the main problems that arise in recognizing hyperspectral images of cities and detecting transport infrastructure objects in them. A new method for classifying hyperspectral images is proposed. It is based on a deep convolutional neural network that differs from existing ones in that the receptive field moves in three dimensions: the two spatial dimensions and the spectral channel dimension. This approach is intended to make it easier to recognize transport infrastructure objects in dense urban areas.

Further research involves carrying out a large number of experiments with hyperspectral images of cities. It is necessary to compare the results of object recognition in images taken by different satellites operating in different spectral ranges and with different numbers of spectral channels. It is also necessary to investigate the use of artificial neural networks at the stage of cleaning and pre-processing raw hyperspectral images.

Subsequently, the object recognition quality of the developed method must be compared with that of existing methods (for example, Support Vector Machine, Spectral Angle Mapper, Maximum Likelihood, Mahalanobis Distance, etc.). The methods should be compared on several criteria, the most important of which are accuracy (the probability of correctly determining the class) and the receiver operating characteristic curve (the relationship between the probability of a true positive outcome and the probability of a false positive outcome). In addition to these quality characteristics, performance, scalability and the ability to process information in concurrent threads should also be compared.
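The two quality criteria can be computed as in the following sketch for a two-class case (for example, road versus non-road pixels). The labels, classifier scores and thresholds are made-up values used only to illustrate the definitions.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, true positive rate and false positive rate for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return accuracy, tpr, fpr

def roc_points(y_true, scores, thresholds):
    """(FPR, TPR) pairs obtained by thresholding classifier scores."""
    points = []
    for th in thresholds:
        y_pred = [1 if s >= th else 0 for s in scores]
        _, tpr, fpr = binary_metrics(y_true, y_pred)
        points.append((fpr, tpr))
    return points

# Hypothetical ground truth (1 = road pixel) and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
accuracy, tpr, fpr = binary_metrics(y_true, [1 if s >= 0.5 else 0 for s in scores])
curve = roc_points(y_true, scores, thresholds=[0.0, 0.5, 1.0])
```

Sweeping the threshold over more values traces the full ROC curve, so two methods can be compared across all operating points rather than at a single threshold.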

Further work is also needed to improve the convolutional neural network that classifies transport infrastructure objects. We intend to use recent developments in this area: spatial factorization, label smoothing and asynchronous stochastic gradient descent. Performance and recognition quality must be increased to allow wide application of the method in transport geographic systems.

Modern intelligent transport systems involve the use of unmanned aerial vehicles. Today the payload of such vehicles includes a wide range of sensors, including hyperspectral cameras. The data received from the sensors require semantic interpretation. The approach to processing hyperspectral data proposed in this paper, which is focused on effective recognition of transport infrastructure, may be used as part of a spatial data processing complex within a modern intelligent transport system.

ACKNOWLEDGEMENTS

This work was supported by the Ministry of

Education and Science of the Russian Federation.

REFERENCES

Abadi, M., A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org.

Cavalli, R., Fusilli, L., Pascucci, S., Pignatti, S. and Santini, F. (2008). Hyperspectral Sensor Data Capability for Retrieving Complex Urban Land Cover in Comparison with Multispectral Data: Venice City Case Study (Italy), Sensors, 8(5), pp. 3299-3320.

Chandra, A.M. (2008). Remote Sensing and Geographical Information Systems, Technosphera, Moscow.

Chang, C.I. (2000). An Information-Theoretic Approach to Spectral Variability, Similarity, and Discrimination for Hyperspectral Image Analysis. IEEE Transactions on Information Theory, 46(5), pp. 1927-1932.

Gomez, R.B. (2002). Hyperspectral imaging: a useful

technology for transportation analysis. Optical

Engineering, 41(9), pp. 2137-2143.

Gorban, A., Kegl, B., Wunsch, D. and Zinovyev, A. (2008). Principal Manifolds for Data Visualisation and Dimension Reduction, Lecture Notes in Computational Science and Engineering, Springer, Berlin – Heidelberg – New York.

Gualtieri, J.A. and Cromp, R.F. (1999). Support vector machines for hyperspectral remote sensing classification. Available at: http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19990021532.pdf (accessed 5 January 2016).

Herold, M., Roberts, D., Smadi, O. and Noronha, V. (2004a). Road condition mapping with hyperspectral remote sensing. Available at: http://www.geogr.uni-jena.de/~c5hema/urbanspec/av04_roadmapping_heroldetal.pdf (accessed 5 January 2016).

Herold, M., Gardner, M., Noronha, V. and Roberts, V. (2004b). Spectrometry and hyperspectral remote sensing of urban road infrastructure. Available at: http://www.eo.uni-jena.de/~c5hema/pub/rse04_heroldetal.pdf (accessed 5 January 2016).

Hu, W., Huang, Y., Wei, L., Zhang, F., and Li, H. (2015).

Deep Convolutional Neural Networks for

Hyperspectral Image Classification. Journal of

Sensors, vol. 2015, Article ID 258619, 12 pages.

Keshava, N. (2003). Survey of Spectral Unmixing

Algorithms. Lincoln Laboratory Journal, 14(1), pp.

55-78.

Krizhevsky, A., Sutskever, I. and Hinton, G.E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems 25. Curran Associates, Inc., pp. 1097-1105. Available at: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf.

Kukharenko, B.G. (2013). Algorithms of analysis of

hyperspectral images components, Supplement of

Journal “Information technologies”, No. 6, - 32 p.

Mei, A., Salvatori, R., Fiore, N., Allegrini, A. and D’Andrea, A. (2014). Integration of field and laboratory spectral data with multi-resolution remote sensed imagery for asphalt surface differentiation. Remote Sensing, Vol. 6, pp. 2765–2781.

Mikheeva, T.I. and Fedoseev, A.A. (2014). Clusterization of Hyperspectral Data of Transport Infrastructure Objects Monitoring. Reporter of Samara Scientific Center of Russian Academy of Sciences, Vol. 16 No. 4 (2), pp. 435-442.

Miraliakbari, A. and Hahn, M. (2014). Development of multi-sensor system for road condition mapping. The International archives of the photogrammetry, remote sensing and spatial information, Vol. XL-1, pp. 265–272.

Resende, M., Bernucci, L. and Quintanilha, J. (2014).

Monitoring the condition of roads pavement surfaces:

proposal of methodology using hyperspectral images,

Journal of Transport Literature, 8(2), pp. 201–220.

Ratle, F., Camps-Valls, G. and Weston, J. (2010).

Semisupervised neural networks for efficient

hyperspectral image classification, IEEE Transactions

on Geoscience and Remote Sensing, 48(5), pp. 2271–

2282.

Robila, S. (2005). Investigation of Spectral Screening Techniques for Independent Component Analysis Based Hyperspectral Image Processing. Available at: http://www.cs.uno.edu/~stefan/ (accessed 5 January 2016).

Rodarmel, C. and Shan, J. (2002). Principal component

analysis for hyperspectral image classification.

Surveying and Land Information Science, 62(2), pp.

115-122.

Saprykin, O. and Saprykina, O. (2015). Multilevel

Modelling of Urban Transport Infrastructure. In

Proceedings of the 1st International Conference on

Vehicle Technology and Intelligent Transport Systems

(VEHITS-2015), Portugal, Lisbon: SCITEPRESS, pp.

78-82.

Schott, J. (2007). Remote sensing: the image chain

approach, 2nd ed., Oxford University Press, USA.

Schowengerdt, R.A. (2010). Remote Sensing: Methods

and Models for Image Processing. Technosphera,

Moscow.

Simonyan, K. and Zisserman, A. (2014). Very Deep

Convolutional Networks for Large-Scale Image

Recognition. CoRR. abs/1409.1556.

Yuanliu, X., Runsheng, W. and Shengwen, L. (2007).

Atmospheric correction of hyperspectral data using

MODTRAN model. Proceedings of 16th National

Symposium on Remote Sensing of China, 7 pages.

Zhuravel, J.N. and Fedoseev, A.A. (2013). Specificity of

Hyperspectral Remote Sensing Data Processing for the

Tasks of Environment Monitoring, Computer Optics,

Vol. 37 No. 4. - pp. 471-476.

Wei, J., Zhou, G. and Zheng, Z. (2009). Survey and analysis of land satellite remote sensing applied in highway transportations infrastructure engineering. Available at: http://www.asprs.org/a/publications/proceedings/baltimore09/0102.pdf (accessed 5 January 2016).
