APPLICATION OF WALSH TRANSFORM BASED METHOD ON TRACHEAL BREATH SOUND SIGNAL SEGMENTATION

Jin Feng, Farook Sattar

School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore

Moe Pwint

Dept. of Information Science, The University of Computer Studies, Yangon, Myanmar

Keywords:

Segmentation, Walsh Transform, Refinement Scheme, Inspiratory/Expiratory Phase, Various Types of Tracheal Breath Sounds, End-Inspiratory/Expiratory Pause.

Abstract:

This paper proposes a robust segmentation method for differentiating consecutive inspiratory/expiratory episodes of different types of tracheal breath sounds. The original input respiratory sound signal is first transformed using a minimal set of Walsh basis functions. A decision module is then applied to divide the transformed signal into respiration segments and gap segments. The segmentation results are improved by a refinement scheme that uses a new evaluation algorithm based on segment duration. Experiments carried out on various types of tracheal breath sounds show the robustness and effectiveness of the proposed segmentation method.

1 INTRODUCTION

Accurate estimation of the respiratory rate is very important for the early detection of diverse illnesses (Sierra et al., 2005). Many adventitious lung sounds, which indicate infectious and respiratory diseases, can be clinically characterized by their duration within the respiratory cycle and their relationship to the phase of respiration (Meslier et al., 1995). Therefore, segmenting the respiratory sound into individual respiratory cycles, and further subdividing each cycle into its inspiratory and expiratory phases, is necessary for quantifying adventitious sounds.

Respiratory sound analysis generally uses phonopneumography, or a spirometer together with sound recording devices, in which the amplitude of the sound signal is displayed simultaneously with the airflow as a function of time. The signal can then be segmented into consecutive inspiratory phase, end-inspiratory pause, expiratory phase, and end-expiratory pause according to the provided Forced Expiratory Volume (FEV) readings (Taplidou and Hadjileontiadis, 2007; Cortés et al., 2005). However, a spirometric test can be difficult to carry out for patients with severe tracheal obstruction (Cortés et al., 2005).

Acoustical flow estimation was one of the first attempts to relate respiratory sounds to flow. In (Hossain and Moussavi, 2002) and (Golabbakhsh, 2004), airflow was estimated from respiratory sounds using different models, and an exponential model relating flow to averaged sound power was found to give the highest estimation accuracy. Calculating the model coefficients in these methods requires samples of breath sound with known flow, but such a calibration process is not always possible. Therefore, a modified entropy-based linear model describing the relationship between flow and tracheal sound was derived in (Yadollahi and Moussavi, 2006) without prior knowledge of the flow. Other segmentation methods using spectral and temporal analysis of transformed respiratory sounds have been developed in (Hult et al., 2000) and (Sierra et al., 2004). As this research is still at a preliminary stage, segmentation is restricted to normal tracheal breath sounds, and for other types of tracheal breath sounds the accuracy depends mainly on the signal-to-noise ratio (SNR).

In this paper, an automatic and robust respiratory sound signal segmentation method is developed. The proposed method modifies the input sound signal using a modified analysis and synthesis scheme based on Walsh basis functions. Without the aid of any other features, a decision module then segments the modified signal by adaptive thresholding. Finally, the preliminary segmentation result is optimized by a refinement scheme based on segment duration. This scheme ensures that the segmentation performs equally accurately irrespective of flow and of the type of tracheal breath sound. The proposed method is shown to be effective for normal tracheal breath sounds as well as for adventitious respiratory sounds such as wheeze and stridor.

Feng J., Sattar F. and Pwint M. (2008). APPLICATION OF WALSH TRANSFORM BASED METHOD ON TRACHEAL BREATH SOUND SIGNAL SEGEMENTATION. In Proceedings of the First International Conference on Bio-inspired Systems and Signal Processing, pages 116-121. DOI: 10.5220/0001057501160121. SciTePress.

2 BACKGROUND

The Walsh transform is defined by a matrix whose rows form a complete orthogonal function set taking only the two values +1 and -1 over their definition intervals (Beauchamp, 1984). The motivation for using the Walsh transform rather than other transforms is its computational simplicity, which gives a realistic processing time. The Walsh function of order N can be represented as

$$g(x, u) = \prod_{i=0}^{q-1} (-1)^{b_i(x)\, b_{q-1-i}(u)} \qquad (1)$$

where $u = 0, 1, \ldots, N-1$, $N = 2^q$, and $b_i(x)$ is the $i$-th bit value of $x$. In this context, the Walsh functions are arranged in sequency order, that is, by the number of zero crossings of each Walsh function per definition interval, to obtain a set of basis functions whose number of zero crossings increases with the order of the basis function:

$$W = [\phi_0, \phi_1, \cdots, \phi_{N-1}]$$
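As an illustration of the construction described above, the following minimal Python sketch evaluates the product form of Eq. (1) for every pair (u, x) and then re-sorts the rows by their number of sign changes. The function name `walsh_matrix` and the use of NumPy are assumptions of this sketch, not part of the paper; the explicit sort is used because the product form alone does not necessarily return the rows in zero-crossing order.

```python
import numpy as np

def walsh_matrix(q):
    """Return an N x N matrix (N = 2**q) of Walsh functions, rows sorted by zero crossings."""
    N = 1 << q
    W = np.ones((N, N), dtype=int)
    for u in range(N):
        for x in range(N):
            # Exponent of (-1): sum over i of b_i(x) * b_{q-1-i}(u)
            exponent = sum(((x >> i) & 1) * ((u >> (q - 1 - i)) & 1) for i in range(q))
            W[u, x] = -1 if exponent % 2 else 1
    # Count sign changes along each row and order the rows accordingly
    crossings = np.count_nonzero(np.diff(W, axis=1), axis=1)
    return W[np.argsort(crossings, kind="stable")]

W = walsh_matrix(3)   # row i plays the role of the basis function phi_i, with N = 8
```

Each row takes only the values +1 and -1, and distinct rows are orthogonal, which is the property the segmentation method relies on.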

3 PROPOSED SEGMENTATION METHOD FOR RESPIRATORY SOUND SIGNAL

The proposed approach segments respiratory sounds using Walsh functions. It is based on the reconstruction/modification of the analyzed signal by an efficient linear combination of Walsh functions. A simple decision scheme, based on the statistics of the modified/reconstructed signal, then segments the recorded respiratory sound signals. The details of this minimal Walsh function based segmentation method are presented here.

3.1 Modification of Signal

The modification of the input signal consists of two stages: sinusoidal signal analysis (Arfib et al., 2002), followed by our signal reconstruction scheme using minimal Walsh functions.

3.1.1 Signal Analysis

The input signal $x(n)$ is multiplied by a Hann window to yield successive windowed segments $x_s(n)$. These windowed segments are mapped into the spectral domain using FFTs. In this way, a time-varying spectrum $X_s(n, k) = |X_s(n, k)|\, e^{j\varphi(n, k)}$, with $n = 0, 1, \ldots, N-1$ and $k = 0, 1, \ldots, N-1$, is obtained for each windowed segment. Here, $X_s(n, k)$ denotes the spectral component of the input signal at frequency index $k$ and time index $n$, while $|X_s(n, k)|$ and $\varphi(n, k)$ denote the time-varying magnitude and phase responses, respectively.
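The analysis stage can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it assumes non-overlapping frames of length N (suggested by the shift sN in Eq. (3)) and computes one spectrum per frame, so the within-segment time index n is not modelled. The function name `analyse` and the frame length used in the example are hypothetical.

```python
import numpy as np

def analyse(x, N):
    """Hann-window non-overlapping length-N frames of x and return FFT magnitude and phase."""
    x = np.asarray(x, dtype=float)
    S = len(x) // N                          # number of complete frames
    frames = x[:S * N].reshape(S, N)         # one row per windowed segment
    X = np.fft.fft(frames * np.hanning(N), axis=1)
    return np.abs(X), np.angle(X)            # magnitude and phase, each of shape (S, N)

# Example on a synthetic tone sampled at 8000 Hz (frame length chosen for illustration)
t = np.arange(8000) / 8000.0
mag, phase = analyse(np.sin(2 * np.pi * 200 * t), N=64)
```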

3.1.2 Modified Signal Synthesis

The recorded input respiratory signal is reconstructed as a modified sequence using our modified analysis/synthesis approach. Prior to synthesis, each $s$-th windowed segment is modified as a weighted sum of the magnitudes $|X_s(n, k)|$, with binary Walsh basis functions as weights. Using basis functions reduces the number of parameters required to track the variations of the inspiration and expiration phases of the noisy signal. For this reason, SVD (Singular Value Decomposition) is used to determine the minimal number of Walsh basis functions to be applied. The detailed procedure for identifying the minimal number of Walsh basis functions, and the new modified basis function built from the selected basis functions, are described in the following section. Applying the $i$-th basis function $\phi_i$, a modified sequence $y_s(n)$ for each windowed segment is obtained as

$$y_s(n) = \sum_{k=0}^{N-1} |X_s(n, k)| \cdot \phi_i(k) \qquad (2)$$

All the modified segments are finally concatenated to generate an output signal $y(n)$ that carries the time-varying magnitude responses:

$$y(n) = \sum_{s=0}^{S-1} y_s(n - sN) \qquad (3)$$
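A minimal sketch of Eq. (2) and Eq. (3), assuming the magnitudes are already available as an array `mag` of shape (S, N, N) with `mag[s, n, k]` holding the magnitude of segment s at time index n and frequency index k. The array layout and the function name `modified_sequence` are assumptions made for illustration only.

```python
import numpy as np

def modified_sequence(mag, phi):
    """Weight the magnitudes by a basis function (Eq. 2) and concatenate the segments (Eq. 3)."""
    mag = np.asarray(mag, dtype=float)
    phi = np.asarray(phi, dtype=float)
    y_s = mag @ phi            # sum over k for every segment s and time index n, shape (S, N)
    return y_s.reshape(-1)     # segment s occupies samples sN .. sN + N - 1

# Example with S = 2 segments, N = 4, and a +1/-1 weight vector
rng = np.random.default_rng(0)
mag = rng.random((2, 4, 4))
phi = np.array([1, 1, -1, -1])
y = modified_sequence(mag, phi)
```

Because each segment contributes exactly N samples, the sum over shifted segments in Eq. (3) reduces to a plain concatenation here.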

3.1.3 Selection of Minimal Walsh Functions for Modified Synthesis

It is very important to select appropriate basis functions so that variations between the dynamics of the two phases can be captured more precisely. A method used to select the global natural scale in the discrete wavelet domain (Quddus and Gabbouj, 2002) is adopted to determine the minimal number of basis functions. This method adaptively selects the optimal scale using SVD while the decomposition is being carried out. Consider a noisy input respiratory signal $x$ of length $V$, and let $y_d(\nu)$ be its modified sequence obtained by applying the basis function of order $d$ in Eq. (2) and Eq. (3). The modified sequences $\{y_d(\nu)\}_{d=0}^{D-1}$ can be represented as a matrix of size $D \times V$. To determine the order of basis functions with dominant eigenvalues, the SVD of the $D \times V$ matrix is calculated adaptively, beginning with the first two orders (i.e. $\phi_0$ and $\phi_1$) and then adding Walsh functions of higher orders.
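The adaptive SVD step can be illustrated with the sketch below. The text does not state the exact criterion for a "dominant" eigenvalue, so the stopping rule used here (stop when the smallest singular value falls below a fraction `tol` of the largest) is purely an assumption, as are the function name `dominant_order` and the default tolerance.

```python
import numpy as np

def dominant_order(Y, tol=0.1):
    """Y is the D x V matrix whose row d is the modified sequence of order d.

    Rows are added one at a time, starting from the first two orders.
    The count is returned at the point where adding a further row no longer
    contributes a singular value of at least tol times the largest one
    (an assumed criterion, not taken from the paper).
    """
    Y = np.asarray(Y, dtype=float)
    kept = 1
    for d in range(2, Y.shape[0] + 1):
        s = np.linalg.svd(Y[:d], compute_uv=False)   # singular values, descending
        if s[-1] < tol * s[0]:
            break
        kept = d
    return kept
```

With a rule of this kind, rows that are nearly linear combinations of lower-order rows do not increase the count, which is the intuition behind keeping only a small number of basis functions.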

Here, the proposed algorithm sets the minimal order of basis functions $N_{\min}$ to 3 throughout the simulations, a choice that was found to be very robust in various situations. In the original algorithm (Quddus and Gabbouj, 2002), the optimal scale is defined as the average of the details from the first level to the natural scale, the level associated with the dominant eigenvalue. However, this averaging may introduce a clipping effect for signals at low signal level. To avoid this effect, a shifting operator, which swaps the right and left halves of the basis function coefficients, is applied first. A good estimate of a modified binary Walsh basis function within the dominant eigenvalues is then defined as

$$\phi_m = \frac{\phi_0 - \sum_{i=1}^{N_{\min}} CS(\phi_i)}{\max\left\{\left|\phi_0 - \sum_{i=1}^{N_{\min}} CS(\phi_i)\right|\right\}} \qquad (4)$$

where $N_{\min} = 3$ is the largest order referring to the most prominent eigenvalues and $CS(\cdot)$ is the shifting operator. This new basis function $\phi_m$ provides a sharper representation and more discriminating features.
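A small Python sketch of the modified basis function of Eq. (4) is given below, under the following assumptions: the Walsh functions are generated from a Sylvester-type Hadamard matrix and sorted by zero crossings, the shifting operator CS is implemented as a circular shift by N/2 (which swaps the two halves), and the 0-order function is the term from which the shifted sum is subtracted. The helper names `walsh_sequency` and `modified_basis` are hypothetical.

```python
import numpy as np

def walsh_sequency(N):
    """Walsh functions of length N (a power of two), rows sorted by zero crossings."""
    H = np.array([[1]])
    while H.shape[0] < N:
        H = np.block([[H, H], [H, -H]])      # Sylvester construction
    crossings = np.count_nonzero(np.diff(H, axis=1), axis=1)
    return H[np.argsort(crossings)]

def modified_basis(W, n_min=3):
    """Combine phi_0 with the half-swapped phi_1 .. phi_{n_min} and normalise to unit peak."""
    N = W.shape[1]
    shifted = np.roll(W[1:n_min + 1], N // 2, axis=1)   # CS: swap left and right halves
    combo = W[0] - shifted.sum(axis=0)
    return combo / np.max(np.abs(combo))

phi_m = modified_basis(walsh_sequency(8))
```

The normalisation by the maximum absolute value keeps the new basis function within [-1, 1], so it can be used in Eq. (2) in the same way as an ordinary Walsh function.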

3.2 Decision Strategy

3.2.1 Preliminary Decision Module

First, the 0-order basis function $\phi_0$ is used to produce a modified sequence $y_0(\nu)$ that captures the global information of the original sample signals. This modified sequence serves as a reference or pilot sequence, as used in telecommunications. A second modified signal, $y_m(\nu)$, which contains the local characteristics, is formed using the new basis function $\phi_m$. From this new sequence, the locations and durations of the inspiration and expiration phases can be determined more precisely, even for adventitious respiratory sounds such as wheeze and stridor. In this way, approximate locations of the inspiration and expiration segments are first determined from the modified signal $y_0(\nu)$. The respiratory phase estimates are then improved by using the second modified signal $y_m(\nu)$, which contains the detailed information. Using the reconstructed signals $y_0$ and $y_m$, the detection scheme proceeds as follows:

• Extract two sequences of local minima, $\{\alpha_{0,i}\}_{i=1}^{L}$ and $\{\alpha_{m,i}\}_{i=1}^{L}$, where $L$ is the number of frames, from every 4 ms frame of $y_0(\nu)$ and $y_m(\nu)$.

• Set a threshold for each minima sequence, $\tau_0$ and $\tau_m$, using simple statistics: $\tau_0 = \mu_0 - \kappa\delta_0$ and $\tau_m = \mu_m - \kappa\delta_m$, where $\mu_0$ and $\delta_0$ are the mean and the standard deviation of the first set of local minima, $\mu_m$ and $\delta_m$ are those of the second set, and $\kappa$ is a positive value that depends on the dynamic range of the modified sequence $y_0(v)$.

• Set the threshold coefficient $\kappa$, which is the same for $\tau_0$ and $\tau_m$. As shown by Eq. (5), $\kappa$ is proportional to the global average of $y_0(v)$, and $a$ is a constant. After experimenting with 10 reconstructed waveforms of different respiration types (stridor and wheeze, normal tracheal breath for adult and infant), $a$ was found to be 3.4, and this value is used for all types of tracheal breath sounds.

$$\kappa = a \times \frac{1}{N} \sum_{v=0}^{N-1} y_0(v) \qquad (5)$$

• For each sequence, declare frame $i$ a respiration frame if its local minimum lies below the corresponding threshold, i.e. $\alpha_{0,i} < \tau_0$ for $y_0(\nu)$ or $\alpha_{m,i} < \tau_m$ for $y_m(\nu)$. As mentioned earlier, the respiratory cycle is divided into four consecutive phases: inspiratory phase, end-inspiratory pause, expiratory phase, and end-expiratory pause. Respiration frames are defined in this context as the frames belonging to either the inspiratory or the expiratory phase. In this way, the respiration frame indices obtained from $y_0(\nu)$ and $y_m(\nu)$ form the sets $R$ and $T$:

$$R = \{r_1, r_2, \ldots, r_{|R|}\} \qquad (6)$$

$$T = \{t_1, \ldots, t_{|T|}\} \qquad (7)$$

• Combine the two initial boundary decisions as follows:

$$C = R \cap T \qquad (8)$$

where $C = \{c_1, c_2, \ldots, c_{|C|}\}$ is the set of elements common to $R$ and $T$. Since the members of $C$ are the indices of either inspiration or expiration frames, this gives the final decision on the respiration frames.

In the above, a respiration frame is declared whenever a prominent local minimum obtained from the first modified signal $y_0(\nu)$ coincides with a local minimum found in the second modified signal $y_m(\nu)$. Detected frames whose local minima are not obtained from both modified sequences $y_0(\nu)$ and $y_m(\nu)$ are discarded as outliers.
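The detection steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the local minimum of a frame is taken to be the smallest sample in that frame, the threshold coefficient is computed from the mean of the reference (0-order) sequence, and the sampling rate of 8000 Hz, the constant 3.4 and the 4 ms frame length are used as defaults. The names `frame_minima` and `preliminary_decision` are hypothetical.

```python
import numpy as np

def frame_minima(y, frame_len):
    """Minimum of each complete, non-overlapping frame of y."""
    y = np.asarray(y, dtype=float)
    L = len(y) // frame_len
    return y[:L * frame_len].reshape(L, frame_len).min(axis=1)

def preliminary_decision(y_ref, y_detail, fs=8000, frame_ms=4, a=3.4):
    """Return frame index sets R, T and their intersection C."""
    frame_len = int(fs * frame_ms / 1000)
    alpha_ref = frame_minima(y_ref, frame_len)
    alpha_det = frame_minima(y_detail, frame_len)
    kappa = a * np.mean(y_ref)                              # shared threshold coefficient
    tau_ref = alpha_ref.mean() - kappa * alpha_ref.std()    # mean minus kappa times std
    tau_det = alpha_det.mean() - kappa * alpha_det.std()
    R = np.flatnonzero(alpha_ref < tau_ref)                 # frames flagged by the reference sequence
    T = np.flatnonzero(alpha_det < tau_det)                 # frames flagged by the detail sequence
    C = np.intersect1d(R, T)                                # frames flagged by both
    return R, T, C
```

Only frames present in both index sets survive, so a spurious detection in one sequence alone is dropped, which matches the outlier rule stated above.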

3.2.2 Refinement Scheme

Because of the quasi-stationary nature of adventitious respiratory sounds, and their relatively small dynamic range due to shallow breathing, frames may be wrongly identified owing to the inflexibility of the global threshold: small spikes occurring during end-inspiratory/expiratory pauses may be wrongly identified as respiration segments (denoted by peaks), and small fluctuations during inspiration/expiration may be wrongly identified as pause segments (denoted by troughs), as indicated in Fig. 1(c). To ensure the accuracy of the segmentation, the results of the preliminary decision module are fine-tuned by the refinement scheme to avoid wrong identification of the respiratory frames. The scheme consists of two stages:

• Identify error segments, i.e. segments with durations shorter than a threshold $\sigma$, where $\sigma$ varies for patients with different respiratory rates. Since the duration of end-inspiratory/expiratory pauses ranges from 0% to 30%, and the inspiration time from 10% to 80%, of a complete breath cycle (Li, 2004), an error segment is defined as one with a duration of less than 5% of the individual's average breath cycle. Therefore $\sigma$ is defined as:

$$\sigma = 5\% \times \frac{60}{RR} \times F_s \qquad (9)$$

where $RR$, the respiration rate, is the number of breath cycles per minute and $F_s$ is the sampling rate of the signal. Since the average $RR$ is highest for infants, at 44 breaths/min (Keszler and Abubakar, 2004), the scheme adopts this value to minimize wrong identification. The selected parameter values are listed in Table 1. The error segments are then divided into error respiration segments and error pause segments, and the number of segments of each error type is counted.

Table 1: Values of parameters for refinement scheme.

Parameter    Value
F_s          8000 Hz
RR           44 breaths/min
σ            545 samples

• Evaluate the error segments based on segment duration. This process is first applied to the error respiration segments. The procedure is described by the following pseudo code, where a respiration segment is denoted by R(s), a pause segment by P(s), and s is the positional index of the segment along the time line.

Begin
  T = threshold;
  Pd(s) = duration of P(s);
  Rd(s) = duration of R(s);
  I = number of error R(s);
  for i=1:I,
    locate first error R(s);
    if Pd(s-1) < T and Pd(s) < T
      if Pd(s) > Pd(s-1)
        R(s) combine with R(s-1);
      else
        R(s) combine with R(s+1);
      end
    else if Pd(s-1) < T
      R(s) combine with R(s-1);
    else if Pd(s) < T
      R(s) combine with R(s+1);
    else
      R(s) is considered as pause segment;
    end
  end
End.

This procedure is then applied a second time to evaluate the error pause segments, by interchanging R(s) with P(s) in the pseudo code.
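The two refinement stages can be sketched in Python as below. The sketch assumes the preliminary segmentation is available as a non-empty 0/1 array (1 for respiration, 0 for pause) and loops until no short segment of the target type remains, which is a slight variation on the fixed count I in the pseudo code. A short segment is merged with its neighbour across the shorter adjacent gap when that gap is itself below the threshold, and is relabelled otherwise. All function names are hypothetical.

```python
import numpy as np

def error_threshold(rr=44, fs=8000, fraction=0.05):
    """Duration threshold in samples: 5% of one breath cycle at the given rate."""
    return int(fraction * 60.0 / rr * fs)

def runs(mask):
    """(start, end) index pairs of the constant runs in a 0/1 array."""
    edges = np.flatnonzero(np.diff(mask)) + 1
    bounds = np.concatenate(([0], edges, [len(mask)]))
    return list(zip(bounds[:-1], bounds[1:]))

def refine_pass(mask, target, T):
    """Remove runs of value `target` that are shorter than T samples."""
    mask = np.array(mask, dtype=int)
    while True:
        segs = runs(mask)
        short = [j for j, (b, e) in enumerate(segs)
                 if mask[b] == target and e - b < T]
        if not short:
            return mask
        j = short[0]                                   # first error segment
        left = segs[j - 1] if j > 0 else None
        right = segs[j + 1] if j + 1 < len(segs) else None
        d_left = left[1] - left[0] if left else np.inf
        d_right = right[1] - right[0] if right else np.inf
        if d_left < T or d_right < T:
            # Merge with the neighbour of the same type across the shorter gap
            b, e = left if d_left < d_right else right
            mask[b:e] = target
        else:
            # Both gaps are long: treat the short run as the opposite type
            b, e = segs[j]
            mask[b:e] = 1 - target

def refine(mask, T):
    """Evaluate error respiration segments first, then error pause segments."""
    return refine_pass(refine_pass(mask, 1, T), 0, T)

sigma = error_threshold()   # uses RR = 44 breaths/min and F_s = 8000 Hz from Table 1
```

Every iteration either fills a gap or relabels a run, so the number of runs shrinks and the loop terminates.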

4 EXPERIMENTAL RESULTS

4.1 Data and Parameter Selection

Five different types of tracheal sound signals are chosen from (Lehrer, 1993) and (Wilkins et al., 2004). Tracheal breath sound is chosen because of its relatively larger amplitude compared with sounds recorded over the chest. It also has distinct inspiratory/expiratory phases and is closely related to respiratory flow.

The segmentation algorithm has been tested on a total of 10 sound signals, each consisting of 8 breathing cycles. The four phases are distinct in every breathing cycle of all the chosen signals. Since the segmentation method works on the overall trend rather than the detailed fluctuations of the signals, the order $m$ of the reconstructed signal $y_m(v)$ should be kept low. Therefore, $m = 3$ is used in the experiments.

4.2 Illustrative Results and Analysis

Fig. 1 illustrates the outputs of the individual segmentation steps on a signal of inspiratory stridor and expiratory moderate wheeze. Fig. 1(a) shows the original signal containing wheeze and stridor, whereas Fig. 1(b) shows its transformed version, the reference modified sequence $y_0(v)$, together with the reference threshold $\tau_0$. Fig. 1(c) depicts the output of the preliminary decision module. As indicated by arrows A, B, C and D, the preliminary result contains error segments at 4 locations. After optimization by the refinement scheme, the final segmentation result is displayed in Fig. 1(d).

The results for infant normal tracheal breath are shown in Fig. 2. Comparing the two figures, no error segments are detected in Fig. 2(c). This is due to the different nature of the signals: the quasi-stationary nature of wheeze and stridor signals gives them more prominent components at low frequency, while the fast transient nature of normal tracheal breath emphasizes the high frequency components. Since $y_0(v)$ focuses on the signal trend, which is represented by the low frequency components, it captures more spikes (low frequency details) for wheeze and stridor but provides smoother waveforms for normal breath sound signals. Therefore, after thresholding by $\tau_0$, segments of short duration are detected for abnormal breath sound signals. However, owing to the optimization by the refinement scheme, the final segmentation results are equally accurate for normal tracheal breath sounds and adventitious breath sounds.

Moreover, illustrative results of the segmentation algorithm for different types of respiratory sound signals are shown in Fig. 3(a)-(e). These results demonstrate the robustness of the proposed method on different types of tracheal breath sounds.

5 DISCUSSION

In this paper, we have presented an algorithm to locate and differentiate inspiratory/expiratory phases and end-inspiratory/expiratory pauses for different types of tracheal breath sounds. The use of the binary Walsh transform simplifies the proposed algorithm to a large extent and leaves only a few parameters for adjustment. This makes the algorithm fast and automatic even in the absence of any a priori information about the input signal type. It performs equally accurately for normal and adventitious sounds owing to the incorporation of the refined decision module. It is thus more robust than existing methods, for which accurate segmentation is still restricted to normal breath sounds.

As its only limitation, the proposed method does not perform well on raw recorded tracheal breath sound signals. This is due to the presence of a prominent heartbeat. Since the frequency range of the heartbeat is below 300 Hz, it interferes with the normal breath sounds and contaminates the signal with a large amount of low frequency components. This can be solved by recording at positions with a low heart sound to respiratory sound amplitude ratio, or by preprocessing with a notch filter to suppress the effect of the heartbeat. However, the algorithm is immune to other ambient noises because of the wide spectrum occupied by those noises.

[Figure 1 panels, plotted against samples: (a) Original Expiratory Moderate Wheeze; (b) 0-order Modified Sequence with Threshold; (c) Preliminary Segmentation Result; (d) Final Segmentation Result]

Figure 1: (a) Original signal waveform; (b) 0-order modified sequence $y_0(v)$ with threshold $\tau_0$; (c) preliminary segmentation result; (d) final segmentation result for inspiratory stridor and expiratory moderate wheeze.

[Figure 2 panels, plotted against samples: (a) Original Normal Infant Tracheal Breath; (b) 0-order Modified Sequence with Threshold; (c) Preliminary Segmentation Result; (d) Final Segmentation Result]

Figure 2: (a) Original signal waveform; (b) 0-order modified sequence $y_0(v)$ with threshold $\tau_0$; (c) preliminary segmentation result; (d) final segmentation result for infant normal tracheal breath.

[Figure 3 panels, plotted against samples: (a) Normal Adult Tracheal Breath; (b) Normal Infant Tracheal Breath; (c) Expiratory Mild Wheeze; (d) Inspiratory Stridor and Expiratory Moderate Wheeze; (e) Inspiratory Stridor and Expiratory Severe Wheeze]

Figure 3: The segmentation results displayed with the original signal waveform for (a)-(b) normal tracheal breath of adult/infant; (c) expiratory mild wheeze; (d)-(e) inspiratory stridor and expiratory moderate/severe wheeze.

REFERENCES

Arfib, D., Keiler, F., and Zölzer, U. (2002). DAFX - Digital Audio Effects. John Wiley Publisher.

Beauchamp, K. G. (1984). Applications of Walsh and Related Functions. Academic Press.

Cortés, S., Jané, R., Fiz, J. A., and Morera, J. (Sept, 2005). Monitoring of wheeze duration during spontaneous respiration in asthmatic patients. IEEE Proceedings of Engineering in Medicine and Biology, pages 6141–6144.

Golabbakhsh, M. (2004). Tracheal breath sound relationship with respiratory flow: Modeling, the effect of age and airflow estimation. M.Sc. thesis, Electrical and Computer Engineering Department, University of Manitoba.

Hossain, I. and Moussavi, Z. (2002). Respiratory airflow estimation by acoustical means. [Engineering in Medicine and Biology, 2002. 24th Annual Conference and the Annual Fall Meeting of the Biomedical Engineering Society] EMBS/BMES Conference, 2002. Proceedings of the Second Joint, 2.

Hult, P., Wranne, B., and Ask, P. (2000). A bioacoustic method for timing of the different phases of the breathing cycle and monitoring of breathing frequency. Medical Engineering and Physics, 22:425–433.

Keszler, M. and Abubakar, K. (2004). Volume guarantee: Stability of tidal volume and incidence of hypocarbia. Pediatric Pulmonology, 38(3):240–245.

Lehrer, S. (1993). Understanding lung sounds, Audio CD. Saunders.

Li, T. (2004). Invasive mechanical ventilation. Respiratory problems: Invasive mechanical trainee manual.

Meslier, N., Charbonneau, G., and Racineux, J. L. (1995). Wheezes. European Respiratory Journal, 8(11):1942–1948.

Quddus, A. and Gabbouj, M. (2002). Wavelet-based corner detection technique using optimal scale. Pattern Recognition Letters, 23:215–220.

Sierra, G., Telfort, V., Popov, B., Durand, L. G., Agarwal, R., and Lanzo, V. (2004). Monitoring respiratory rate based on tracheal sounds. First experiences. Annual International Conference of the IEEE Engineering in Medicine and Biology.

Sierra, G., Telfort, V., Popov, B., Pelletier, M., Despault, P., Agarwal, R., and Lanzo, V. (2005). Comparison of respiratory rate estimation based on tracheal sounds versus a capnograph. Annual International Conference of the IEEE Engineering in Medicine and Biology.

Taplidou, S. A. and Hadjileontiadis, L. J. (2007). Nonlinear analysis of wheezes using wavelet bicoherence. Computers in Biology and Medicine, 37:563–570.

Wilkins, R. L., Hodgkin, J. E., and Lopez, B. (2004). Fundamentals of lung and heart sounds, Audio CD. Mosby.

Yadollahi, A. and Moussavi, Z. (2006). A robust method for estimating respiratory flow using tracheal sounds entropy. IEEE Transactions on Biomedical Engineering, 53(4):662–668.
