Multi-channel lung sound classification with convolutional recurrent neural networks

https://doi.org/10.1016/j.compbiomed.2020.103831

Highlights

  • Multi-channel lung sound classification with Convolutional Recurrent Neural Networks.

  • Multi-channel lung sound database.

  • Comparison of different neural network architectures.

Abstract

In this paper, we present an approach for multi-channel lung sound classification, exploiting spectral, temporal and spatial information. In particular, we propose a frame-wise classification framework to process full breathing cycles of multi-channel lung sound recordings with a convolutional recurrent neural network. With our recently developed 16-channel lung sound recording device, we collect lung sound recordings from lung-healthy subjects and patients with idiopathic pulmonary fibrosis (IPF) within a clinical trial. From the lung sound recordings, we extract spectrogram features and compare different deep neural network architectures for binary classification, i.e. healthy vs. pathological. Our proposed classification framework with the convolutional recurrent neural network outperforms the other networks by achieving an F1-score of 92%. Together with our multi-channel lung sound recording device, we present a holistic approach to multi-channel lung sound analysis.

Introduction

Commercially available lung sound recording devices, i.e. electronic stethoscopes, are limited to single-channel auscultation/recording. In lung auscultation, abnormal/adventitious lung sounds are heard over affected areas; hence, several successive recording positions over the chest are needed to cover the whole organ. As an example, adventitious sounds caused by early-stage idiopathic pulmonary fibrosis (IPF) are fine (or velcro) crackles during inspiration, heard over basal areas of the lung [1], [2]. Simultaneous (multi-channel) recording at several positions provides additional information, especially in combination with computer-aided lung sound analysis, which increases the potential of acoustic lung diagnostics.

Several approaches to multi-channel lung sound classification exist. Different research groups record lung sounds independently with different recording setups, i.e. differing in design and in the number and position of sensors. A first approach to multi-channel lung sound analysis was the STG16 [3]. It enables 14-channel lung sound recording on the posterior chest, with two additional channels for the trachea and the heart. Algorithms enable the detection and localization of different adventitious sounds. Another multi-channel recording device with 14-channel lung sound recording on the posterior chest, however with a different sensor arrangement than the STG16 [3], is presented in [4]. The authors of [4] explore a useful methodology for three-class classification (healthy-obstructive-restrictive) in [5]. They model 14-channel pulmonary sound data using a second-order vector autoregressive (VAR) model, and feed the estimated model parameters to support vector machine (SVM) and Gaussian mixture model (GMM) classifiers. A 25-channel lung sound recording device is used in [6], with a 5 × 5 sensor array attached to the posterior chest. The authors assess different parameterization techniques for multi-channel lung sounds for two-class classification (normal versus abnormal), such as the power spectral density (PSD), the eigenvalues of the covariance matrix, the univariate autoregressive model (UAR), and the multivariate autoregressive model (MAR). These methods are applied to construct feature vectors used as input to a supervised multilayer neural network. Furthermore, the respiratory sound database from the ICBHI 2017 Challenge also partially consists of multi-channel lung sound recordings [7]. These multi-channel recordings were collected with either seven stethoscopes (3M Littmann Classic II SE) with a microphone in the main tube, or seven air-coupled electret microphones (C 417 PP, AKG Acoustics) housed in Teflon capsules. Whether processing is single- or multi-channel, three different levels of adventitious sound analysis can be performed [8]: detection and classification of adventitious sounds at the segment level (i.e. segments are generated by signal windowing, features are extracted, and classification is performed on random segments of adventitious and normal sounds), classification at the event level (of manually isolated events of adventitious and normal lung sounds), and event detection at the recording level. Within this work, we focus on the classification of isolated (multi-channel) lung sound recordings with one full breathing cycle each.
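As a rough illustration of the VAR-based parameterization used in [5], the following Python sketch fits a second-order vector autoregressive model to a multi-channel segment and flattens the estimated coefficient matrices into a feature vector; the sampling rate, segment shape, and the use of statsmodels are our assumptions rather than details taken from [5].

```python
import numpy as np
from statsmodels.tsa.api import VAR

def var_features(segment, order=2):
    """Fit a VAR(order) model to a (n_samples, n_channels) segment and
    return the stacked coefficient matrices as a feature vector."""
    result = VAR(segment).fit(maxlags=order, trend="n")  # no intercept term
    # result.coefs has shape (order, n_channels, n_channels)
    return result.coefs.ravel()

# Hypothetical usage: 14 channels, 1 s at 4 kHz (values are assumptions)
segment = np.random.randn(4000, 14)
features = var_features(segment)  # length 2 * 14 * 14 = 392
# 'features' could now be fed to an SVM or GMM classifier.
```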

Multilayer perceptrons (MLPs), also called feedforward neural networks (FNNs), are widely used for lung sound classification [6], [9], [10]. Although MLPs are very powerful, they do not model temporal context explicitly. To process sequential input of variable length and learn temporal dependencies within the data, recurrent neural networks (RNNs) are suitable architectures [11], [12]. They show state-of-the-art performance in several audio classification tasks, such as speech recognition [12], acoustic event detection and scene classification [13], and heart sound classification [14], and have already been introduced for event detection in, and classification of, lung sounds [15], [16], [17], [18]. Another powerful neural network architecture is the convolutional neural network (CNN) [19]. CNNs are widely applied to audio classification tasks, including lung sound classification [20], [21]. They can be used as feature extractors by applying them directly to raw audio waveforms [22], [23]. Another approach is to use them after feature extraction, e.g. to process spectrograms [24].

In this paper, we exploit spectral, temporal and spatial information for multi-channel lung sound classification. To this end, we present a multi-channel lung sound classification framework with convolutional recurrent neural networks (CRNNs) [24], [25], a combination of convolutional neural networks (CNNs) [19] and recurrent neural networks (RNNs) [11], [12]. For this purpose, we first conducted a clinical trial to record a lung sound database with our recently developed multi-channel lung sound recording device [26]. The device enables the recording of lung sounds at 16 positions over the posterior chest. The fixed pattern of the lung sound transducer arrangement of the recording front-end results in varying recording positions depending on the subject's physique. Our proposed classification framework is an approach to render exact recording positions dispensable. We evaluate the proposed method for the diagnosis of idiopathic pulmonary fibrosis in our experiments. The simplified overall processing framework is illustrated in Fig. 1.
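The exact spectrogram settings are not given in this excerpt; as a minimal sketch, the following code shows how per-channel log-magnitude spectrogram frames of one breathing cycle could be computed, where the sampling rate, window length, and hop size are assumed values.

```python
import numpy as np
from scipy.signal import stft

def multichannel_log_spectrogram(cycle, fs=16000, nperseg=512, noverlap=256):
    """Compute a log-magnitude spectrogram per channel of one breathing
    cycle (shape: n_channels x n_samples) and stack the results into a
    (n_channels, n_freq_bins, n_frames) array."""
    specs = []
    for channel in cycle:
        _, _, Z = stft(channel, fs=fs, nperseg=nperseg, noverlap=noverlap)
        specs.append(np.log(np.abs(Z) + 1e-10))  # offset avoids log(0)
    return np.stack(specs)

# Hypothetical usage: 16 channels, one 3-second breathing cycle
cycle = np.random.randn(16, 3 * 16000)
frames = multichannel_log_spectrogram(cycle)
print(frames.shape)  # (16, 257, n_frames)
```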

Our main contributions and results can be summarized as follows:

  • For the first time, we introduce a classification framework with CRNNs for multi-channel lung sound recordings. In particular, we present a specific network architecture, which conveniently allows us to exploit spectral, temporal, and spatial (i.e. multi-channel) information from multi-channel lung sounds.

  • We conducted a clinical trial to record a multi-channel lung sound database with healthy and pathological (i.e. idiopathic pulmonary fibrosis) subjects.

  • We present experimental results, where we compare different neural network architectures for classification.

  • Together with our multi-channel lung sound recording device, we present a holistic approach to multi-channel lung sound analysis/classification.

The paper is structured as follows: In Section 2, we discuss different neural network architectures, including MLPs, RNNs, and CNNs. In Section 3, we present our custom-built multi-channel lung sound recording device. In Section 4, we present our proposed multi-channel classification framework, the recorded multi-channel lung sound database, the experimental setup (including the evaluation metrics), and the experimental results. Finally, we discuss our findings in Section 5 and conclude the paper in Section 6.

Section snippets

Multilayer perceptron

Multilayer perceptrons (MLPs) [27] are the simplest type of artificial neural network. In an MLP, information flows forward through the network, i.e. the output of the model is not fed back into itself. A special kind of MLP is the CNN (see Section 2.3). Extensions of MLPs with feedback connections are RNNs (see Section 2.2).

Eqs. (1)–(2) describe the MLP mathematically:

$$\mathbf{h}_f^{l} = g\left(\mathbf{W}_x^{l}\,\mathbf{x}_f^{l} + \mathbf{b}_h^{l}\right) \tag{1}$$

$$\mathbf{y}_f = m\left(\mathbf{W}_y\,\mathbf{h}_f^{L-1} + \mathbf{b}_y\right) \tag{2}$$

The MLP consists of $L$ layers, with $l \in \{1, \ldots, L-1\}$ being the index of the hidden layers.
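As a minimal numerical illustration of Eqs. (1)–(2), the NumPy sketch below implements the forward pass; the choice of tanh for the hidden activation g, the logistic sigmoid for the output activation m, and all layer widths are illustrative assumptions.

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass following Eqs. (1)-(2): each hidden layer l computes
    h_f^l = g(W_x^l x_f^l + b_h^l), where x_f^l is the previous layer's
    output; the last layer applies y_f = m(W_y h_f^{L-1} + b_y)."""
    g = np.tanh                               # hidden activation g (assumed)
    m = lambda a: 1.0 / (1.0 + np.exp(-a))    # output activation m (assumed)
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):   # hidden layers l = 1..L-1
        h = g(W @ h + b)                          # Eq. (1)
    return m(weights[-1] @ h + biases[-1])        # output layer, Eq. (2)

# Hypothetical usage: one input frame, two hidden layers, binary output
rng = np.random.default_rng(0)
sizes = [257, 128, 64, 1]                     # layer widths (assumptions)
weights = [rng.standard_normal((o, i)) * 0.1
           for i, o in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(o) for o in sizes[1:]]
y = mlp_forward(rng.standard_normal(257), weights, biases)
```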

Multi-channel lung sound recording device

In an endeavor to bring automatic lung sound analysis a step closer to clinical practice, we started with the development of a lung sound recording device [26]. We identified several limitations of existing (multi-channel) lung sound recording hardware, resulting in the following aspects that mainly influenced our hardware design: ease of use, high signal quality, robustness against air- and body-borne noise, multi-channel recording, and airflow-awareness.

Multi-channel classification framework

The proposed classification framework processes multi-channel lung sound recordings of one breathing cycle each.
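The full architecture is not reproduced in this snippet. As a hedged sketch only, the following PyTorch code outlines one plausible realization of the convolutional recurrent network (the ConvBiGRNN referred to in the Discussion): a CNN front-end over the per-frame multi-channel spectrogram slices, a bidirectional GRU over time, and frame-wise binary outputs; all layer sizes, kernel widths, and pooling factors are our assumptions.

```python
import torch
import torch.nn as nn

class ConvBiGRNN(nn.Module):
    """Sketch of a convolutional recurrent network: a CNN front-end
    treats the 16 recording channels as input feature maps of each
    spectrogram frame, a bidirectional GRU models temporal context
    across the breathing cycle, and a linear layer produces frame-wise
    scores for healthy vs. pathological."""

    def __init__(self, n_channels=16, n_freq=257, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(                 # convolution over frequency
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(4),
        )
        self.rnn = nn.GRU(32 * (n_freq // 16), hidden,
                          batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 1)

    def forward(self, x):
        # x: (batch, time, channels, freq), one spectrogram slice per frame
        b, t, c, f = x.shape
        z = self.cnn(x.reshape(b * t, c, f))      # per-frame CNN features
        h, _ = self.rnn(z.reshape(b, t, -1))      # temporal context
        return self.out(h).squeeze(-1)            # frame-wise logits

# Hypothetical usage: a batch of 4 breathing cycles with 100 frames each
logits = ConvBiGRNN()(torch.randn(4, 100, 16, 257))  # -> shape (4, 100)
```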

Discussion

In our experiments, we compare different neural network architectures for multi-channel lung sound classification. Firstly, we determine a suitable network size for each architecture using grid search. We compare the architectures of the MLP, the BiGRNN, and the ConvBiGRNN, with the latter outperforming the rest.
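The grid search itself is not detailed in this excerpt; the generic pattern, with hypothetical candidate values and a dummy scoring function standing in for model training and validation, looks as follows.

```python
from itertools import product

# Hypothetical size grid; the values actually searched are not given here.
grid = {"hidden_units": [32, 64, 128], "num_layers": [1, 2, 3]}

def validation_f_score(hidden_units, num_layers):
    """Stand-in for training a network of this size and returning its
    validation F-score; replace with real training and evaluation."""
    return -abs(hidden_units - 64) - num_layers  # dummy score, illustration only

best = max(product(*grid.values()),
           key=lambda cfg: validation_f_score(*cfg))
print(dict(zip(grid, best)))  # e.g. {'hidden_units': 64, 'num_layers': 1}
```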

As initially described (see Section 2), adventitious sounds caused by IPF are inspiratory fine (or velcro) crackles heard over affected areas [1], [2]. Because adventitious sounds are

Conclusion

In this paper, we introduce convolutional recurrent neural networks to multi-channel lung sound classification. To this end, we recorded a small lung sound database with our recently developed multi-channel lung sound recording device, including lung-healthy subjects and patients diagnosed with idiopathic pulmonary fibrosis (IPF). With the acquired data, we perform experiments to evaluate the classification performance of our proposed method, including the comparison with different neural

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This project was supported by the government of Styria, Austria, under the project call HTI:Tech for Med and by the Austrian Science Fund (FWF) under the project number P27803-N15.

We acknowledge 3M™ for providing Littmann® stethoscope chest pieces, Schiller AG for the support with a spirometry solution, and NVIDIA for providing GPU computing resources. We thank the Clinical Trial Coordination Centre of the Medical University of Graz for project management support and monitoring.

References (47)

  • Sen, I., et al.

    A comparison of SVM and GMM-based classifier configurations for diagnostic classification of pulmonary sounds

    IEEE Trans. Biomed. Eng.

    (2015)
  • Rocha, B., et al.

    A respiratory sound database for the development of automated classification

  • Pramono, R.X.A., et al.

    Automatic adventitious respiratory sound analysis: A systematic review

    PLoS One

    (2017)
  • Orjuela-Cañón, A.D., et al.

    Artificial neural networks for acoustic lung signals classification

  • Sutskever, I., et al.

    Sequence to sequence learning with neural networks

  • Graves, A., et al.

    Hybrid speech recognition with deep bidirectional LSTM

  • Zöhrer, M., et al.

    Gated recurrent networks applied to acoustic scene classification and acoustic event detection

    IEEE AASP Chall.: Detect. Classif. Acoustic Scenes Events

    (2016)
  • Messner, E., et al.

    Heart sound segmentation-an event detection approach using deep recurrent neural networks

    IEEE Trans. Biomed. Eng.

    (2018)
  • Messner, E., et al.

    Crackle and breathing phase...
  • Kochetov, K., et al.

    Noise masking recurrent neural network for respiratory sound classification

  • Perna, D., et al.

    Deep auscultation: Predicting respiratory anomalies and diseases via recurrent neural networks

  • LeCun, Y., et al.

    Convolutional networks for images, speech, and time series

    Handb. Brain Theory Neural Netw.

    (1995)
  • Aykanat, M., et al.

    Classification of lung sounds using convolutional neural networks

    EURASIP J. Image Video Process.

    (2017)