Elsevier

Applied Acoustics

Volume 179, August 2021, 108027
Spatial extrapolation of early room impulse responses in local area using sparse equivalent sources and image source method

https://doi.org/10.1016/j.apacoust.2021.108027

Abstract

The room impulse response (RIR) is important in most acoustic applications, such as the design of concert halls and sound field control, because it characterizes the sound propagation. Measuring RIRs at multiple points is challenging, as it requires either a large microphone array or repeating the experiment while relocating the microphone. Several RIR interpolation and extrapolation methods have been developed for obtaining RIRs at multiple measurement points efficiently. Extrapolation methods offer more efficient RIR measurement than interpolation. However, previous studies focused on extrapolation at frequencies below 1 kHz, and extrapolation at higher frequencies remained difficult. In this study, we propose an extrapolation method for RIRs of direct sound and early reflections in a local area using a small number of measurement points. The proposed method represents the RIRs around the microphones using superpositions of sparse equivalent sources located around the loudspeaker and image sources. We conducted a measurement experiment in an anechoic chamber to estimate RIRs around microphones using sound-reflecting boards. In the experimental results under 2.5-dimensional conditions, the proposed method achieved a signal-to-noise ratio (SNR) above approximately 10 dB near the microphone array over 0.5–8.5 kHz. For the extrapolation accuracy over the entire evaluation area (0.6 × 0.54 m²), the proposed method improved the SNR by about 5–6 dB compared to results using the plane wave decomposition.

Introduction

Measurement of the room impulse response (RIR) is essential to understand sound propagation in a room. For example, the RIRs at multiple points are useful for sound field control/synthesis [4], [23], [6] and for visualization of the sound field [10], [29]. Sound propagation is considered a linear time-invariant system whose input is the loudspeaker and whose output is the microphone. The RIR depends on the positions of the loudspeaker and the microphone because the sound propagation path changes with those positions. Thus, to obtain the RIRs from the loudspeaker to other points, we must repeat measurements after relocating the microphone. Another approach is to use a microphone array that has microphones located at all measurement points. However, as the number of microphones increases, the sound reflections from the microphones become larger and the calibration of the microphones becomes more complicated.
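As a minimal illustration of the linear time-invariant view, the signal at the microphone can be sketched as the discrete convolution of the loudspeaker signal with the RIR. The toy RIR below (a direct sound followed by one weaker reflection) is a hypothetical example, not measured data.

```python
def convolve(signal, rir):
    """Discrete linear convolution: out[n] = sum_k signal[k] * rir[n - k]."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


# Hypothetical toy RIR: direct sound at sample 0, weaker reflection at sample 3.
rir = [1.0, 0.0, 0.0, 0.5]

# An impulse at the loudspeaker reproduces the RIR itself at the microphone.
print(convolve([1.0], rir))

# Two clicks at the loudspeaker give two shifted, overlapping copies of the RIR.
two_clicks = [1.0, 0.0, 1.0]
print(convolve(two_clicks, rir))
```

Because the RIR changes with the loudspeaker and microphone positions, a separate `rir` would be needed for every microphone position of interest.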

In recent studies, several RIR interpolation methods have been proposed that make it possible to measure multiple RIRs efficiently. In [19], an RIR interpolation method for an entire room was proposed that exploits the sparsity of the early reflections in the time domain.

In [12], [13], the RIRs at grid points were recovered by solving the inverse problem of interpolating the signal of the microphone on a moving path from RIRs at the grid points.

Alternatively, the equivalent source method (ESM) is well known for the reconstruction of radiated and scattered sound fields [14], [11], [27]. Based on the Kirchhoff–Helmholtz integral equation [30], the sound field can be represented by the superposition of point sources surrounding the domain of interest. Because the number of point sources exceeds the number of measurement points, the estimation of the source weights is treated as an underdetermined problem and is solved by the least squares (LS) method.
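The superposition underlying ESM can be sketched as a forward model: the pressure at a receiver is a weighted sum of free-field point-source responses. The sketch below is an assumption-laden illustration, not the authors' implementation: the function names, the speed of sound, and the exp(-jkr) sign convention are all choices made here. The inverse step (estimating the weights from measurements) is not shown.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def green(src, rcv, freq_hz):
    """Free-field response of a point source (exp(-jkr) convention assumed)."""
    r = math.dist(src, rcv)
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber
    return cmath.exp(-1j * k * r) / (4.0 * math.pi * r)


def pressure(sources, weights, rcv, freq_hz):
    """Pressure at `rcv` as a weighted superposition of point sources."""
    return sum(w * green(s, rcv, freq_hz) for s, w in zip(sources, weights))


# Hypothetical equivalent sources and complex weights, for illustration only.
sources = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0)]
weights = [1.0 + 0.0j, 0.0j, 0.5j]
p = pressure(sources, weights, (1.0, 0.5, 0.0), 1000.0)
print(abs(p))
```

With more candidate sources than microphones, many weight vectors reproduce the same measurements, which is why the inverse problem needs an additional criterion such as minimum norm or sparsity.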

With the recent developments in compressed sensing [3], a sparse representation of the acoustic field has been shown to be effective for some applications, such as multi-sound field control [21], [22], sound field decomposition [15], and RIR interpolation [19], [2]. In ESM, the sparse expression is also more effective than the LS method to represent the near field [5].
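To make the idea of a sparse solution concrete, the following toy sketch solves an l1-penalized least squares problem with the iterative shrinkage-thresholding algorithm (ISTA). This is a swapped-in technique chosen for brevity: this excerpt does not state which formulation or solver the authors use. The matrix, data, and penalty weight are hypothetical, and real numbers are used although acoustic transfer functions are complex.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the l1 norm: shrink toward zero by `threshold`."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def residual(A, x, b):
    """Return A x - b for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) - bi for row, bi in zip(A, b)]


def objective(A, x, b, lam):
    """0.5 * ||A x - b||^2 + lam * ||x||_1."""
    r = residual(A, x, b)
    return 0.5 * sum(v * v for v in r) + lam * sum(abs(v) for v in x)


def ista(A, b, lam, n_iter=2000):
    """Minimize the l1-penalized least squares objective, starting from zero."""
    n_cols = len(A[0])
    # 1 / ||A||_F^2 is a safe step size (never larger than 1 / ||A||_2^2).
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n_cols
    for _ in range(n_iter):
        r = residual(A, x, b)
        grad = [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(n_cols)]
        x = [soft_threshold(xj - step * g, step * lam) for xj, g in zip(x, grad)]
    return x


# Two measurements, three candidate sources: an underdetermined toy problem.
A = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
b = [0.6, 0.8]  # generated by the third candidate source alone
x_hat = ista(A, b, lam=0.1)
print([round(v, 3) for v in x_hat])
```

In this toy problem the data can be explained either by the first two columns together or by the third column alone; the l1 penalty favors the explanation with the smaller total weight, which is the third column.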

In addition, the extrapolation method helps us to obtain the RIRs more efficiently compared to the interpolation, since it is simple and easy to place the microphones. In [9], the room transfer function is modeled by the common acoustical poles and their residues corresponding to the eigenfrequencies of the room. In [17], neural networks are applied for the sound field reconstruction, and sound field variations are predicted based on the observations by a small number of microphones. Furthermore, in the work presented in [28], the RIRs at arbitrary points were extrapolated by a limited number of plane waves using compressed sensing. These previous studies have shown the effectiveness of RIR extrapolation at frequencies below 1 kHz. However, extrapolation of RIRs at higher frequencies is required for various applications such as sound field control and concert hall design. We previously proposed extrapolation methods for direct sound and primary reflections based on sparse ESM and image source method [24], [25], and evaluated the effectiveness of our proposed method by simulation experiments.

In this study, we propose an extrapolation method for RIRs that include not only the primary reflections but also the other early reflections. The method estimates the sound reflections around the microphones by decomposing the reflection components for each wall. In a measurement experiment in an anechoic chamber, we evaluate the estimation accuracy at frequencies up to 8.5 kHz.

Our proposed method can determine the reflection components for each wall using the image source method [1]. This decomposition can help in various applications such as sound visualization and room acoustics design in architectural acoustics, based on the relationship between each reflection component and the reflecting object. Furthermore, because the early part of RIRs is known to affect the timbre and sound localization [7], it is necessary to design and control the early part of RIRs in most acoustic applications. Thus, in this study, we focus on extrapolation of the early part of RIRs in the frequency band 0.5–8.5 kHz. We conduct an experimental evaluation with sound-reflecting boards in an anechoic chamber. We evaluate the extrapolation accuracy of RIRs including primary and secondary sound reflections with two different configurations of the microphone array.

The outline of this paper is as follows: In Section 2, we present the estimation method of RIRs with sparse ESM and the image source method. In Section 3, experimental results are reported to evaluate the proposed method. Finally, we conclude this paper in Section 4.

Section snippets

Method

Fig. 1 shows the concept of the proposed method with sparse equivalent sources and the image source method. We consider RIRs comprising the direct sound from a loudspeaker and its early reflections from the walls. First, based on the superposition principle for sound waves, the transfer function $y_m \in \mathbb{C}$ from a loudspeaker at $x_{\mathrm{src}}$ to the position $x_m$ of the $m$-th microphone $(m = 1, \ldots, M)$ is divided into a direct sound and early reflections as

$$y_m = y_{m,0}^{(0)} + y_{m,1}^{(1)} + \cdots + y_{m,I_1}^{(1)} + y_{m,1}^{(2)} + \cdots + y_{m,I_2}^{(2)}$$
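The split into a direct component and per-wall reflection components can be sketched with a plain image source model. The sketch below rests on several assumptions that are not taken from the paper: walls are axis-aligned planes, each image source is a single monopole with a frequency-independent reflection coefficient, and only first-order images are formed. In the proposed method, sparse equivalent sources are placed around the loudspeaker and the image sources instead of a single monopole each. All function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def green(src, rcv, freq_hz):
    """Free-field response of a point source (exp(-jkr) convention assumed)."""
    r = math.dist(src, rcv)
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    return cmath.exp(-1j * k * r) / (4.0 * math.pi * r)


def mirror(point, axis, wall_coord):
    """Mirror a point across the plane where coordinate `axis` equals `wall_coord`."""
    p = list(point)
    p[axis] = 2.0 * wall_coord - p[axis]
    return tuple(p)


def decompose(src, mic, walls, freq_hz, refl=1.0):
    """Return (direct, per-wall first-order terms, their sum) at one frequency."""
    direct = green(src, mic, freq_hz)
    first = [refl * green(mirror(src, axis, c), mic, freq_hz) for axis, c in walls]
    return direct, first, direct + sum(first)


# Hypothetical 2-D layout: one wall on the plane x = 0, one on the plane y = 2.
walls = [(0, 0.0), (1, 2.0)]
direct, first, total = decompose((1.0, 0.0), (2.0, 0.0), walls, 1000.0)
print(abs(direct), [abs(v) for v in first], abs(total))
```

Higher-order components would follow by mirroring the first-order images again across the other walls; they are left out here to keep the sketch short.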

Experimental conditions

We conducted the evaluation experiments in an anechoic chamber with sound-reflecting boards. In these experiments, we evaluate the extrapolation accuracy of the transfer functions that comprise direct sound and primary/secondary reflections in the horizontal plane. We compared the proposed method with the plane wave decomposition (PWD) [28] for extrapolation. In PWD, the optimization problem in Eq. (8) was solved with transfer functions of plane waves and error tolerance was =2. The number of

Conclusion

We proposed a spatial extrapolation method for early room impulse responses that uses a small number of microphones to efficiently obtain RIRs in a local area. We estimated the components of the direct sound and the reflections from the walls using sparse equivalent sources and the image source method.

The experimental results indicate that RIRs, including primary and secondary reflections, can be estimated over 0.5–8.5 kHz in an area of 0.54 × 0.6 m² using 13 or 16 microphones. For both microphone arrays,

CRediT authorship contribution statement

Izumi Tsunokuni: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing - original draft, Visualization, Project administration. Kakeru Kurokawa: Validation, Investigation. Haruka Matsuhashi: Software, Validation, Investigation. Yusuke Ikeda: Conceptualization, Methodology, Software, Validation, Resources, Writing - review & editing, Supervision, Funding acquisition. Naotoshi Osaka: Resources, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported by Research Institute for Science and Technology of Tokyo Denki University Grant No. Q20J-04/ Japan. The authors would like to thank Prof. Y. Kaneda (Tokyo Denki University) for support regarding the measurement environment and equipment.

References (30)

  • [1] J.B. Allen et al., Image method for efficiently simulating small-room acoustics, J Acoust Soc Amer (1979)
  • [2] N. Antonello et al., Room impulse response interpolation using a sparse spatio-temporal representation of the sound field, IEEE/ACM Trans Audio Speech Lang Process (2017)
  • [3] E.J. Candes et al., An introduction to compressive sampling, IEEE Signal Process Mag (2008)
  • [4] S.J. Elliott et al., Multiple-point equalization in a room using adaptive digital filters, J Audio Eng Soc (1989)
  • [5] E. Fernandez-Grande et al., A sparse equivalent source method for near-field acoustic holography, J Acoust Soc Amer (2017)
  • [6] P.A. Gauthier et al., Adaptive wave field synthesis with independent radiation mode control for active sound field reproduction: Theory, J Acoust Soc Amer (2006)
  • [7] T. Gotoh et al., A consideration of distance perception in binaural hearing, J Acoust Soc Jpn (1977)
  • [8] Grant M, Boyd S. CVX: Matlab software for disciplined convex programming, version 2.1; 2020. URL:...
  • [9] Y. Haneda et al., Common-acoustical-pole and residue model and its application to spatial interpolation and extrapolation of a room transfer function, IEEE Trans Speech Audio Process (1999)
  • [10] A. Inoue et al., Visualization system for sound field using see-through head-mounted display, Acoust Sci Technol (2019)
  • [11] M.E. Johnson et al., An equivalent source technique for calculating the sound field inside an enclosure containing scattering objects, J Acoust Soc Amer (1998)
  • [12] F. Katzberg et al., Sound-field measurement with moving microphones, J Acoust Soc Amer (2017)
  • [13] F. Katzberg et al., A compressed sensing framework for dynamic sound-field measurements, IEEE/ACM Trans Audio Speech Lang Process (2018)
  • [14] G.H. Koopmann et al., A method for computing acoustic fields based on the principle of wave superposition, J Acoust Soc Amer (1989)
  • [15] S. Koyama et al., Sparse representation of a spatial sound field in a reverberant environment, IEEE J Select Top Signal Process (2019)