Spatial extrapolation of early room impulse responses in local area using sparse equivalent sources and image source method
Introduction
Measurement of the room impulse response (RIR) is essential for understanding sound propagation in a room. For example, RIRs at multiple points are useful for sound field control/synthesis [4], [23], [6] and for sound field visualization [10], [29]. Sound propagation can be regarded as a linear time-invariant system with the loudspeaker as the input and the microphone as the output. The RIR depends on the positions of the loudspeaker and the microphone because the sound propagation paths change with those positions. Thus, to obtain the RIRs from the loudspeaker to other points, the measurement must be repeated after relocating the microphone. Another approach is to use a microphone array with a microphone at every measurement point. However, as the number of microphones increases, the sound reflections from the microphones become larger and the calibration of the microphones becomes more complicated.
In recent studies, several RIR interpolation methods have been proposed for measuring multiple RIRs efficiently. In [19], an RIR interpolation method for an entire room was proposed that exploits the sparsity of the early reflections in the time domain.
In [12], [13], the RIRs at grid points were recovered by solving the inverse problem of interpolating the signal of the microphone on a moving path from RIRs at the grid points.
Alternatively, the equivalent source method (ESM) is well known for the reconstruction of radiated and scattered sound fields [14], [11], [27]. Based on the Kirchhoff–Helmholtz integral equation [30], the sound field can be represented by the superposition of point sources surrounding the domain of interest. Because the number of point sources exceeds the number of measurement points, estimating the source weights is an underdetermined problem, which is conventionally solved by the least squares (LS) method.
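The underdetermined LS formulation of ESM can be illustrated with a minimal sketch. The geometry (a 3 × 3 microphone grid, 64 equivalent sources on a circle of radius 2 m), the frequency, and the helper name `greens_function` are assumptions chosen for illustration and are not taken from the cited works; the minimum-norm LS solution is obtained here with the pseudoinverse.

```python
import numpy as np

def greens_function(src, rcv, k):
    """Free-field Green's function exp(-jkr)/(4*pi*r).

    src: (N, 3) source positions, rcv: (M, 3) receiver positions.
    Returns an (M, N) complex transfer matrix.
    """
    r = np.linalg.norm(rcv[:, None, :] - src[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4.0 * np.pi * r)

c = 343.0                      # speed of sound [m/s] (assumed)
f = 1000.0                     # analysis frequency [Hz] (assumed)
k = 2.0 * np.pi * f / c        # wavenumber

# Hypothetical geometry: 3 x 3 microphone grid with 0.2 m spacing.
gx, gy = np.meshgrid([-0.2, 0.0, 0.2], [-0.2, 0.0, 0.2])
mics = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(9)])

# Hypothetical equivalent sources: 64 points on a circle of radius 2 m.
ang = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
eq_src = np.column_stack([2.0 * np.cos(ang), 2.0 * np.sin(ang), np.zeros(64)])

# Simulated measurement: one point source outside the source circle.
true_src = np.array([[3.0, 1.0, 0.0]])
p = greens_function(true_src, mics, k)[:, 0]

# 9 equations, 64 unknowns: underdetermined, so take the minimum-norm
# least-squares solution via the pseudoinverse.
G = greens_function(eq_src, mics, k)
w = np.linalg.pinv(G) @ p
residual = np.linalg.norm(G @ w - p) / np.linalg.norm(p)

# Extrapolate to a point outside the array and compare with the true field.
eval_pt = np.array([[0.4, 0.0, 0.0]])
p_est = (greens_function(eq_src, eval_pt, k) @ w)[0]
p_ref = greens_function(true_src, eval_pt, k)[0, 0]
print("fit residual:", residual)
print("relative extrapolation error:", abs(p_est - p_ref) / abs(p_ref))
```

The fit at the microphones is essentially exact because there are more unknowns than equations; the error printed for the evaluation point shows how well that particular solution generalizes away from the array.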
With the recent developments in compressed sensing [3], a sparse representation of the acoustic field has been shown to be effective for some applications, such as multi-sound field control [21], [22], sound field decomposition [15], and RIR interpolation [19], [2]. In ESM, the sparse expression is also more effective than the LS method to represent the near field [5].
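As a rough illustration of a sparse equivalent-source representation, the sketch below selects a few active candidate sources with orthogonal matching pursuit (OMP). OMP is swapped in here as a simple greedy stand-in for the sparse solvers used in compressed sensing; it is not claimed to be the solver used in the cited works. The geometry, the frequency, and the names `greens_function` and `omp` are illustrative assumptions.

```python
import numpy as np

def greens_function(src, rcv, k):
    """Free-field Green's function exp(-jkr)/(4*pi*r), shape (M, N)."""
    r = np.linalg.norm(rcv[:, None, :] - src[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4.0 * np.pi * r)

def omp(G, p, n_nonzero):
    """Greedy sparse solve of G w = p with at most n_nonzero active columns."""
    norms = np.linalg.norm(G, axis=0)
    residual = p.copy()
    support = []
    coef = np.zeros(0, dtype=complex)
    for _ in range(n_nonzero):
        # Pick the column most correlated with the current residual.
        corr = np.abs(G.conj().T @ residual) / norms
        if support:
            corr[support] = 0.0          # never pick the same column twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected columns jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(G[:, support], p, rcond=None)
        residual = p - G[:, support] @ coef
    w = np.zeros(G.shape[1], dtype=complex)
    w[support] = coef
    return w, support

c, f = 343.0, 2000.0                     # assumed speed of sound and frequency
k = 2.0 * np.pi * f / c

# Hypothetical 3 x 3 microphone grid with 0.2 m spacing.
gx, gy = np.meshgrid([-0.2, 0.0, 0.2], [-0.2, 0.0, 0.2])
mics = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(9)])

# Hypothetical candidate sources: 72 points on a circle of radius 2 m.
ang = np.linspace(0.0, 2.0 * np.pi, 72, endpoint=False)
cand = np.column_stack([2.0 * np.cos(ang), 2.0 * np.sin(ang), np.zeros(72)])
G = greens_function(cand, mics, k)

# Simulated field produced by a single candidate (index 10).
true_idx, true_amp = 10, 0.5 + 0.2j
p = true_amp * G[:, true_idx]

w, support = omp(G, p, n_nonzero=1)
print("selected candidate:", support, "estimated weight:", w[true_idx])
```

With a single on-grid source and noise-free data, the first greedy step selects the true candidate. With several sources, off-grid sources, or noise, recovery is not guaranteed and depends on the array geometry and the candidate grid.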
In addition, extrapolation can obtain RIRs more efficiently than interpolation because the microphones are simpler to place. In [9], the room transfer function is modeled by the common acoustical poles and their residues corresponding to the eigenfrequencies of the room. In [17], neural networks are applied to sound field reconstruction, and sound field variations are predicted from the observations of a small number of microphones. Furthermore, in [28], the RIRs at arbitrary points were extrapolated with a limited number of plane waves using compressed sensing. These previous studies have shown the effectiveness of RIR extrapolation at frequencies below 1 kHz. However, extrapolation of RIRs at higher frequencies is required for various applications such as sound field control and concert hall design. We previously proposed extrapolation methods for direct sound and primary reflections based on sparse ESM and the image source method [24], [25], and evaluated their effectiveness by simulation experiments.
In this study, to estimate the sound reflections around the microphones by decomposing the reflection components for each wall, we propose an extrapolation method for RIRs that include not only primary reflections but also other early reflections. In a measurement experiment in an anechoic chamber, we evaluate the estimation accuracy at frequencies up to 8.5 kHz.
Our proposed method can determine the reflection components for each wall using the image source method [1]. This decomposition can help in various applications such as sound visualization and room acoustics design in architectural acoustics, based on the relationship between each reflection component and the reflecting object. Furthermore, the early part of RIRs is known to affect timbre and sound localization [7]; it is therefore necessary to design and control the early part of RIRs in most acoustic applications. Thus, in this study, we focus on extrapolation of the early part of RIRs in the frequency band 0.5–8.5 kHz. We conduct an experimental evaluation with sound-reflecting boards in an anechoic chamber. We evaluate the extrapolation accuracy of RIRs including primary and secondary sound reflections with two different configurations of the microphone array.
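The image source idea behind the per-wall decomposition can be sketched as follows: a reflection from a rigid planar wall is modeled as the direct sound of a mirrored source, and a secondary reflection is obtained by mirroring that image once more at another wall. The wall layout, the source and receiver positions, and the helper `mirror` are hypothetical, and visibility checks for finite walls are omitted.

```python
import numpy as np

def mirror(point, wall_point, wall_normal):
    """Mirror a point across the plane through wall_point with normal wall_normal."""
    n = wall_normal / np.linalg.norm(wall_normal)
    return point - 2.0 * np.dot(point - wall_point, n) * n

c = 343.0                                   # speed of sound [m/s] (assumed)
source = np.array([1.0, 1.0, 0.0])          # hypothetical loudspeaker position
receiver = np.array([2.0, 0.5, 0.0])        # hypothetical microphone position

# Two hypothetical infinite rigid walls: the planes x = 0 and y = 0.
origin = np.zeros(3)
wall_x = (origin, np.array([1.0, 0.0, 0.0]))
wall_y = (origin, np.array([0.0, 1.0, 0.0]))

img_x = mirror(source, *wall_x)             # primary reflection at wall x = 0
img_y = mirror(source, *wall_y)             # primary reflection at wall y = 0
img_xy = mirror(img_x, *wall_y)             # secondary reflection: x = 0, then y = 0

def delay(src):
    """Propagation delay in seconds from a (possibly image) source to the receiver."""
    return np.linalg.norm(receiver - src) / c

for name, pos in [("direct", source), ("wall x", img_x),
                  ("wall y", img_y), ("walls x then y", img_xy)]:
    print(f"{name:15s} position {pos}  delay {delay(pos) * 1e3:.2f} ms")
```

Each image source corresponds to exactly one wall sequence, which is what allows a reflection component to be attributed to a specific reflecting surface.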
The outline of this paper is as follows: In Section 2, we present the estimation method of RIRs with sparse ESM and the image source method. In Section 3, experimental results are reported to evaluate the proposed method. Finally, we conclude this paper in Section 4.
Section snippets
Method
Fig. 1 shows the concept of the proposed method with sparse equivalent sources and the image source method. We consider RIRs comprising the direct sound from a loudspeaker and its early reflections from the walls. First, based on the superposition principle for sound waves, the transfer function from the loudspeaker to the position of the m-th microphone is divided into a direct sound and early reflections as
Experimental conditions
We conducted the evaluation experiments in an anechoic chamber with sound-reflecting boards. In these experiments, we evaluate the extrapolation accuracy of the transfer functions that comprise direct sound and primary/secondary reflections in the horizontal plane. We compared the proposed method with the plane wave decomposition (PWD) [28] for extrapolation. In PWD, the optimization problem in Eq. (8) was solved with transfer functions of plane waves and error tolerance was . The number of
Conclusion
We proposed a spatial extrapolation method for early room impulse responses that uses a small number of microphones to efficiently obtain RIRs in a local area. We estimated the components of the direct sound and the reflections from the walls using sparse equivalent sources and the image source method.
The experimental results indicate that RIRs, including primary and secondary reflections, can be estimated over 0.5–8.5 kHz in 0.54 × 0.6 using 13 or 16 microphones. For both microphone arrays,
CRediT authorship contribution statement
Izumi Tsunokuni: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing - original draft, Visualization, Project administration. Kakeru Kurokawa: Validation, Investigation. Haruka Matsuhashi: Software, Validation, Investigation. Yusuke Ikeda: Conceptualization, Methodology, Software, Validation, Resources, Writing - review & editing, Supervision, Funding acquisition. Naotoshi Osaka: Resources, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work was supported by Research Institute for Science and Technology of Tokyo Denki University Grant No. Q20J-04/ Japan. The authors would like to thank Prof. Y. Kaneda (Tokyo Denki University) for support regarding the measurement environment and equipment.
References (30)
- et al. Image method for efficiently simulating small-room acoustics. J Acoust Soc Amer (1979)
- et al. Room impulse response interpolation using a sparse spatio-temporal representation of the sound field. IEEE/ACM Trans Audio Speech Lang Process (2017)
- et al. An introduction to compressive sampling. IEEE Signal Process Mag (2008)
- et al. Multiple-point equalization in a room using adaptive digital filters. J Audio Eng Soc (1989)
- et al. A sparse equivalent source method for near-field acoustic holography. J Acoust Soc Amer (2017)
- et al. Adaptive wave field synthesis with independent radiation mode control for active sound field reproduction: Theory. J Acoust Soc Amer (2006)
- et al. A consideration of distance perception in binaural hearing. J Acoust Soc Jpn (1977)
- Grant M, Boyd S. CVX: Matlab software for disciplined convex programming, version 2.1; 2020. URL:...
- et al. Common-acoustical-pole and residue model and its application to spatial interpolation and extrapolation of a room transfer function. IEEE Trans Speech Audio Process (1999)
- et al. Visualization system for sound field using see-through head-mounted display. Acoust Sci Technol (2019)
- An equivalent source technique for calculating the sound field inside an enclosure containing scattering objects. J Acoust Soc Amer
- Sound-field measurement with moving microphones. J Acoust Soc Amer
- A compressed sensing framework for dynamic sound-field measurements. IEEE/ACM Trans Audio Speech Lang Process
- A method for computing acoustic fields based on the principle of wave superposition. J Acoust Soc Amer
- Sparse representation of a spatial sound field in a reverberant environment. IEEE J Select Top Signal Process
Cited by (15)
Multizone sound field reproduction using pressure matching with sparse equivalent source
2024, Journal of Sound and Vibration
Sound field reconstruction using neural processes with dynamic kernels
2024, Eurasip Journal on Audio, Speech, and Music Processing
Optimal Transport Based Impulse Response Interpolation in the Presence of Calibration Errors
2024, IEEE Transactions on Signal Processing