
Asynchronous microphone arrays calibration and sound source tracking


Abstract

In this paper, we propose an optimisation method to solve the problem of sound source localisation and calibration of an asynchronous microphone array. The method is based on the graph-based formulation of the simultaneous localisation and mapping (SLAM) problem, in which a moving sound source is observed from a static microphone array. Traditional approaches to sound source localisation rely on accurately known array geometry and synchronous readings of the audio signals. Recent work relaxed these two requirements by estimating the temporal offset between pairs of microphones, under the assumption that the clock timing of every microphone is exactly the same. This assumption requires the sound cards to be identically manufactured, which is not achievable in practice. Here, an approach is proposed to jointly estimate the array geometry, the starting time offset and the clock difference/drift rate of each microphone, together with the location of a moving sound source. In addition, an observability analysis of the system is performed to investigate the most suitable configurations for sound source localisation. Simulation and experimental results are presented, which demonstrate the effectiveness of the proposed methodology.


Notes

  1. Note that the main difference from a standard landmark-pose SLAM system is that here all the microphones are “observed” at all times, whereas in a standard SLAM system only a subset of the landmarks is observed at any given time. This allows the microphone array to be treated as a single landmark with a large state containing the locations of all microphones; however, the same solution can be achieved if the microphones are considered independently.

  2. Note that other motion models, such as a constant-velocity model, can be applied as long as they describe the motion of the sound source properly (a minimal sketch of a constant-velocity model is given after these notes).

  3. Note that the uncertainties of the microphone locations are not directly related to the minimum eigenvalues of the sub-FIMs corresponding to individual microphones, for two reasons: (1) the minimum eigenvalues of the sub-FIMs relate not only to the microphone locations but also to the starting time offsets and clock differences of the microphones; (2) the uncertainties of the microphone locations depend on the relative distance to microphone 1, which is fixed at the origin of the coordinate frame as a reference, so microphones close to microphone 1 have smaller uncertainties in their estimated locations, whereas the minimum eigenvalues of the sub-FIMs do not depend on the distance to microphone 1. A sketch of how such a sub-FIM eigenvalue can be computed is given after these notes.
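
Regarding note 2, the following is a minimal sketch of a constant-velocity motion model, written with our own variable names and a standard white-acceleration noise form; it is generic and not necessarily the paper's parameterisation.

```python
import numpy as np

def constant_velocity_step(x, dt, q=0.1):
    """One prediction step of a 2D constant-velocity motion model.

    x  : state [px, py, vx, vy]
    dt : time step between sound source observations
    q  : process-noise intensity (white-acceleration model)
    Returns the predicted state and the process-noise covariance Q.
    """
    F = np.array([[1.0, 0.0, dt,  0.0],
                  [0.0, 1.0, 0.0, dt ],
                  [0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
    # Noise enters through accelerations integrated over one time step.
    G = np.array([[0.5 * dt**2, 0.0],
                  [0.0, 0.5 * dt**2],
                  [dt,  0.0],
                  [0.0, dt ]])
    return F @ x, q * G @ G.T
```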
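
Regarding note 3, the comparison can be made concrete with a small sketch of our own, assuming the columns of the stacked measurement Jacobian that belong to one microphone are known; it shows how the minimum eigenvalue of that microphone's sub-FIM would be extracted.

```python
import numpy as np

def min_eig_sub_fim(J, R_inv, mic_cols):
    """Minimum eigenvalue of the sub-FIM of one microphone.

    J        : stacked measurement Jacobian of the whole system
    R_inv    : inverse of the measurement noise covariance
    mic_cols : slice selecting the columns of J that belong to this
               microphone (location, starting time offset, clock drift)
    """
    fim = J.T @ R_inv @ J                 # full Fisher information matrix
    sub = fim[mic_cols, mic_cols]         # block corresponding to this microphone
    return np.linalg.eigvalsh(sub).min()
```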


Author information


Corresponding author

Correspondence to Daobilige Su.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Jacobian of the observation model of a 2D microphone array

In this appendix, the Jacobian matrices \(J_{mic\_n}^{p-l}\), \(J_{k\_x}^{p-l}\) and \(J_{k\_y}^{p-l}\) of the observation model of a 2D microphone array are formulated.
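
The measurement equation itself is defined in the main body of the paper and is not repeated here; for readability, the Jacobians below are consistent with an asynchronous time-of-arrival observation of the following assumed form (our reconstruction, with microphone 1 fixed at the origin as the reference and with shorthand symbols that may differ from the main text),

$$\begin{aligned} z_{n,k} = \dfrac{d_{n,k} - d_k}{c} + \tau _{n} + \delta _{n} k{\varDelta }t, \quad n=2,\ldots ,N, \end{aligned}$$

where \(d_{n,k}\) is the distance between microphone n and the sound source at the kth time instance, \(d_k\) is the distance from the sound source to the origin, \(\tau _{n}\) and \(\delta _{n}\) denote the starting time offset and the clock difference/drift rate of microphone n, and c is the speed of sound.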

The Jacobian matrix \(J_{mic\_n}^{p-l}\) in Eq. (19) is nonzero only at row n, and this row is computed as

$$\begin{aligned} \begin{aligned}&J_{mic\_n}^{p-l}(n,:)\\&=\begin{bmatrix} \dfrac{x_{mic\_n}^x - x_{src\_k}^x}{c\sqrt{(x_{mic\_n}^x - x_{src\_k}^x)^2 + (x_{mic\_n}^y - x_{src\_k}^y)^2}}\\ \dfrac{x_{mic\_n}^y - x_{src\_k}^y}{c\sqrt{(x_{mic\_n}^x - x_{src\_k}^x)^2 + (x_{mic\_n}^y - x_{src\_k}^y)^2}}\\ 1\\ k{\varDelta }t \end{bmatrix}^{T}. \end{aligned} \end{aligned}$$
(32)

The Jacobian matrices \(J_{k\_x}^{p-l}\) and \(J_{k\_y}^{p-l}\) are computed as

$$\begin{aligned} J_{k\_x}^{p-l}= & {} \begin{bmatrix} \dfrac{x_{src\_k}^x - x_{mic\_2}^x}{c\sqrt{(x_{mic\_2}^x - x_{src\_k}^x)^2 + (x_{mic\_2}^y - x_{src\_k}^y)^2}} \\ \vdots \\ \dfrac{x_{src\_k}^x - x_{mic\_N}^x}{c\sqrt{(x_{mic\_N}^x - x_{src\_k}^x)^2 + (x_{mic\_N}^y - x_{src\_k}^y)^2}} \end{bmatrix}\nonumber \\&-\begin{bmatrix} \dfrac{x_{src\_k}^x}{c\sqrt{{x_{src\_k}^x}^2 + {x_{src\_k}^y}^2}}\\ \vdots \\ \dfrac{x_{src\_k}^x}{c\sqrt{{x_{src\_k}^x}^2 + {x_{src\_k}^y}^2}}\\ \end{bmatrix}, \end{aligned}$$
(33)
$$\begin{aligned} J_{k\_y}^{p-l}= & {} \begin{bmatrix} \dfrac{x_{src\_k}^y - x_{mic\_2}^y}{c\sqrt{(x_{mic\_2}^x - x_{src\_k}^x)^2 + (x_{mic\_2}^y - x_{src\_k}^y)^2}} \\ \vdots \\ \dfrac{x_{src\_k}^y - x_{mic\_N}^y}{c\sqrt{(x_{mic\_N}^x - x_{src\_k}^x)^2 + (x_{mic\_N}^y - x_{src\_k}^y)^2}} \end{bmatrix}\nonumber \\&-\begin{bmatrix} \dfrac{x_{src\_k}^y}{c\sqrt{{x_{src\_k}^x}^2 + {x_{src\_k}^y}^2}}\\ \vdots \\ \dfrac{x_{src\_k}^y}{c\sqrt{{x_{src\_k}^x}^2 + {x_{src\_k}^y}^2}}\\ \end{bmatrix}. \end{aligned}$$
(34)
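
Purely as an illustrative check (a sketch under the assumed observation form stated above, with variable names and a nominal speed of sound of our own choosing, not the paper's code), the nonzero row of Eq. (32) can be verified against finite differences:

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s; the paper's value may differ

def obs(mic, src, tau, delta, k, dt):
    """Assumed measurement of microphone n at step k: TDOA with respect to the
    reference microphone at the origin, plus start offset and clock-drift terms."""
    return (np.linalg.norm(mic - src) - np.linalg.norm(src)) / C + tau + delta * k * dt

def jac_mic_row(mic, src, k, dt):
    """Nonzero row of J_mic_n in Eq. (32): partials w.r.t. [x_mic, y_mic, tau, delta]."""
    d_nk = np.linalg.norm(mic - src)
    return np.array([(mic[0] - src[0]) / (C * d_nk),
                     (mic[1] - src[1]) / (C * d_nk),
                     1.0,
                     k * dt])

# Finite-difference check of the analytic row.
mic, src = np.array([1.0, 2.0]), np.array([0.5, -0.3])
tau, delta, k, dt, eps = 1e-3, 1e-5, 7, 0.1, 1e-7
f0 = obs(mic, src, tau, delta, k, dt)
numeric = np.array([
    (obs(mic + [eps, 0.0], src, tau, delta, k, dt) - f0) / eps,
    (obs(mic + [0.0, eps], src, tau, delta, k, dt) - f0) / eps,
    (obs(mic, src, tau + eps, delta, k, dt) - f0) / eps,
    (obs(mic, src, tau, delta + eps, k, dt) - f0) / eps,
])
print(np.allclose(numeric, jac_mic_row(mic, src, k, dt), atol=1e-5))
```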

Jacobian of the observation model of a 3D microphone array

In this appendix, the Jacobian matrices \(J_{mic\_n}^{p-l}\), \(J_{k\_x}^{p-l}\), \(J_{k\_y}^{p-l}\) and \(J_{k\_z}^{p-l}\) of the observation model of a 3D microphone array are formulated.

For a 3D microphone array, \(J_{mic\_n}^{p-l}\) in Eq. (32) is rewritten as

$$\begin{aligned} \begin{aligned} J_{mic\_n}^{p-l}(n,:)= \begin{bmatrix} \dfrac{x_{mic\_n}^x - x_{src\_k}^x}{c{d}_{n,k}}\\ \dfrac{x_{mic\_n}^y - x_{src\_k}^y}{c{d}_{n,k}}\\ \dfrac{x_{mic\_n}^z - x_{src\_k}^z}{c{d}_{n,k}}\\ 1\\ k{\varDelta }t \end{bmatrix}^{T}. \end{aligned} \end{aligned}$$
(35)

\(J_{k\_x}^{p-l}\), \(J_{k\_y}^{p-l}\) and \(J_{k\_z}^{p-l}\) in Eq. (20) can be formulated as follows,

$$\begin{aligned}&\begin{aligned} J_{k\_x}^{p-l}=\begin{bmatrix} \dfrac{x_{src\_k}^x - x_{mic\_2}^x}{c{d}_{2,k}} \\ \vdots \\ \dfrac{x_{src\_k}^x - x_{mic\_N}^x}{c{d}_{N,k}} \end{bmatrix} -\begin{bmatrix} \dfrac{x_{src\_k}^x}{c{d_k}}\\ \vdots \\ \dfrac{x_{src\_k}^x}{c{d_k}}\\ \end{bmatrix}, \end{aligned} \end{aligned}$$
(36)
$$\begin{aligned}&\begin{aligned} J_{k\_y}^{p-l}=\begin{bmatrix} \dfrac{x_{src\_k}^y - x_{mic\_2}^y}{c{d}_{2,k}} \\ \vdots \\ \dfrac{x_{src\_k}^y - x_{mic\_N}^y}{c{d}_{N,k}} \end{bmatrix} -\begin{bmatrix} \dfrac{x_{src\_k}^y}{c{d_k}}\\ \vdots \\ \dfrac{x_{src\_k}^y}{c{d_k}}\\ \end{bmatrix}, \end{aligned} \end{aligned}$$
(37)
$$\begin{aligned}&\begin{aligned} J_{k\_z}^{p-l}=\begin{bmatrix} \dfrac{x_{src\_k}^z - x_{mic\_2}^z}{c{d}_{2,k}} \\ \vdots \\ \dfrac{x_{src\_k}^z - x_{mic\_N}^z}{c{d}_{N,k}} \end{bmatrix} -\begin{bmatrix} \dfrac{x_{src\_k}^z}{c{d_k}}\\ \vdots \\ \dfrac{x_{src\_k}^z}{c{d_k}}\\ \end{bmatrix}, \end{aligned} \end{aligned}$$
(38)

where \({d}_{n,k}\) is the distance between microphone n and the sound source at the kth time instance, and \({d_k}\) is the distance from the sound source position at the kth time instance to the origin of the global coordinate frame, which is formulated as follows,

$$\begin{aligned} d_k = \sqrt{{x_{src\_k}^x}^2 + {x_{src\_k}^y}^2 + {x_{src\_k}^z}^2}. \end{aligned}$$
(39)
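
Again purely as an illustration (a sketch with our own variable names and an assumed speed of sound, not the paper's code), Eqs. (36)–(38) can be evaluated for all microphones at once:

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s

def source_jacobian_3d(mics, src):
    """Columns J_k_x, J_k_y, J_k_z of Eqs. (36)-(38) stacked into one
    (N-1) x 3 matrix. `mics` holds the 3D locations of microphones 2..N
    (one per row); microphone 1 is fixed at the origin, so d_k = ||src||."""
    d_nk = np.linalg.norm(mics - src, axis=1, keepdims=True)  # distances d_{n,k}
    d_k = np.linalg.norm(src)                                 # distance d_k to the origin
    return (src - mics) / (C * d_nk) - src / (C * d_k)

# Example usage with three microphones and one source position.
mics = np.array([[1.0, 0.0, 0.2], [0.0, 1.5, -0.1], [2.0, 2.0, 1.0]])
src = np.array([0.8, -0.4, 0.3])
print(source_jacobian_3d(mics, src))  # shape (3, 3): rows = mics 2..4, cols = x, y, z
```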


About this article


Cite this article

Su, D., Vidal-Calleja, T. & Miro, J.V. Asynchronous microphone arrays calibration and sound source tracking. Auton Robot 44, 183–204 (2020). https://doi.org/10.1007/s10514-019-09885-w

