Detecting, locating and recognising human touches in social robots with contact microphones

https://doi.org/10.1016/j.engappai.2020.103670

Highlights

  • Acoustic sensors provide high coverage at a low cost.

  • Acoustic sensors are not affected by usual ambient sounds propagated through the air.

  • The system achieves high accuracy with less complexity than state-of-the-art approaches.

  • The suitability of acoustic sensors is demonstrated on two real social robotic platforms.

Abstract

There are many situations in our daily life where touch gestures take place during natural human–human interaction: meeting people (shaking hands), personal relationships (caresses), moments of celebration or sadness (hugs), etc. Considering that robots are expected to form part of our daily life in the future, they should be endowed with the capacity to recognise these touch gestures and the part of their body that has been touched, since the meaning of a gesture may differ depending on its location. Therefore, this work presents a learning system for both purposes: detecting and recognising the type of touch gesture (stroke, tickle, tap and slap) and localising it. The interpretation of the meaning of the gesture is outside the scope of this paper.

Different technologies have been applied to perceive touch in social robots, commonly using a large number of sensors. Instead, our approach uses three contact microphones installed inside some parts of the robot. The audio signals generated when the user touches the robot are sensed by the contact microphones and processed using machine learning techniques. We acquired information from sensors installed in two social robots, Maggie and Mini (both developed by the RoboticsLab at the Carlos III University of Madrid), and a real-time version of the whole system has been deployed in the robot Mini. The system allows the robot to sense whether it has been touched, to recognise the kind of touch gesture, and to estimate its approximate location. The main advantage of using contact microphones as touch sensors is that a single microphone can “cover” a whole solid part of the robot. Besides, the sensors are unaffected by ambient noises such as the human voice, TV, or music. Nevertheless, using several contact microphones means that a single touch gesture may be detected by all of them, and each may recognise a different gesture at the same time. The results show that the system is robust against this phenomenon. Moreover, the accuracy obtained for both robots is about 86%.

Introduction

During human–human interaction there are many communication channels; some of them are verbal and others non-verbal, such as facial expressions, body gestures and, in many cultures, physical interaction, that is, touch gestures (Gallace and Spence, 2010). In fact, in some situations, humans try to communicate, or even emphasise, important social messages using these non-verbal communication channels (e.g. tapping someone on the back to get their attention, comforting someone by giving them a hug, or a father caressing his crying son’s face) (Hertenstein et al., 2006, Hertenstein et al., 2009). All these touch gestures are easily recognisable by everyone (depending on their culture), since complex languages have incorporated these touch interactions (Wilhelm et al., 2001). On the other hand, it is important to note that, depending on the part of the body that has been touched, gestures may have different meanings.

Due to the ability of touch gestures to communicate or reinforce other communication channels, some studies have explored how this kind of interaction can be used to improve Human-Robot Interaction (HRI) (Schmid et al., 2007, Altun and MacLean, 2015). In fact, considering that social robots are designed to interact with people and are expected to follow social norms during HRI, it seems logical to think that they should be able to perceive and recognise different touch gestures in order to behave appropriately (Kim et al., 2010, Silvera-Tawil et al., 2011, Altun and MacLean, 2015, Jung et al., 2015). As an example, Paro, the seal robot designed to interact with elders with cognitive impairment, has been shown to improve their mood when they hold and stroke the robot as if it were a real animal (Sabanovic et al., 2013, Sharkey and Wood, 2014).

With respect to sensing technologies, social robots are commonly endowed with basic tactile sensors (e.g. capacitive, force or temperature sensors) to detect physical contact (throughout the text this will be called ‘touch activity detection’). Some studies have aimed to recognise ‘touch gestures’ (contacts made on a surface with a certain communicative intention) using these devices (Argall and Billard, 2010). Nevertheless, most of these proposals require substantial hardware deployments oriented towards equipping the robot with a large number of sensors (Stiehl et al., 2005). Moreover, dealing with such a large number of sensors increases the chance of false positives, leading to low recognition rates.

The objective of the work presented in this paper is to develop and implement a system that learns to recognise and localise touch gestures made on a social robot. Unlike the approaches described above, this system uses a small number of sensors. The present work builds on the preliminary results presented in Alonso-Martín et al. (2017). Our contribution explores a novel application of a sensing technology, piezoelectric pickups, to Human-Robot Touch Interaction. These devices perceive the sound vibrations that are generated when a user touches the robot’s surface. The perturbations originated by the contact propagate through the rigid parts of the robot (its shell and inner structure). One of the main advantages of these devices is that they are not affected by usual ambient sounds propagated through the air, such as the human voice, TV, music, etc.

The system is able to classify both contact location and touch gestures, working as follows: when one or more sensors perceive an interaction, the system starts to process their signals separately, extracting from each of them a group of features belonging to the time and frequency audio domains, such as signal-to-noise ratio, pitch or duration. When the system detects that the contact has ended, it computes the average values of the features over this timespan. These values are then grouped into labelled instances that represent the contact performed. A dataset composed of a set of these instances is later used as training data for further classification of the gesture through machine learning techniques.
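The paper does not give this step at code level; the following is only a minimal sketch of the per-microphone feature averaging, assuming frame-based processing, a crude energy-based SNR estimate and a naive autocorrelation pitch estimate. The function names and the reduced feature set are illustrative, not the authors' implementation.

```python
import numpy as np

def frame_features(frame, sample_rate, noise_rms=1e-4):
    """Illustrative frame-level features from the time/frequency domains."""
    rms = np.sqrt(np.mean(frame ** 2))
    snr_db = 20.0 * np.log10(max(rms, 1e-12) / noise_rms)       # crude energy-based SNR
    ac = np.correlate(frame, frame, mode="full")[len(frame):]   # autocorrelation, lags 1..N-1
    pitch_hz = sample_rate / (int(np.argmax(ac)) + 1)           # naive pitch estimate
    return {"snr_db": float(snr_db), "pitch_hz": float(pitch_hz)}

def build_instance(frames, sample_rate, location_label, gesture_label):
    """Average the frame features over the whole contact and attach the labels."""
    feats = [frame_features(f, sample_rate) for f in frames]
    instance = {k: float(np.mean([f[k] for f in feats])) for k in feats[0]}
    instance["duration_s"] = len(frames) * len(frames[0]) / sample_rate
    instance["location"] = location_label
    instance["gesture"] = gesture_label
    return instance
```

Each labelled instance produced this way would then be appended to the training dataset used by the classifiers.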

The main contributions of this paper are the ability to recognise touch gestures on the different parts of the robot’s surface and to identify the zone in which the touch gesture takes place. Both the type of gesture and its localisation are fundamental to correctly interpret the communicative message of the user. This paper focuses on the learning process for recognising the type of gesture and its localisation; the interpretation of its possible meaning is outside the scope of this paper.

Moreover, our system follows a modular design that is able to adapt to different robotic platforms. In order to prove this, the data acquisition part of the proposed system has been implemented on two robotic platforms, the social robots Maggie (Gonzalez-Pacheco et al., 2011) and Mini (Salichs San Jose et al., 2016), both developed by the RoboticsLab at the Carlos III University of Madrid. Both robots have a similar physical structure: a head, a body and two arms at their sides. Finally, the real-time version of the whole system has been integrated in the robot Mini.

As will be explained later, three piezoelectric pickups (or contact microphones) are integrated inside the rigid parts of each robot’s shell: in the head and inside each arm. As already explained, this type of touch sensor perceives the sound vibrations generated when the robot’s surface is touched, and these perturbations propagate through the rigid parts of the robot. Therefore, we have different microphones sensing simultaneously in connected rigid bodies (head and arms). This is important because extending touch recognition to multiple sensing devices means that touch gestures can be recognised by several sensors at the same time. This fact generates a problem, considering how the nature of the sound signal affects both the localisation and the identification of gestures. On the one hand, regarding localisation, sound propagates over the rigid parts of the robot (hard shells and inner structure) and different microphones may sense it at the same time. This behaviour complicates the task of locating where the user touched the robot. On the other hand, concerning touch gesture recognition, the propagation of the sound signal changes as it moves through the irregular surface and inner structure of the robot (note that our robots are composed of different connected rigid parts). As a result, each microphone perceives the signal in a different way depending on the distance that separates it from the signal source. For instance, touching the arm of the robot not only introduces the vibrations corresponding to the touch itself, but also the extra movement caused by the force exerted on that part.

Therefore, it is possible that each sensor recognises a different touch gesture at the same time. In this work, we studied all these effects and our results show that our system is robust against these phenomena.
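The article does not specify at this point how the per-microphone outputs are reconciled. As a purely illustrative sketch (the fusion rule, sensor names, energies and confidences below are assumptions, not the authors' method), one could attribute the location to the microphone that senses the strongest vibration and resolve disagreeing gesture labels with a weighted vote:

```python
# Hypothetical per-microphone observations: signal energy plus the classifier's
# gesture prediction and confidence for one contact event.
observations = {
    "head":      {"energy": 0.12, "gesture": "tap",    "confidence": 0.55},
    "left_arm":  {"energy": 0.91, "gesture": "stroke", "confidence": 0.80},
    "right_arm": {"energy": 0.20, "gesture": "tap",    "confidence": 0.40},
}

# Locate the touch at the microphone that sensed the strongest vibration,
# assuming the energy decays as the signal propagates through the shell.
location = max(observations, key=lambda mic: observations[mic]["energy"])

# Resolve disagreeing gesture labels with an energy-weighted confidence vote.
votes = {}
for obs in observations.values():
    weight = obs["energy"] * obs["confidence"]
    votes[obs["gesture"]] = votes.get(obs["gesture"], 0.0) + weight
gesture = max(votes, key=votes.get)

print(location, gesture)   # e.g. "left_arm stroke"
```

Any such rule is only one of several reasonable ways to combine sensors; the experiments reported later evaluate how well the system copes with these simultaneous detections in practice.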

The rest of the paper is structured as follows. First, Section 2 reviews the literature related to touch interaction in social robotics and describes different systems that integrate contact microphones to locate and classify touch gestures on different kinds of surfaces. Section 3 details the hardware involved in this proposal: two robotic platforms with contact microphones beneath their respective shells. Then, Section 4 explains the structure and the phases of the proposed system, primarily at a software level. Section 5 describes the experimental part of this paper: the set of gestures selected, the data collection process, the way the dataset is constructed and the evaluation metrics used. Next, Section 6 and Section 7 present and discuss the results obtained in the classification process, respectively. Finally, Section 8 presents the conclusions drawn from these results.

Section snippets

Related work

In recent years, touch interaction has attracted attention and has been introduced in many different areas, such as domotics, electronics, or robotics. Nowadays, devices such as smartphones, wearables, laptops or tactile fingerprint sensors implement technology related to touch interaction (Murray-Smith et al., 2008, Robinson et al., 2011, Wang et al., 2019). In this section, we analyse the literature, while paying special attention to two trends: first, we review the solutions related to Touch

Hardware components of the system

The system proposed in this work integrates piezoelectric pickups in two robotic platforms as touch sensors. This section provides some insights into the hardware platforms, detailing how the sensors have been installed on the rigid surfaces of the platforms.

Software components of the system

This section describes the different phases of our touch gesture recognition system, namely: Sound Acquisition (SA), Feature Extraction and Touch Activity Detection (FED), Instance Creation (IC), and finally, Touch Classification and Localisation (TCL). Fig. 2 shows a summarised view of the operation flow, where each contact microphone implies a pipeline of audio analysis consisting of the SA and FED phases. The touches performed on the robot’s shell are received by all of the contact
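For orientation only, a minimal per-microphone skeleton of those four phases might be organised as follows. The activity threshold, the reduced feature set and the method names are assumptions made for illustration, not the authors' implementation; the classifier can be any object exposing a predict method.

```python
import numpy as np

class TouchPipeline:
    """Illustrative per-microphone pipeline mirroring the phases named in the text:
    Sound Acquisition (SA), Feature Extraction and touch activity Detection (FED),
    Instance Creation (IC), and Touch Classification and Localisation (TCL)."""

    def __init__(self, classifier, activity_threshold_db=10.0, noise_rms=1e-4):
        self.classifier = classifier
        self.activity_threshold_db = activity_threshold_db  # assumed FED threshold
        self.noise_rms = noise_rms
        self._active = []            # frame features of the ongoing contact

    def _features(self, frame):
        """FED: frame-level features (only an energy-based SNR shown here)."""
        rms = np.sqrt(np.mean(frame ** 2))
        return {"snr_db": float(20.0 * np.log10(max(rms, 1e-12) / self.noise_rms))}

    def process_frame(self, frame):
        """SA feeds audio frames here; returns a prediction when a contact ends,
        and None while no complete contact is available."""
        feats = self._features(frame)
        if feats["snr_db"] > self.activity_threshold_db:
            self._active.append(feats)            # touch still ongoing
            return None
        if not self._active:
            return None
        # IC: average the features over the whole contact, then classify (TCL).
        instance = {k: float(np.mean([f[k] for f in self._active]))
                    for k in self._active[0]}
        self._active = []
        return self.classifier.predict(instance)
```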

Methods

This section describes the set of gestures selected in this work. From the users’ interactions with the robots, a series of datasets is built for these gestures. This information was then used to train the classifiers and, through a series of metrics, to assess their performance.

Results

This section presents the results for the two possible classification approaches in both robots. We have reported the results in four tables. Although only the top ten classifiers are included, more than a hundred classifiers have been trained and evaluated with various configurations. Each classifier was trained more than once, specifically 10 times, using different configuration parameters to find the best scores. While we are aware that finding the best configuration parameters
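The tooling used for this search is not restated in this snippet. As a rough, hypothetical analogue of trying several configurations of a classifier under cross-validated F-score, one could write something like the following in scikit-learn; the classifier family, the parameter grid and the fold count are assumptions, not the configurations evaluated in the paper.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Search one classifier family over a small, assumed parameter grid,
# scoring each configuration with a cross-validated macro F-score.
param_grid = {"n_estimators": [50, 100, 200], "max_depth": [None, 5, 10]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="f1_macro",
    cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=0),
)
# X: feature matrix of contact instances, y: gesture (or gesture+location) labels.
# search.fit(X, y)
# print(search.best_params_, search.best_score_)
```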

Discussion

The results described in the previous section show that this proposal provides high accuracy with a less complex deployment (i.e. a low number of sensors and a relatively simple installation) when compared to other systems presented in the literature (see Section 2). The first learning approach is the multi-class technique, which obtained a high F-score for both robots. For the social robot Maggie, the global F-score was 0.858; for Mini, competitive results were also obtained: F-score
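For reference, the F-score quoted here is the harmonic mean of precision and recall (presumably aggregated over the gesture classes):

$$F_1 = 2\,\frac{\mathrm{precision}\cdot\mathrm{recall}}{\mathrm{precision}+\mathrm{recall}}$$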

Conclusions

Social robots are expected to form part of our daily life, so endowing the robot with the ability to recognise different kinds of touch gestures performed by the user poses an important challenge in HRI. Consequently, this paper proposes a new touch-sensing technology in the field of social robotics that is able to detect, recognise and localise touch gestures in a whole robot shell using a few sensors. In contrast, although traditional touch sensing technologies (e.g. resistive, capacitive or

CRediT authorship contribution statement

Juan José Gamboa-Montero: Conceptualization, Methodology, Software, Validation, Investigation, Formal analysis, Writing - original draft, Writing - review & editing. Fernando Alonso-Martín: Conceptualization, Methodology, Software, Validation, Investigation, Data curation, Writing - original draft, Writing - review & editing. José Carlos Castillo: Conceptualization, Methodology, Investigation, Writing - original draft, Writing - review & editing, Supervision. María Malfaz: Writing - original

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The research leading to these results has received funding from the projects “Robots Sociales para Estimulación Física, Cognitiva y Afectiva de Mayores (ROSES)”, funded by the Spanish Ministerio de Ciencia, Innovación y Universidades, and RoboCity2030-DIH-CM, Madrid Robotics Digital Innovation Hub, S2018/NMT-4331, funded by “Programas de Actividades I+D en la Comunidad de Madrid” and co-funded by Structural Funds of the EU.

References (56)

  • Alonso-Martin, F., et al. A multimodal emotion detection system during human-robot interaction. Sensors (2013)
  • Alonso-Martin, F., et al. Speaker identification using three signal voice domains during human-robot interaction
  • Appice, A., et al. Stepwise induction of multi-target model trees
  • Breiman, L. The little bootstrap and other methods for dimensionality selection in regression: X-fixed prediction error. J. Amer. Statist. Assoc. (1992)
  • Carlson, A.
  • Cho, H.-s., et al. Accurate distance estimation between things: A self-correcting approach. Open J. Internet Things (2015)
  • Cochran, W., Cooley, J., Favin, D., Helms, H., Kaenel, R., Lang, W., Maling, G., Nelson, D., Rader, C., Welch, P., ...
  • Cooney, M.D., et al. Recognizing affection for a touch-based interaction with a humanoid robot
  • Firouzi, K., et al. Lamb wave multitouch ultrasonic touchscreen. IEEE Trans. Ultrason. Ferroelectr. Freq. Control (2016)
  • Godbole, S., et al. Discriminative methods for multi-labeled classification
  • Gonzalez-Pacheco, V., et al. Maggie: A social robot as a gaming platform. Int. J. Soc. Robot. (2011)
  • Goris, K., et al. Mechanical design of the Huggable robot Probo. Int. J. Humanoid Robot. (2011)
  • Hastie, T., et al.
  • Hertenstein, M., et al. The Communication of Emotion Via Touch, Vol. 9 (2009)
  • Hertenstein, M.J., et al. The communicative functions of touch in humans, nonhuman primates, and rats: a review and synthesis of the empirical research. Genet. Soc. Gen. Psychol. Monogr. (2006)
  • Holmes, G., et al. WEKA: a machine learning workbench
  • Hughes, D., et al. Recognizing social touch gestures using recurrent and convolutional neural networks
  • Jung, M.M., et al. Touch challenge ’15: Recognizing social touch gestures

Juan José Gamboa-Montero works as a researcher and teacher at the Systems Engineering and Automation Department, Carlos III University of Madrid. He received the B.Sc. in Industrial Electronics and Automation Engineering and the M.Sc. in Robotics and Automation from the Carlos III University of Madrid in 2015 and 2016, respectively. Currently, he is a Ph.D. student in the Electrical Engineering, Electronics and Automation program. His research field is primarily related to Human-Robot Touch Interaction, but it also includes Social Robotics, Advanced Machine Learning, Signal Processing and Cognitive Robotics. He was involved in the RobAlz project, born from the collaboration between the Spanish Alzheimer Foundation and the RoboticsLab. The aim was to develop robots that assist in the daily tasks of caregivers to Alzheimer’s sufferers.

Fernando Alonso-Martín received the B.Sc. degree in computer engineering from the Carlos III University of Madrid, Madrid, Spain, in 2007, and the M.Sc. and Ph.D. degrees in Robotics and Automation from the Carlos III University in 2009 and 2014, respectively. He is a member of the RoboticsLab research group and Assistant Professor at the Department of Systems Engineering and Automation of the Carlos III University. His research fields are personal robots, human-robot interaction, dialogues, and other related issues. He is currently involved mainly in two projects. RobAlz is a project born from the collaboration between the Spanish Alzheimer Foundation and the RoboticsLab; its aim is to develop robots that assist in the daily tasks of caregivers to Alzheimer’s sufferers. MOnarCH (Multi-Robot Cognitive Systems Operating in Hospitals) is a European Union FP7 project that aims at the development of a network of heterogeneous robots and sensors in the pediatric area of an oncological hospital.

José Carlos Castillo Montoya obtained his Ph.D. in Advanced Computing Systems from the University of Castilla-La Mancha, Spain, in 2012. He received the B.Sc. and M.Sc. degrees in Computer Engineering from the University of Castilla-La Mancha in 2006 and 2009, respectively. As a post-doc researcher, he worked at the Institute for Systems and Robotics (ISR), Instituto Superior Técnico (IST) of Lisbon, where he was involved in the development of networked robot systems, robotics, computer vision and intelligent control systems. Since 2013 he has been combining teaching and research at the RoboticsLab at the Carlos III University of Madrid in the fields of HRI and perception.

María Malfaz is an assistant professor of the Systems Engineering and Automation Department at the Carlos III University of Madrid. She received her degree in Physics from La Laguna University in 1999. In October 2001, she received an M.Sc. in Control Systems from Imperial College London. She received the Ph.D. degree in Industrial Engineering in 2007; her thesis was ‘Decision Making System for Autonomous Social Agents Based on Emotions and Self-learning’. Her research area follows the line carried out in her thesis and, more recently, she has been working on multimodal human–robot interaction systems. She belongs to several international scientific associations: the IEEE-RAS (IEEE Robotics and Automation Society), the IFAC (International Federation of Automatic Control), and the CEA (Comité Español de Automática). Moreover, she is also a member of research networks such as euRobotics (European Robotics Coordination Action) and HispaRob (Plataforma Tecnológica Española de Robótica).

Miguel A. Salichs received the electrical engineering and Ph.D. degrees from the Polytechnic University of Madrid, Madrid, Spain. He is currently a Full Professor of the Systems Engineering and Automation Department at the Carlos III University, Madrid, Spain. His research interests include autonomous social robots, human–robot interaction, bio-inspired robotics, and cognitive architectures. Prof. Salichs was a member of the Policy Committee of the International Federation of Automatic Control (IFAC), chairman of the Technical Committee on Intelligent Autonomous Vehicles of the IFAC, head of the Spanish National Research Program on Industrial Design and Production, President of the Spanish Society on Automation and Control (CEA), and the Spanish representative to the European Robotics Research Network (EURON).
