2020
Christian Werner, Nikos Kardaris, Petros Koutras, Athanasia Zlatintsi, Petros Maragos, Jürgen M Bauer, Klaus Hauer
Improving gesture-based interaction between an assistive bathing robot and older adults via user training on the gestural commands
Journal Article
Archives of Gerontology and Geriatrics, 87, pp. 103996, 2020, ISSN: 0167-4943.

@article{WERNER2020103996,
  title = {Improving gesture-based interaction between an assistive bathing robot and older adults via user training on the gestural commands},
  author = {Christian Werner and Nikos Kardaris and Petros Koutras and Athanasia Zlatintsi and Petros Maragos and Jürgen M Bauer and Klaus Hauer},
  url = {http://www.sciencedirect.com/science/article/pii/S0167494319302390},
  doi = {10.1016/j.archger.2019.103996},
  issn = {0167-4943},
  year = {2020},
  date = {2020-03-01},
  journal = {Archives of Gerontology and Geriatrics},
  volume = {87},
  pages = {103996},
  abstract = {Background: Gesture-based human-robot interaction (HRI) depends on the technical performance of the robot-integrated gesture recognition system (GRS) and on the gestural performance of the robot user, which has been shown to be rather low in older adults. Training of gestural commands (GCs) might improve the quality of older users’ input for gesture-based HRI, which in turn may lead to an overall improved HRI. Objective: To evaluate the effects of a user training on gesture-based HRI between an assistive bathing robot and potential elderly robot users. Methods: Twenty-five older adults with bathing disability participated in this quasi-experimental, single-group, pre-/post-test study and underwent a specific user training (10−15 min) on GCs for HRI with the assistive bathing robot. Outcomes measured before and after training included participants’ gestural performance assessed by a scoring method of an established test of gesture production (TULIA) and sensor-based gestural performance (SGP) scores derived from the GRS-recorded data, and robot’s command recognition rate (CRR). Results: Gestural performance (TULIA = +57.1 ± 56.2 %, SGP scores = +41.1 ± 74.4 %) and CRR (+31.9 ± 51.2 %) significantly improved over training (p < .001). Improvements in gestural performance and CRR were highly associated with each other (r = 0.80–0.81, p < .001). Participants with lower initial gestural performance and higher gerontechnology anxiety benefited most from the training. Conclusions: Our study highlights that training in gesture-based HRI with an assistive bathing robot is highly beneficial for the quality of older users’ GCs, leading to higher CRRs of the robot-integrated GRS, and thus to an overall improved HRI.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
2018
Mehdi Khamassi, George Velentzas, Theodore Tsitsimis, Costas Tzafestas
Robot fast adaptation to changes in human engagement during simulated dynamic social interaction with active exploration in parameterized reinforcement learning
Journal Article
IEEE Transactions on Cognitive and Developmental Systems, 10, pp. 881–893, 2018.

@article{BFB99,
  title = {Robot fast adaptation to changes in human engagement during simulated dynamic social interaction with active exploration in parameterized reinforcement learning},
  author = {Mehdi Khamassi and George Velentzas and Theodore Tsitsimis and Costas Tzafestas},
  url = {http://robotics.ntua.gr/wp-content/publications/Khamassi_TCDS2018.pdf},
  doi = {10.1109/TCDS.2018.2843122},
  year = {2018},
  date = {2018-01-01},
  journal = {IEEE Transactions on Cognitive and Developmental Systems},
  volume = {10},
  pages = {881--893},
  publisher = {IEEE},
  abstract = {Dynamic uncontrolled human-robot interactions (HRI) require robots to be able to adapt to changes in the human’s behavior and intentions. Among relevant signals, non-verbal cues such as the human’s gaze can provide the robot with important information about the human’s current engagement in the task, and whether the robot should continue its current behavior or not. However, robot reinforcement learning (RL) abilities to adapt to these non-verbal cues are still underdeveloped. Here we propose an active exploration algorithm for RL during HRI where the reward function is the weighted sum of the human’s current engagement and variations of this engagement. We use a parameterized action space where a meta-learning algorithm is applied to simultaneously tune the exploration in discrete action space (e.g. moving an object) and in the space of continuous characteristics of movement (e.g. velocity, direction, strength, expressivity). We first show that this algorithm reaches state-of-the-art performance in the non-stationary multi-armed bandit paradigm. We then apply it to a simulated HRI task, and show that it outperforms continuous parameterized RL with either passive or active exploration based on different existing methods. We finally test the performance in a more realistic test of the same HRI task, where a practical approach is followed to estimate human engagement through visual cues of the head pose. The algorithm can detect and adapt to perturbations in human engagement with different durations. Altogether, these results suggest a novel efficient and robust framework for robot learning during dynamic HRI scenarios.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
2017
Theodore Tsitsimis, George Velentzas, Mehdi Khamassi, Costas Tzafestas
Online adaptation to human engagement perturbations in simulated human-robot interaction using hybrid reinforcement learning
Conference
Proc. of the 25th European Signal Processing Conference - Workshop: "MultiLearn 2017 - Multimodal processing, modeling and learning for human-computer/robot interaction applications", Kos, Greece, 2017.

@conference{BFB98,
  title = {Online adaptation to human engagement perturbations in simulated human-robot interaction using hybrid reinforcement learning},
  author = {Theodore Tsitsimis and George Velentzas and Mehdi Khamassi and Costas Tzafestas},
  editor = {Michael Aron},
  url = {http://robotics.ntua.gr/wp-content/uploads/sites/2/MultiLearn2017.pdf},
  year = {2017},
  date = {2017-08-01},
  booktitle = {Proc. of the 25th European Signal Processing Conference - Workshop: "MultiLearn 2017 - Multimodal processing, modeling and learning for human-computer/robot interaction applications", Kos, Greece, 2017.},
  address = {Kos, Greece},
  abstract = {Dynamic uncontrolled human-robot interaction requires robots to be able to adapt to changes in the human’s behavior and intentions. Among relevant signals, non-verbal cues such as the human’s gaze can provide the robot with important information about the human’s current engagement in the task, and whether the robot should continue its current behavior or not. In a previous work [1] we proposed an active exploration algorithm for reinforcement learning where the reward function is the weighted sum of the human’s current engagement and variations of this engagement (so that a low but increasing engagement is rewarding). We used a structured (parameterized) continuous action space where a meta-learning algorithm is applied to simultaneously tune the exploration in discrete and continuous action space, enabling the robot to learn which discrete action is expected by the human (e.g. moving an object) and with which velocity of movement. In this paper we want to show the performance of the algorithm to a simulated human-robot interaction task where a practical approach is followed to estimate human engagement through visual cues of the head pose. We then measure the adaptation of the algorithm to engagement perturbations simulated as changes in the optimal action parameter and we quantify its performance for variations in perturbation duration and measurement noise.},
  keywords = {},
  pubstate = {published},
  tppubtype = {conference}
}
Active exploration and parameterized reinforcement learning applied to a simulated human-robot interaction task
Conference
Proc. IEEE Int'l Conference on Robotic Computing, Taichung, Taiwan, 2017.

@conference{BFB95,
  title = {Active exploration and parameterized reinforcement learning applied to a simulated human-robot interaction task},
  url = {http://robotics.ntua.gr/wp-content/publications/khamassi_IRC2017.pdf},
  doi = {10.1109/IRC.2017.33},
  year = {2017},
  date = {2017-04-01},
  booktitle = {Proc. IEEE Int'l Conference on Robotic Computing},
  address = {Taichung, Taiwan},
  abstract = {Online model-free reinforcement learning (RL) methods with continuous actions are playing a prominent role when dealing with real-world applications such as Robotics. However, when confronted to non-stationary environments, these methods crucially rely on an exploration-exploitation trade-off which is rarely dynamically and automatically adjusted to changes in the environment. Here we propose an active exploration algorithm for RL in structured (parameterized) continuous action space. This framework deals with a set of discrete actions, each of which is parameterized with continuous variables. Discrete exploration is controlled through a Boltzmann softmax function with an inverse temperature β parameter. In parallel, a Gaussian exploration is applied to the continuous action parameters. We apply a meta-learning algorithm based on the comparison between variations of short-term and long-term reward running averages to simultaneously tune β and the width of the Gaussian distribution from which continuous action parameters are drawn. We first show that this algorithm reaches state-of-the-art performance in the non-stationary multi-armed bandit paradigm, while also being generalizable to continuous actions and multi-step tasks. We then apply it to a simulated human-robot interaction task, and show that it outperforms continuous parameterized RL both without active exploration and with active exploration based on uncertainty variations measured by a Kalman-Q-learning algorithm.},
  keywords = {},
  pubstate = {published},
  tppubtype = {conference}
}
Mehdi Khamassi, George Velentzas, Theodore Tsitsimis, Costas Tzafestas
Active exploration and parameterized reinforcement learning applied to a simulated human-robot interaction task
Conference
Proceedings - 2017 1st IEEE International Conference on Robotic Computing, IRC 2017, 2017, ISBN: 9781509067237.

@conference{337,
  title = {Active exploration and parameterized reinforcement learning applied to a simulated human-robot interaction task},
  author = {Mehdi Khamassi and George Velentzas and Theodore Tsitsimis and Costas Tzafestas},
  url = {http://ieeexplore.ieee.org/document/7926511/%0Ahttp://ieeexplore.ieee.org/ielx7/7925476/7926477/07926511.pdf?tp=&arnumber=7926511&isnumber=7926477},
  doi = {10.1109/IRC.2017.33},
  isbn = {9781509067237},
  year = {2017},
  date = {2017-01-01},
  booktitle = {Proceedings - 2017 1st IEEE International Conference on Robotic Computing, IRC 2017},
  pages = {28--35},
  abstract = {\textcopyright{} 2017 IEEE. Online model-free reinforcement learning (RL) methods with continuous actions are playing a prominent role when dealing with real-world applications such as Robotics. However, when confronted to non-stationary environments, these methods crucially rely on an exploration-exploitation trade-off which is rarely dynamically and automatically adjusted to changes in the environment. Here we propose an active exploration algorithm for RL in structured (parameterized) continuous action space. This framework deals with a set of discrete actions, each of which is parameterized with continuous variables. Discrete exploration is controlled through a Boltzmann softmax function with an inverse temperature $\beta$ parameter. In parallel, a Gaussian exploration is applied to the continuous action parameters. We apply a meta-learning algorithm based on the comparison between variations of short-term and long-term reward running averages to simultaneously tune $\beta$ and the width of the Gaussian distribution from which continuous action parameters are drawn. We first show that this algorithm reaches state-of-the-art performance in the non-stationary multi-armed bandit paradigm, while also being generalizable to continuous actions and multi-step tasks. We then apply it to a simulated human-robot interaction task, and show that it outperforms continuous parameterized RL both without active exploration and with active exploration based on uncertainty variations measured by a Kalman-Q-learning algorithm.},
  keywords = {},
  pubstate = {published},
  tppubtype = {conference}
}
Copyright Notice:
Some material presented is available for download to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
The work already published by the IEEE is under its copyright. Personal use of such material is permitted. However, permission to reprint/republish the material for advertising or promotional purposes, or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of the work in other works must be obtained from the IEEE.