2018
Flokas, Lampros; Maragos, Petros: "Online Wideband Spectrum Sensing Using Sparsity". Journal Article, IEEE Journal of Selected Topics in Signal Processing, 12(1), pp. 35–44, 2018. ISSN: 1932-4553.
Keywords: cognitive radio, LMS, signal processing, sparse representations.
Abstract: Wideband spectrum sensing is an essential part of cognitive radio systems. Exact spectrum estimation is usually inefficient as it requires sampling rates at or above the Nyquist rate. Using prior information on the structure of the signal could allow near-exact reconstruction at much lower sampling rates. Sparsity of the sampled signal in the frequency domain is one of the popular priors studied for cognitive radio applications. Reconstruction of signals under sparsity assumptions has been studied rigorously by researchers in the field of Compressed Sensing (CS). CS algorithms that operate on batches of samples are known to be robust but can be computationally costly, making them unsuitable for cheap low-power cognitive radio devices that require spectrum sensing in real time. On the other hand, online algorithms that are based on variations of the Least Mean Squares (LMS) algorithm have very simple updates, so they are computationally efficient and can easily adapt in real time to changes of the underlying spectrum. In this paper we present two variations of the LMS algorithm that enforce sparsity in the estimated spectrum given an upper bound on the number of non-zero coefficients. Assuming that the number of non-zero elements in the spectrum is known, we show that under certain conditions the hard threshold operation can only reduce the error of our estimation. We also show that we can estimate the number of non-zero elements of the spectrum at each iteration based on our online estimations. Finally, we numerically compare our algorithm with other online sparsity-inducing algorithms in the literature.
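The abstract above describes LMS variants that enforce sparsity by keeping only a bounded number of non-zero coefficients. Below is a minimal sketch of that general idea in Python/NumPy: a standard LMS update followed by a hard-threshold projection onto the K largest-magnitude coefficients. The step size, the system-identification setting and the fixed K are assumptions made for the example; this does not reproduce the paper's exact algorithms or its online estimation of K.

```python
import numpy as np

def hard_threshold(w, K):
    """Keep the K largest-magnitude coefficients, zero the rest."""
    w_out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-K:]
    w_out[idx] = w[idx]
    return w_out

def sparse_lms(x, d, n_taps, K, mu=0.02):
    """LMS adaptive filter with a per-iteration hard-threshold step.

    x: input signal, d: desired signal, n_taps: filter length,
    K: assumed number of non-zero coefficients, mu: step size.
    """
    w = np.zeros(n_taps)
    for n in range(n_taps - 1, len(x)):
        u = x[n - n_taps + 1:n + 1][::-1]   # most recent samples first
        e = d[n] - w @ u                    # instantaneous error
        w = w + mu * e * u                  # standard LMS update
        w = hard_threshold(w, K)            # sparsity-enforcing projection
    return w

# Tiny usage example: identify a sparse 32-tap system from noisy observations.
rng = np.random.default_rng(0)
w_true = np.zeros(32); w_true[[3, 11, 27]] = [1.0, -0.7, 0.4]
x = rng.standard_normal(5000)
d = np.convolve(x, w_true)[:len(x)] + 0.01 * rng.standard_normal(len(x))
w_hat = sparse_lms(x, d, n_taps=32, K=3)
print(np.flatnonzero(w_hat))                # typically recovers indices 3, 11, 27
```

The hard-threshold step acts as a projection onto the set of K-sparse vectors, which is the operation the abstract argues can only reduce the estimation error under suitable conditions.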
Khamassi, Mehdi; Velentzas, George; Tsitsimis, Theodore; Tzafestas, Costas: "Robot fast adaptation to changes in human engagement during simulated dynamic social interaction with active exploration in parameterized reinforcement learning". Journal Article, IEEE Transactions on Cognitive and Developmental Systems, 10, pp. 881–893, IEEE, 2018.
Keywords: Active exploration, Bandits, Human-robot interaction, Meta-learning, reinforcement learning.
Abstract: Dynamic uncontrolled human-robot interactions (HRI) require robots to be able to adapt to changes in the human's behavior and intentions. Among relevant signals, non-verbal cues such as the human's gaze can provide the robot with important information about the human's current engagement in the task, and whether the robot should continue its current behavior or not. However, robot reinforcement learning (RL) abilities to adapt to these non-verbal cues are still underdeveloped. Here we propose an active exploration algorithm for RL during HRI where the reward function is the weighted sum of the human's current engagement and variations of this engagement. We use a parameterized action space where a meta-learning algorithm is applied to simultaneously tune the exploration in the discrete action space (e.g. moving an object) and in the space of continuous characteristics of movement (e.g. velocity, direction, strength, expressivity). We first show that this algorithm reaches state-of-the-art performance in the non-stationary multi-armed bandit paradigm. We then apply it to a simulated HRI task, and show that it outperforms continuous parameterized RL with either passive or active exploration based on different existing methods. We finally evaluate the algorithm in a more realistic version of the same HRI task, where a practical approach is followed to estimate human engagement through visual cues of the head pose. The algorithm can detect and adapt to perturbations in human engagement with different durations. Altogether, these results suggest a novel efficient and robust framework for robot learning during dynamic HRI scenarios.
2017
Maragos, P: "Dynamical systems on weighted lattices: general theory". Journal Article, Math. Control Signals Syst., 29(1), 2017.
Rodomagoulakis, I; Katsamanis, A; Potamianos, G; Giannoulis, P; Tsiami, A; Maragos, P: "Room-localized spoken command recognition in multi-room, multi-microphone environments". Journal Article, Computer Speech & Language, 46, pp. 419–443, 2017.
Chalvatzaki, G; Papageorgiou, X S; Tzafestas, C S: "Towards a user-adaptive context-aware robotic walker with a pathological gait assessment system: First experimental study". Conference, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5037–5042, 2017.
Abstract: When designing a user-friendly Mobility Assistive Device (MAD) for mobility-constrained people, it is important to take into account the diverse spectrum of disabilities, which results in completely different needs to be covered by the MAD for each specific user. An intelligent adaptive behavior is necessary. In this work we present experimental results, using an in-house developed methodology for assessing the gait of users with different mobility status while interacting with a robotic MAD. We use data from a laser scanner mounted on the MAD to track the legs using Particle Filters and Probabilistic Data Association (PDA-PF). The legs' states are fed to an HMM-based pathological gait cycle recognition system to compute in real time the gait parameters that are crucial for the mobility status characterization of the user. We aim to show that a gait assessment system would be an important feedback for an intelligent MAD. Thus, we use this system to compare the gaits of the subjects using two different control settings of the MAD, and we experimentally validate the ability of our system to recognize the impact of the control designs on the users' walking performance. The results demonstrate that a generic control scheme does not meet every patient's needs, and therefore an Adaptive Context-Aware MAD (ACA MAD), which can understand the specific needs of the user, is important for enhancing the human-robot physical interaction.
Chalvatzaki, G; Papageorgiou, X S; Tzafestas, C S; Maragos, P: "HMM-based Pathological Gait Analyzer for a User-Adaptive Intelligent Robotic Walker". Conference, Proc. 25th European Signal Processing Conference (EUSIPCO-17) Workshop: "MultiLearn 2017 - Multimodal processing, modeling and learning for human-computer/robot interaction applications", Kos, Greece, 2017.
Abstract: During the past decade, robotic technology has evolved considerably towards the development of cognitive robotic systems that enable close interaction with humans. Application fields of such novel robotic technologies are now widespread, covering a variety of human assistance functionalities, aiming in particular at supporting the needs of human beings experiencing various forms of mobility or cognitive impairments. Mobility impairments are prevalent in the elderly population and constitute one of the main causes related to difficulties in performing Activities of Daily Living (ADLs) and consequent reduction of quality of life. This paper reports current research work related to the development of a pathological gait analyzer for an intelligent robotic rollator, aiming to be an input to a user-adaptive and context-aware robot control architecture. Specifically, we present a novel method for human leg tracking using Particle Filters and Probabilistic Data Association from a laser scanner, constituting a non-wearable and non-intrusive approach. The tracked positions and velocities of the user's legs are the observables of an HMM, which provides the gait phases of the detected gait cycles. Given those phases, we compute specific gait parameters, which are used for medical diagnosis. The results of our pathological gait analyzer are validated using ground truth data from a GAITRite system. The results presented in this paper demonstrate that the proposed human data analysis scheme has the potential to provide the necessary methodological (modeling, inference, and learning) framework for a cognitive behavior-based robot control system.
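As a rough illustration of the pipeline sketched in this and the related gait-analysis abstracts (tracked leg positions and velocities as observations of an HMM whose hidden states are gait phases), here is a hedged Python sketch using the hmmlearn package. The 8-dimensional feature layout, the four-phase state count, the unsupervised fit and the placeholder data are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM   # assumed dependency, not the paper's own code

# Observations: per-frame [left_x, left_y, left_vx, left_vy, right_x, right_y, right_vx, right_vy]
# as produced by a leg tracker (random placeholders here stand in for tracker output).
rng = np.random.default_rng(1)
obs = rng.standard_normal((600, 8))

# One hidden state per gait phase; four phases is an illustrative choice
# (e.g. left/right double support, left/right single support), not the paper's exact model.
hmm = GaussianHMM(n_components=4, covariance_type="diag", n_iter=50, random_state=0)
hmm.fit(obs)                  # unsupervised fit; labeled gait cycles could also be used
phases = hmm.predict(obs)     # Viterbi-decoded gait phase per laser frame

# Gait parameters such as stride or double-support duration follow from the
# run lengths of consecutive frames decoded into each phase; here we only
# report how many frames landed in each phase.
print(np.bincount(phases))
```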
Dometios, A C; Papageorgiou, X S; Arvanitakis, A; Tzafestas, C S; Maragos, P: "Real-time End-effector Motion Behavior Planning Approach Using On-line Point-cloud Data Towards a User Adaptive Assistive Bath Robot". Conference, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5031–5036, Vancouver, Canada, 2017.
Abstract: Elderly people have particular needs in performing bathing activities, since these tasks require body flexibility. Our aim is to build an assistive robotic bath system in order to increase the independence and safety of this procedure. Towards this end, the expertise of professional carers for bathing sequences and appropriate motions has to be adopted, in order to achieve natural, physical human-robot interaction. In this paper, a real-time end-effector motion planning method for an assistive bath robot, using on-line point-cloud information, is proposed. The visual feedback obtained from a Kinect depth sensor is employed to adapt suitable washing paths to the motion and deformable surface of the user's body part. We make use of a navigation-function-based controller, with guaranteed global uniform asymptotic stability, and bijective transformations for the adaptation of the paths. Experiments were conducted with a rigid rectangular object for validation purposes, while a female subject took part in the experiment in order to evaluate and demonstrate the basic concepts of the proposed methodology.
Dometios, A C; Tsiami, A; Arvanitakis, A; Giannoulis, P; Papageorgiou, X S; Tzafestas, C S; Maragos, P: "Integrated Speech-based Perception System for User Adaptive Robot Motion Planning in Assistive Bath Scenarios". Conference, Proc. of the 25th European Signal Processing Conference - Workshop: "MultiLearn 2017 - Multimodal processing, modeling and learning for human-computer/robot interaction applications", Kos, Greece, 2017.
Abstract: Elderly people have augmented needs in performing bathing activities, since these tasks require body flexibility. Our aim is to build an assistive robotic bath system in order to increase the independence and safety of this procedure. Towards this end, the expertise of professional carers for bathing sequences and appropriate motions has to be adopted, in order to achieve natural, physical human-robot interaction. The integration of communication and verbal interaction between the user and the robot during the bathing tasks is a key issue for such a challenging assistive robotic application. In this paper, we tackle this challenge by developing a novel integrated real-time speech-based perception system, which will provide the necessary assistance to frail senior citizens. This system can be suitable for installation and use in a conventional home or hospital bathroom space. We employ a speech recognition system with sub-modules to achieve smooth and robust human-system communication, and a low-cost depth camera for end-effector motion planning. With a variety of spoken commands, the system can be adapted to the user's needs and preferences. The washing commands instructed by the user are executed by a robotic manipulator, demonstrating the progress of each task. The smooth integration of all subsystems is accomplished by a modular and hierarchical decision architecture organized as a Behavior Tree. The system was experimentally tested by successful execution of scenarios from different users with different preferences.
Velentzas, G; Tzafestas, C; Khamassi, M: "Bio-inspired meta-learning for active exploration during non-stationary multi-armed bandit tasks". Conference, Proc. IEEE Intelligent Systems Conference, London, UK, 2017.
Keywords: Active exploration, Bandits, decision making, Kalman filter, Meta-learning, Multi-armed bandit, reinforcement learning.
Abstract: Fast adaptation to changes in the environment requires agents (animals, robots and simulated artefacts) to be able to dynamically tune an exploration-exploitation trade-off during learning. This trade-off usually determines a fixed proportion of exploitative choices (i.e. choice of the action that subjectively appears as best at a given moment) relative to exploratory choices (i.e. testing other actions that now appear worst but may turn out promising later). Rather than using a fixed proportion, non-stationary multi-armed bandit methods in the field of machine learning have proven that principles such as exploring actions that have not been tested for a long time can lead to performance closer to optimal, with bounded regret. In parallel, research on active exploration in the fields of robot learning and computational neuroscience of learning and decision-making has proposed alternative solutions, such as transiently increasing exploration in response to drops in average performance, or attributing exploration bonuses specifically to actions associated with high uncertainty in order to gain information when choosing them. In this work, we compare different methods from machine learning, computational neuroscience and robot learning on a set of non-stationary stochastic multi-armed bandit tasks: abrupt shifts; best bandit becomes worst one and vice versa; multiple shifting frequencies. We find that different methods are appropriate in different scenarios. We propose a new hybrid method combining bio-inspired meta-learning, Kalman filter and exploration bonuses and show that it outperforms other methods in these scenarios.
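A minimal sketch of the kind of meta-learning rule referred to in these bandit papers: the inverse temperature of a Boltzmann softmax is nudged up or down by comparing short-term and long-term running averages of reward, so exploration transiently increases when average performance drops. The learning rates, clipping range and additive update form are illustrative assumptions, not the authors' exact equations, and the Kalman-filter and exploration-bonus components of the hybrid method are omitted.

```python
import numpy as np

rng = np.random.default_rng(2)
n_arms, n_steps = 5, 4000
p_true = rng.uniform(0.1, 0.9, n_arms)          # unknown payoff probabilities

Q = np.zeros(n_arms)                            # action-value estimates
beta = 1.0                                      # softmax inverse temperature (exploration knob)
r_short, r_long = 0.0, 0.0                      # short- and long-term reward running averages
alpha_q, a_short, a_long, eta = 0.1, 0.3, 0.01, 0.5   # placeholder constants

for t in range(n_steps):
    if t == n_steps // 2:                       # abrupt non-stationarity: best arm changes
        p_true = rng.permutation(p_true)
    logits = beta * Q - np.max(beta * Q)
    probs = np.exp(logits) / np.exp(logits).sum()
    a = rng.choice(n_arms, p=probs)             # Boltzmann (softmax) action selection
    r = float(rng.random() < p_true[a])
    Q[a] += alpha_q * (r - Q[a])                # incremental value update
    r_short += a_short * (r - r_short)          # fast reward average
    r_long += a_long * (r - r_long)             # slow reward average
    # Meta-learning: if recent reward falls below the long-term trend, lower beta
    # (more exploration); if it rises above, raise beta (more exploitation).
    beta = float(np.clip(beta + eta * (r_short - r_long), 0.1, 20.0))

print(f"final beta = {beta:.2f}, estimated best arm = {int(np.argmax(Q))}")
```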
Chalvatzaki, G; Papageorgiou, X S; Tzafestas, C S; Maragos, P: "Estimating double support in pathological gaits using an HMM-based analyzer for an intelligent robotic walker". Conference, IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), pp. 101–106, 2017.
Abstract: For a robotic walker designed to assist mobility-constrained people, it is important to take into account the different spectrum of pathological walking patterns, which result in completely different needs to be covered for each specific user. For a deployable intelligent assistant robot it is necessary to have a precise gait analysis system, providing real-time monitoring of the user and extracting specific gait parameters, which are associated with the rehabilitation progress and the risk of fall. In this paper, we present a completely non-invasive framework for the on-line analysis of pathological human gait and the recognition of specific gait phases and events. The performance of this gait analysis system is assessed, in particular, as related to the estimation of double support phases, which are typically difficult to extract reliably, especially when applying non-wearable and non-intrusive technologies. Furthermore, the duration of double support phases constitutes an important gait parameter and a critical indicator in pathological gait patterns. The performance of this framework is assessed using real data collected from an ensemble of elderly persons with different pathologies. The estimated gait parameters are experimentally validated using ground truth data provided by a Motion Capture system. The results obtained and presented in this paper demonstrate that the proposed human data analysis (modeling, learning and inference) framework has the potential to support efficient detection and classification of specific walking pathologies, as needed to empower a cognitive robotic mobility-assistance device with user-adaptive and context-aware functionalities.
Tsitsimis, Theodore; Velentzas, George; Khamassi, Mehdi; Tzafestas, Costas: "Online adaptation to human engagement perturbations in simulated human-robot interaction using hybrid reinforcement learning". Conference, Proc. of the 25th European Signal Processing Conference - Workshop: "MultiLearn 2017 - Multimodal processing, modeling and learning for human-computer/robot interaction applications", Kos, Greece, 2017 (Michael Aron, Ed.).
Keywords: adaptation, Human-robot interaction, reinforcement learning.
Abstract: Dynamic uncontrolled human-robot interaction requires robots to be able to adapt to changes in the human's behavior and intentions. Among relevant signals, non-verbal cues such as the human's gaze can provide the robot with important information about the human's current engagement in the task, and whether the robot should continue its current behavior or not. In a previous work [1] we proposed an active exploration algorithm for reinforcement learning where the reward function is the weighted sum of the human's current engagement and variations of this engagement (so that a low but increasing engagement is rewarding). We used a structured (parameterized) continuous action space where a meta-learning algorithm is applied to simultaneously tune the exploration in discrete and continuous action space, enabling the robot to learn which discrete action is expected by the human (e.g. moving an object) and with which velocity of movement. In this paper we show the performance of the algorithm on a simulated human-robot interaction task where a practical approach is followed to estimate human engagement through visual cues of the head pose. We then measure the adaptation of the algorithm to engagement perturbations, simulated as changes in the optimal action parameter, and we quantify its performance for variations in perturbation duration and measurement noise.
Velentzas, G; Tzafestas, C; Khamassi, M: "Bridging Computational Neuroscience and Machine Learning on Non-Stationary Multi-Armed Bandits". Miscellaneous, bioRxiv, 117598, 2017.
Keywords: Active exploration, Bandits, Decision-making, Kalman filter, Meta-learning, Multi-armed bandit, reinforcement learning.
Abstract: Fast adaptation to changes in the environment requires both natural and artificial agents to be able to dynamically tune an exploration-exploitation trade-off during learning. This trade-off usually determines a fixed proportion of exploitative choices (i.e. choice of the action that subjectively appears as best at a given moment) relative to exploratory choices (i.e. testing other actions that now appear worst but may turn out promising later). The problem of finding an efficient exploration-exploitation trade-off has been well studied both in the Machine Learning and Computational Neuroscience fields. Rather than using a fixed proportion, non-stationary multi-armed bandit methods in the former have proven that principles such as exploring actions that have not been tested for a long time can lead to performance closer to optimal, with bounded regret. In parallel, research in the latter has investigated solutions such as progressively increasing exploitation in response to improvements of performance, transiently increasing exploration in response to drops in average performance, or attributing exploration bonuses specifically to actions associated with high uncertainty in order to gain information when performing these actions. In this work, we first try to bridge some of these different methods from the two research fields by rewriting their decision process with a common formalism. We then show numerical simulations of a hybrid algorithm combining bio-inspired meta-learning, Kalman filter and exploration bonuses compared to several state-of-the-art alternatives on a set of non-stationary stochastic multi-armed bandit tasks. While we find that different methods are appropriate in different scenarios, the hybrid algorithm displays a good combination of advantages from different methods and outperforms these methods in the studied scenarios.
Chalvatzaki, G; Papageorgiou, X S; Tzafestas, C S; Maragos, P: "Comparative experimental validation of human gait tracking algorithms for an intelligent robotic rollator". Conference, IEEE International Conference on Robotics and Automation (ICRA), pp. 6026–6031, 2017.
Abstract: Tracking human gait accurately and robustly constitutes a key factor for a smart robotic walker aiming to provide assistance to patients with different mobility impairments. A context-aware assistive robot needs constant knowledge of the user's kinematic state to assess the gait status and adjust its movement properly to provide optimal assistance. In this work, we experimentally validate the performance of two gait tracking algorithms using data from elderly patients; the first algorithm employs a Kalman Filter (KF), while the second one tracks the user's legs separately using two probabilistically associated Particle Filters (PFs). The algorithms are compared according to their accuracy and robustness, using data captured from real experiments, where elderly subjects performed specific walking scenarios with physical assistance from a prototype Robotic Rollator. Sensorial data were provided by a laser rangefinder mounted on the robotic platform recording the movement of the user's legs. The accuracy of the proposed algorithms is analysed and validated with respect to ground truth data provided by a Motion Capture system tracking a set of visual markers worn by the patients. The robustness of the two tracking algorithms is also analysed comparatively in a complex maneuvering scenario. Current experimental findings demonstrate the superior performance of the PFs in difficult cases of occlusions and clutter, where KF tracking often fails.
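For reference, the Kalman-filter branch of such leg trackers typically amounts to a constant-velocity filter over 2D leg-center detections extracted from the laser scan. The sketch below shows that standard predict/update cycle; the sampling period, noise covariances and synthetic detections are placeholders, and the data-association and particle-filter parts compared in the paper are not shown.

```python
import numpy as np

dt = 0.1                                   # scan period in seconds (placeholder)
# State [x, y, vx, vy]; constant-velocity motion model, position-only measurements.
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)
Q = 0.05 * np.eye(4)                       # process noise (tuning placeholder)
R = 0.01 * np.eye(2)                       # laser measurement noise (tuning placeholder)

def kf_step(x, P, z):
    """One predict/update cycle for a single tracked leg."""
    x = F @ x                              # predict state
    P = F @ P @ F.T + Q                    # predict covariance
    y = z - H @ x                          # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P

# Usage: feed successive leg-center detections (here a synthetic straight walk).
rng = np.random.default_rng(5)
x, P = np.zeros(4), np.eye(4)
for t in range(50):
    z = np.array([0.5 * dt * t, 0.02 * t]) + 0.05 * rng.standard_normal(2)
    x, P = kf_step(x, P, z)
print("estimated position/velocity:", np.round(x, 3))
```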
Khamassi, Mehdi; Velentzas, George; Tsitsimis, Theodore; Tzafestas, Costas: "Active exploration and parameterized reinforcement learning applied to a simulated human-robot interaction task". Conference, Proc. IEEE Int'l Conference on Robotic Computing, Taichung, Taiwan, 2017.
Keywords: Active exploration, Human-robot interaction, Meta-learning, Multi-Arm bandit, reinforcement learning.
Abstract: Online model-free reinforcement learning (RL) methods with continuous actions are playing a prominent role when dealing with real-world applications such as robotics. However, when confronted with non-stationary environments, these methods crucially rely on an exploration-exploitation trade-off which is rarely dynamically and automatically adjusted to changes in the environment. Here we propose an active exploration algorithm for RL in a structured (parameterized) continuous action space. This framework deals with a set of discrete actions, each of which is parameterized with continuous variables. Discrete exploration is controlled through a Boltzmann softmax function with an inverse temperature β parameter. In parallel, a Gaussian exploration is applied to the continuous action parameters. We apply a meta-learning algorithm based on the comparison between variations of short-term and long-term reward running averages to simultaneously tune β and the width of the Gaussian distribution from which continuous action parameters are drawn. We first show that this algorithm reaches state-of-the-art performance in the non-stationary multi-armed bandit paradigm, while also being generalizable to continuous actions and multi-step tasks. We then apply it to a simulated human-robot interaction task, and show that it outperforms continuous parameterized RL both without active exploration and with active exploration based on uncertainty variations measured by a Kalman-Q-learning algorithm.
Zlatintsi, A; Rodomagoulakis, I; Pitsikalis, V; Koutras, P; Kardaris, N; Papageorgiou, X; Tzafestas, C; Maragos, P: "Social Human-Robot Interaction for the Elderly: Two Real-life Use Cases". Conference, ACM/IEEE International Conference on Human-Robot Interaction (HRI), Vienna, Austria, 2017.
Keywords: Assistive HRI, Multimodal audio-gestural recognition.
Abstract: We explore new aspects of assistive living via smart social human-robot interaction (HRI) involving automatic recognition of multimodal gestures and speech in a natural interface, providing social features in HRI. We discuss a whole framework of resources, including datasets and tools, briefly shown in two real-life use cases for elderly subjects: a multimodal interface of an assistive robotic rollator and an assistive bathing robot. We discuss these domain-specific tasks and open-source tools, which can be used to build such HRI systems, as well as indicative results. Sharing such resources can open new perspectives in assistive HRI.
Katsamanis, Athanasios; Pitsikalis, Vassilis; Theodorakis, Stavros; Maragos, Petros: "Multimodal Gesture Recognition". Book Chapter, in The Handbook of Multimodal-Multisensor Interfaces: Foundations, User Modeling, and Common Modality Combinations - Volume 1, pp. 449–487, Association for Computing Machinery and Morgan & Claypool, 2017. ISBN: 9781970001679.
Charisopoulos, Vasileios; Maragos, Petros: "Morphological perceptrons: Geometry and training algorithms". Conference, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10225 LNCS, pp. 3–15, 2017. ISSN: 1611-3349.
Keywords: Machine learning, Mathematical morphology, Neural networks, Optimization, Tropical geometry.
Abstract: Neural networks have traditionally relied on mostly linear models, such as the multiply-accumulate architecture of a linear perceptron that remains the dominant paradigm of neuronal computation. However, from a biological standpoint, neuron activity may as well involve inherently nonlinear and competitive operations. Mathematical morphology and minimax algebra provide the necessary background in the study of neural networks made up of these kinds of nonlinear units. This paper deals with such a model, called the morphological perceptron. We study some of its geometrical properties and introduce a training algorithm for binary classification. We point out the relationship between morphological classifiers and the recent field of tropical geometry, which enables us to obtain a precise bound on the number of linear regions of the maxout unit, a popular choice for deep neural networks introduced recently. Finally, we present some relevant numerical results.
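A tiny sketch of the nonlinear unit discussed in the abstract: where a linear perceptron computes a weighted sum, a morphological (max-plus, i.e. tropical) perceptron computes a maximum of sums. The two-class decision rule and the hand-picked weights below are illustrative assumptions; the paper's geometric analysis and training algorithm are not reproduced.

```python
import numpy as np

def linear_neuron(x, w, b):
    """Classical multiply-accumulate unit: w.x + b."""
    return w @ x + b

def maxplus_neuron(x, w):
    """Morphological (max-plus / tropical) unit: max_i (x_i + w_i)."""
    return np.max(x + w)

def morphological_classifier(x, w_pos, w_neg):
    """Toy two-class rule comparing the max-plus responses of two weight vectors.
    A real morphological perceptron is trained (see the paper); these weights
    are hand-picked for illustration only."""
    return 1 if maxplus_neuron(x, w_pos) >= maxplus_neuron(x, w_neg) else 0

x = np.array([0.2, 1.5, -0.3])
w_pos = np.array([0.0, 0.5, -1.0])
w_neg = np.array([1.0, -2.0, 0.0])
print(linear_neuron(x, np.ones(3), 0.0), morphological_classifier(x, w_pos, w_neg))
```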
Khamassi, Mehdi; Velentzas, George; Tsitsimis, Theodore; Tzafestas, Costas: "Active exploration and parameterized reinforcement learning applied to a simulated human-robot interaction task". Conference, Proceedings - 2017 1st IEEE International Conference on Robotic Computing (IRC 2017), pp. 28–35, 2017. ISBN: 9781509067237.
Keywords: Active exploration, Human-robot interaction, Meta-learning, Multi-Arm bandit, reinforcement learning.
Abstract: Online model-free reinforcement learning (RL) methods with continuous actions are playing a prominent role when dealing with real-world applications such as robotics. However, when confronted with non-stationary environments, these methods crucially rely on an exploration-exploitation trade-off which is rarely dynamically and automatically adjusted to changes in the environment. Here we propose an active exploration algorithm for RL in a structured (parameterized) continuous action space. This framework deals with a set of discrete actions, each of which is parameterized with continuous variables. Discrete exploration is controlled through a Boltzmann softmax function with an inverse temperature β parameter. In parallel, a Gaussian exploration is applied to the continuous action parameters. We apply a meta-learning algorithm based on the comparison between variations of short-term and long-term reward running averages to simultaneously tune β and the width of the Gaussian distribution from which continuous action parameters are drawn. We first show that this algorithm reaches state-of-the-art performance in the non-stationary multi-armed bandit paradigm, while also being generalizable to continuous actions and multi-step tasks. We then apply it to a simulated human-robot interaction task, and show that it outperforms continuous parameterized RL both without active exploration and with active exploration based on uncertainty variations measured by a Kalman-Q-learning algorithm.
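To make the parameterized action space concrete, here is a hedged sketch of a single decision step as described in the abstract: a Boltzmann softmax with inverse temperature β selects the discrete action, and the continuous parameter of that action is drawn from a Gaussian whose width σ controls continuous exploration. How β and σ are tuned by the meta-learning rule is not repeated here (a bandit-style sketch of that rule appears earlier in this list); all names and values below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def select_parameterized_action(Q, theta, beta, sigma):
    """One decision in a parameterized action space.

    Q     : value estimate per discrete action (e.g. wave, point, move object)
    theta : current preferred continuous parameter per action (e.g. movement speed)
    beta  : inverse temperature of the discrete Boltzmann softmax
    sigma : std. dev. of the Gaussian exploration on the continuous parameter

    beta and sigma are assumed to be adjusted online by a meta-learning rule
    comparing short- and long-term reward averages (not shown here).
    """
    logits = beta * (Q - Q.max())
    probs = np.exp(logits) / np.exp(logits).sum()   # discrete exploration
    a = rng.choice(len(Q), p=probs)
    param = rng.normal(theta[a], sigma)             # continuous exploration
    return a, param

Q = np.array([0.2, 0.8, 0.5])          # e.g. three discrete gestures
theta = np.array([0.3, 1.2, 0.7])      # e.g. preferred velocity per gesture
print(select_parameterized_action(Q, theta, beta=5.0, sigma=0.2))
```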
Sakaridis, Christos; Drakopoulos, Kimon; Maragos, Petros: "Theoretical Analysis of Active Contours on Graphs". Journal Article, SIAM J. Imaging Sciences, 2017. ISSN: 1936-4954.
Abstract: Active contour models based on partial differential equations have proved successful in image segmentation, yet the study of their geometric formulation on arbitrary geometric graphs is still at an early stage. In this paper, we introduce geometric approximations of gradient and curvature, which are used in the geodesic active contour model. We prove convergence in probability of our gradient approximation to the true gradient value and derive an asymptotic upper bound for the error of this approximation for the class of random geometric graphs. Two different approaches for the approximation of curvature are presented and both are also proved to converge in probability in the case of random geometric graphs. We propose neighborhood-based filtering on graphs to improve the accuracy of the aforementioned approximations and define two variants of Gaussian smoothing on graphs which include normalization in order to adapt to graph non-uniformities. The performance of our active contour framework on graphs is demonstrated in the segmentation of regular images and geographical data defined on arbitrary graphs.
Bampis, Christos G; Maragos, Petros; Bovik, Alan C: "Graph-driven diffusion and random walk schemes for image segmentation". Journal Article, IEEE Transactions on Image Processing, 26(1), pp. 35–50, 2017. ISSN: 1057-7149.
Keywords: diffusion modeling, Graph clustering, Image segmentation, random walker, SIR epidemic propagation model.
Abstract: We propose graph-driven approaches to image segmentation by developing diffusion processes defined on arbitrary graphs. We formulate a solution to the image segmentation problem modeled as the result of infectious wavefronts propagating on an image-driven graph, where pixels correspond to nodes of an arbitrary graph. By relating the popular susceptible-infected-recovered epidemic propagation model to the Random Walker algorithm, we develop the normalized random walker and a lazy random walker variant. The underlying iterative solutions of these methods are derived as the result of infections transmitted on this arbitrary graph. The main idea is to incorporate a degree-aware term into the original Random Walker algorithm in order to account for the node centrality of every neighboring node and to weigh the contribution of every neighbor to the underlying diffusion process. Our lazy random walk variant models the tendency of patients or nodes to resist changes in their infection status. We also show how previous work can be naturally extended to take advantage of this degree-aware term, which enables the design of other novel methods. Through an extensive experimental analysis, we demonstrate the reliability of our approach, its small computational burden and the dimensionality reduction capabilities of graph-driven approaches. Without applying any regular grid constraint, the proposed graph clustering scheme allows us to consider pixel-level, node-level approaches, and multidimensional input data by naturally integrating the importance of each node to the final clustering or segmentation solution. A software release containing implementations of this paper and supplementary material can be found at: http://cvsp.cs.ntua.gr/research/GraphClustering/.
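For context, the baseline that the normalized and lazy variants described above build on is the seeded Random Walker, which reduces to solving a sparse Laplacian linear system for the unseeded nodes. The sketch below implements that baseline on an arbitrary graph with SciPy; the degree-aware and lazy modifications introduced in the paper are not reproduced, and the tiny chain graph is only a usage example.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def random_walker(W, seeds, seed_labels):
    """Baseline seeded Random Walker segmentation on an arbitrary graph.

    W: sparse symmetric affinity matrix; seeds: indices of labeled nodes;
    seed_labels: their binary labels. The paper's contributions (normalized and
    lazy variants with a degree-aware term) build on this baseline and are
    intentionally not reproduced in this sketch.
    """
    n = W.shape[0]
    d = np.asarray(W.sum(axis=1)).ravel()
    L = (sp.diags(d) - W).tocsr()                     # combinatorial graph Laplacian
    u = np.setdiff1d(np.arange(n), seeds)             # unseeded nodes
    Luu = L[u][:, u]
    Lus = L[u][:, seeds]
    xs = (np.asarray(seed_labels) == 1).astype(float)
    xu = spla.spsolve(Luu.tocsc(), -(Lus @ xs))       # first-arrival probabilities
    prob = np.zeros(n)
    prob[seeds], prob[u] = xs, xu
    return (prob > 0.5).astype(int), prob

# Usage: a 6-node chain with a weak link in the middle splits into two segments.
W = sp.lil_matrix((6, 6))
weights = [1.0, 1.0, 0.1, 1.0, 1.0]                   # edge (2,3) is a weak boundary
for i, w in enumerate(weights):
    W[i, i + 1] = W[i + 1, i] = w
labels, prob = random_walker(W.tocsr(), seeds=np.array([0, 5]), seed_labels=[0, 1])
print(labels, np.round(prob, 2))                      # expected split: [0 0 0 1 1 1]
```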
Filntisis, Panagiotis Paraskevas; Katsamanis, Athanasios; Tsiakoulis, Pirros; Maragos, Petros: "Video-realistic expressive audio-visual speech synthesis for the Greek language". Journal Article, Speech Communication, 95, pp. 137–152, 2017. ISSN: 0167-6393.
Keywords: adaptation, Audio-visual speech synthesis, Deep neural networks, Expressive, hidden Markov models, Interpolation.
Abstract: High quality expressive speech synthesis has been a long-standing goal towards natural human-computer interaction. Generating a talking head which is both realistic and expressive appears to be a considerable challenge, due to both the high complexity in the acoustic and visual streams and the large non-discrete number of emotional states we would like the talking head to be able to express. In order to cover all the desired emotions, a significant amount of data is required, which poses an additional time-consuming data collection challenge. In this paper we attempt to address the aforementioned problems in an audio-visual context. Towards this goal, we propose two deep neural network (DNN) architectures for Video-realistic Expressive Audio-Visual Text-To-Speech synthesis (EAVTTS) and evaluate them by comparing them directly both to traditional hidden Markov model (HMM) based EAVTTS, as well as a concatenative unit selection EAVTTS approach, both on the realism and the expressiveness of the generated talking head. Next, we investigate adaptation and interpolation techniques to address the problem of covering the large emotional space. We use HMM interpolation in order to generate different levels of intensity for an emotion, as well as investigate whether it is possible to generate speech with intermediate speaking styles between two emotions. In addition, we employ HMM adaptation to adapt an HMM-based system to another emotion using only a limited amount of adaptation data from the target emotion. We performed an extensive experimental evaluation on a medium sized audio-visual corpus covering three emotions, namely anger, sadness and happiness, as well as neutral reading style. Our results show that DNN-based models outperform HMMs and unit selection on both the realism and expressiveness of the generated talking heads, while in terms of adaptation we can successfully adapt an audio-visual HMM set trained on a neutral speaking style database to a target emotion. Finally, we show that HMM interpolation can indeed generate different levels of intensity for EAVTTS by interpolating an emotion with the neutral reading style, as well as in some cases, generate audio-visual speech with intermediate expressions between two emotions.
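The interpolation idea mentioned in the abstract can be illustrated, in very reduced form, as a convex combination of the Gaussian output parameters of two emotion-dependent models. The sketch below is a generic stand-in under that assumption; the actual systems interpolate full HMM state statistics (and also use DNNs and adaptation), which this does not reproduce.

```python
import numpy as np

def interpolate_gaussian_states(means_a, vars_a, means_b, vars_b, lam):
    """Linear interpolation of Gaussian output parameters of two emotion-dependent models.

    lam = 1.0 reproduces model A (e.g. 'angry'), lam = 0.0 model B (e.g. 'neutral');
    intermediate values give intermediate intensity. Generic illustration only.
    """
    means = lam * means_a + (1.0 - lam) * means_b
    variances = lam * vars_a + (1.0 - lam) * vars_b   # one simple choice; other schemes exist
    return means, variances

# Usage: blend two toy 3-dimensional state models at 40% 'angry' intensity.
m_angry, v_angry = np.array([1.0, 2.0, 0.5]), np.array([0.2, 0.3, 0.1])
m_neutral, v_neutral = np.array([0.0, 1.0, 0.0]), np.array([0.1, 0.1, 0.1])
print(interpolate_gaussian_states(m_angry, v_angry, m_neutral, v_neutral, lam=0.4))
```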
Papageorgiou, X S; Chalvatzaki, G; Dometios, A; Tzafestas, C S; Maragos, P: "Intelligent Assistive Robotic Systems for the Elderly: Two Real-life Use Cases". Conference, Int'l Conference on PErvasive Technologies Related to Assistive Environments (PETRA), pp. 360–365, ACM, Island of Rhodes, Greece, 2017. ISBN: 978-1-4503-5227-7.
Keywords: Assistive HRI, Bathing Assistance, Intelligent Assistive Robotic Systems, Mobility Assistance.
Abstract: Mobility impairments are prevalent in the elderly population and constitute one of the main causes related to difficulties in performing Activities of Daily Living (ADLs) and consequent reduction of quality of life. When designing a user-friendly assistive device for mobility-constrained people, it is important to take into account the diverse spectrum of disabilities, which results in completely different needs to be covered by the device for each specific user. An intelligent adaptive behavior is necessary for the deployment of such systems. Also, elderly people have particular needs in the specific case of performing bathing activities, since these tasks require body flexibility. We explore new aspects of assistive living via intelligent assistive robotic systems involving human-robot interaction in a natural interface. Our aim is to build assistive robotic systems in order to increase the independence and safety of these procedures. Towards this end, the expertise of professional carers for walking or bathing sequences and appropriate motions has to be adopted, in order to achieve natural, physical human-robot interaction. Our goal is to report current research work related to the development of two real-life use cases of intelligent robotic systems for the elderly, aiming to provide user-adaptive and context-aware assistance.
Karamanolakis, G; Iosif, E; Zlatintsi, A; Pikrakis, A; Potamianos, A: "Audio-based Distributional Semantic Models for Music Auto-tagging and Similarity Measurement". Conference, Proc. MultiLearn2017: Multimodal Processing, Modeling and Learning for Human-Computer/Robot Interaction Workshop, in conjunction with the European Signal Processing Conference, Kos, Greece, 2017.
Abstract: The recent development of Audio-based Distributional Semantic Models (ADSMs) enables the computation of audio and lexical vector representations in a joint acoustic-semantic space. In this work, these joint representations are applied to the problem of automatic tag generation. The predicted tags together with their corresponding acoustic representation are exploited for the construction of acoustic-semantic clip embeddings. The proposed algorithms are evaluated on the task of similarity measurement between music clips. Acoustic-semantic models are shown to outperform the state-of-the-art for this task and produce high-quality tags for audio/music clips.
Zlatintsi, A; Koutras, P; Evangelopoulos, G; Malandrakis, N; Efthymiou, N; Pastra, K; Potamianos, A; Maragos, P: "COGNIMUSE: a multimodal video database annotated with saliency, events, semantics and emotion with application to summarization". Journal Article, EURASIP Journal on Image and Video Processing, 54, pp. 1–24, 2017.
Keywords: Audio-visual events, Cross-media relations, Emotion annotation, Saliency, Video database, Video summarization.
Abstract: Research related to computational modeling for machine-based understanding requires ground truth data for training, content analysis, and evaluation. In this paper, we present a multimodal video database, namely COGNIMUSE, annotated with sensory and semantic saliency, events, cross-media semantics, and emotion. The purpose of this database is manifold; it can be used for training and evaluation of event detection and summarization algorithms, for classification and recognition of audio-visual and cross-media events, as well as for emotion tracking. In order to enable comparisons with other computational models, we propose state-of-the-art algorithms, specifically a unified energy-based audio-visual framework and a method for text saliency computation, for the detection of perceptually salient events from videos. Additionally, a movie summarization system for the automatic production of summaries is presented. Two kinds of evaluation were performed: an objective evaluation based on the saliency annotation of the database, and an extensive qualitative human evaluation of the automatically produced summaries, in which we investigated what composes high-quality movie summaries; both evaluations verified the appropriateness of the proposed methods. The annotation of the database and the code for the summarization system can be found at http://cognimuse.cs.ntua.gr/database.
2016
Kardaris, N; Rodomagoulakis, I; Pitsikalis, V; Arvanitakis, A; Maragos, P: "A Platform for Building New Human-Computer Interface Systems that Support Online Automatic Recognition of Audio-Gestural Commands". Conference, Proceedings of the 2016 ACM on Multimedia Conference, Amsterdam, The Netherlands, 2016.
Keywords: Accessibility, Auditory feedback, Gestural input, Human computer interaction (HCI).
Abstract: We introduce a new framework to build human-computer interfaces that provide online automatic audio-gestural command recognition. The overall system allows the construction of a multimodal interface that recognizes user input expressed naturally as audio commands and manual gestures, captured by sensors such as Kinect. It includes a component for acquiring multimodal user data which is used as input to a module responsible for training audio-gestural models. These models are employed by the automatic recognition component, which supports online recognition of audiovisual modalities. The overall framework is exemplified by a working system use case. This demonstrates the potential of the overall software platform, which can be employed to build other new human-computer interaction systems. Moreover, users may populate libraries of models and/or data that can be shared in the network. In this way users may reuse or extend existing systems.
Guler, A; Kardaris, N; Chandra, S; Pitsikalis, V; Werner, C; Hauer, K; Tzafestas, C; Maragos, P; Kokkinos, I: "Human Joint Angle Estimation and Gesture Recognition for Assistive Robotic Vision". Conference, Proc. of Workshop on Assistive Computer Vision and Robotics, European Conference on Computer Vision (ECCV 2016), Amsterdam, The Netherlands, 2016.
Keywords: assistive living, gesture recognition, joint angle, kinect sensor, pose estimation.
Abstract: We explore new directions for automatic human gesture recognition and human joint angle estimation as applied to human-robot interaction in the context of an actual challenging task of assistive living for real-life elderly subjects. Our contributions include state-of-the-art approaches for both low- and mid-level vision, as well as for higher-level action and gesture recognition. The first direction investigates a deep learning based framework for the challenging task of human joint angle estimation on noisy real-world RGB-D images. The second direction includes the employment of dense trajectory features for online processing of videos for automatic gesture recognition with real-time performance. Our approaches are evaluated both qualitatively and quantitatively on a newly acquired dataset that is constructed on a challenging real-life scenario on assistive living for elderly subjects.
Kardaris, N; Pitsikalis, V; Mavroudi, E; Maragos, P: "Introducing Temporal Order of Dominant Visual Word Sub-Sequences for Human Action Recognition". Conference, Proc. of IEEE Int'l Conf. on Image Processing (ICIP-2016), Phoenix, AZ, USA, 2016.
Keywords: bag-of-visual words, local sub-sequence alignment, temporal sequences, video representation, visual human action recognition.
Abstract: We present a novel video representation for human action recognition by considering temporal sequences of visual words. Based on state-of-the-art dense trajectories, we introduce temporal bundles of dominant, that is most frequent, visual words. These are employed to construct a complementary action representation of ordered dominant visual word sequences that additionally incorporates fine-grained temporal information. We exploit the introduced temporal information by applying local sub-sequence alignment that quantifies the similarity between sequences. This facilitates the fusion of our representation with the bag-of-visual-words (BoVW) representation. Our approach incorporates sequential temporal structure and results in a low-dimensional representation compared to the BoVW, while still yielding a decent result when combined with it. Experiments on the KTH, Hollywood2 and the challenging HMDB51 datasets show that the proposed framework is complementary to the BoVW representation, which discards temporal order.
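The local sub-sequence alignment step mentioned in the abstract can be illustrated with a Smith-Waterman-style dynamic program over two sequences of dominant visual-word IDs. The scoring constants and the toy sequences below are placeholders; the paper's exact alignment variant and its fusion with the BoVW representation are not reproduced.

```python
import numpy as np

def local_alignment_score(seq_a, seq_b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman-style local alignment between two symbol sequences,
    e.g. sequences of dominant visual-word IDs from two videos. Returns the
    best local alignment score; scoring constants are illustrative placeholders."""
    H = np.zeros((len(seq_a) + 1, len(seq_b) + 1))
    for i in range(1, len(seq_a) + 1):
        for j in range(1, len(seq_b) + 1):
            diag = H[i - 1, j - 1] + (match if seq_a[i - 1] == seq_b[j - 1] else mismatch)
            H[i, j] = max(0, diag, H[i - 1, j] + gap, H[i, j - 1] + gap)
    return H.max()

# Usage: two clips share the ordered sub-sequence [7, 3, 3, 9] of dominant words.
clip_a = [1, 7, 3, 3, 9, 2, 2]
clip_b = [7, 3, 3, 9, 5]
print(local_alignment_score(clip_a, clip_b))   # 4 matches * 2 = 8.0
```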
Panagiotaropoulou, G; Koutras, P; Katsamanis, A; Maragos, P; Zlatintsi, A; Protopapas, A; Karavasilis, E; Smyrnis, N: "fMRI-based Perceptual Validation of a Computational Model for Visual and Auditory Saliency in Videos". Conference, Proc. IEEE Int'l Conf. on Acoustics, Speech, and Signal Processing, Phoenix, AZ, USA, 2016.
Keywords: AM-FM sound analysis, auditory saliency, fMRI, General Linear Model, Spatio-temporal Gabor energy filterbank, Visual saliency.
Abstract: In this study, we make use of brain activation data to investigate the perceptual plausibility of a visual and an auditory model for visual and auditory saliency in video processing. These models have already been successfully employed in a number of applications. In addition, we experiment with parameters, modifications and suitable fusion schemes. As part of this work, fMRI data from complex video stimuli were collected, on which we base our analysis and results. The core part of the analysis involves the use of well-established methods for the manipulation of fMRI data and the examination of variability across brain responses of different individuals. Our results indicate a success in confirming the value of these saliency models in terms of perceptual plausibility.
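A common way to relate a model-predicted saliency time course to fMRI responses, consistent with the General Linear Model keyword above, is to convolve the regressor with a canonical haemodynamic response function and fit a per-voxel linear model. The sketch below follows that generic recipe with standard double-gamma HRF defaults and synthetic data; it is an assumption-laden illustration, not the study's actual analysis pipeline.

```python
import numpy as np
from scipy.stats import gamma

def canonical_hrf(tr, duration=30.0):
    """Double-gamma haemodynamic response function (common default shape,
    used here purely for illustration)."""
    t = np.arange(0, duration, tr)
    return gamma.pdf(t, 6) - 0.1 * gamma.pdf(t, 16)

def glm_beta(voxel_ts, saliency, tr=2.0):
    """Fit a single-regressor GLM: BOLD ~ beta * (saliency convolved with HRF) + intercept."""
    reg = np.convolve(saliency, canonical_hrf(tr))[:len(saliency)]
    X = np.column_stack([reg, np.ones_like(reg)])
    beta, *_ = np.linalg.lstsq(X, voxel_ts, rcond=None)
    return beta[0]

# Usage with synthetic data: a voxel whose signal does follow the saliency regressor.
rng = np.random.default_rng(4)
sal = rng.random(200)                           # model-predicted saliency per TR
bold = 1.5 * np.convolve(sal, canonical_hrf(2.0))[:200] + 0.3 * rng.standard_normal(200)
print(round(glm_beta(bold, sal), 2))            # should be close to 1.5
```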
Chalvatzaki, G; Papageorgiou, X S; Werner, C; Hauer, K; Tzafestas, C S; Maragos, P: "Experimental comparison of human gait tracking algorithms: Towards a context-aware mobility assistance robotic walker". Conference, Mediterranean Conference on Control and Automation (MED), pp. 719–724, 2016.
Abstract: Towards a mobility assistance robot for the elderly, it is essential to develop a robust and accurate gait tracking system. Various pathologies cause mobility impairments in the aged population, leading to different gait patterns and walking speeds. In this work, we present the experimental comparison of two user leg tracking systems of a robotic assistance walker, using data collected by a laser range sensor. The first one is a Kalman Filter tracking system, while the second one proposes the use of Particle Filters. The tracking systems provide the positions and velocities of the user's legs, which are used as observations in an HMM-based gait phase recognition system. The spatiotemporal results of the HMM framework are employed for computing parameters that characterize the human motion, which subsequently can be used to assess and distinguish between possible motion disabilities. For the experimental comparison, we are using real data collected from an ensemble of different elderly persons with a number of pathologies, and ground truth data from a GAITRite system. The results presented in this work demonstrate the applicability of the tracking systems in real test cases.
Dometios, A C; Papageorgiou, X S; Tzafestas, C S; Vartholomeos, P: "Towards ICT-supported Bath Robots: Control Architecture Description and Localized Perception of User for Robot Motion Planning". Conference, Mediterranean Conference on Control and Automation (MED), pp. 713–718, Athens, Greece, 2016.
Abstract: This paper describes the general control architecture and the basic implementation concepts of a bath service robotic system. The goal of this system is to support and enhance the mobility, manipulation and force exertion abilities of the elderly and assist them in successfully, safely and independently completing the entire sequence of showering and drying tasks, such as properly washing their back and lower limbs. This service robotic system is based on soft robotic arms which, together with advanced human-robot force/compliance control, will form the basis for a safe physical human-robot interaction that complies with the most up-to-date safety standards. In this paper an overview of the bath robotic system components is presented, and the basic modules that contribute to the overall control architecture of the system are described. Moreover, this paper proposes an algorithm that performs efficient processing of feedback data provided by a depth sensor. This algorithm supports local shape perception and geometric characterization of user body parts and will form the basis for further implementation of surface reconstruction and robot motion planning algorithms.
Papageorgiou, X S; Chalvatzaki, G; Lianos, K N; Werner, C; Hauer, K; Tzafestas, C S; Maragos, P Experimental validation of human pathological gait analysis for an assisted living intelligent robotic walker Conference C_BIOROB, 2016. Abstract | BibTeX | Tags: | Links: @conference{BIOROB2016, title = {Experimental validation of human pathological gait analysis for an assisted living intelligent robotic walker}, author = {X S Papageorgiou and G Chalvatzaki and K N Lianos and C Werner and K Hauer and C S Tzafestas and P Maragos}, booktitle = {C_BIOROB}, pages = {1086-1091}, } A robust and effective gait analysis functionality is an essential characteristic for a mobility assistance robot dealing with elderly persons. This functionality is crucial for dealing with mobility disabilities, which are widespread in this part of the population. In this work we present experimental validation of our in-house developed system. We use real data collected from an ensemble of different elderly persons with a number of pathologies, and we present a validation study against a GaitRite System. Our system, following the standard literature conventions, characterizes the human motion with a set of parameters which can subsequently be used to assess and distinguish between possible motion disabilities, using a laser range finder as its main sensor. The initial results presented in this work demonstrate the applicability of our framework in real test cases. Regarding such frameworks, a crucial technical question is the necessary complexity of the overall tracking system. To answer this question, we compare two approaches with different complexity levels. The first is a static rule-based system acting on filtered laser data, while the second system utilizes a Hidden Markov Model for gait cycle estimation and extraction of the gait parameters. The results demonstrate that the added complexity of the HMM system is necessary for improving the accuracy and efficacy of the system. |
Tsiami, A; Katsamanis, A; Maragos, P; Vatakis, A Towards a behaviorally-validated computational audiovisual saliency model Conference Proc. IEEE Int'l Conf. Acous., Speech, and Signal Processing, Shanghai, China, 2016. Abstract | BibTeX | Tags: | Links: @conference{7472197, title = {Towards a behaviorally-validated computational audiovisual saliency model}, author = {A Tsiami and A Katsamanis and P Maragos and A Vatakis}, booktitle = {Proc. IEEE Int'l Conf. Acous., Speech, and Signal Processing}, pages = {2847-2851}, } Computational saliency models aim at predicting, in a bottom-up fashion, where human attention is drawn in the presented (visual, auditory or audiovisual) scene and have been proven useful in applications like robotic navigation, image compression and movie summarization. Despite the fact that well-established auditory and visual saliency models have been validated in behavioral experiments, e.g., by means of eye-tracking, there is no established computational audiovisual saliency model validated in the same way. In this work, building on biologically-inspired models of visual and auditory saliency, we present a joint audiovisual saliency model and introduce the validation approach we follow to show that it is compatible with recent findings of psychology and neuroscience regarding multimodal integration and attention. In this direction, we initially focus on the "pip and pop" effect which has been observed in behavioral experiments and indicates that visual search in sequences of cluttered images can be significantly aided by properly timed non-spatial auditory signals presented alongside the target visual stimuli. |
Maragos, Petros; Pitsikalis, Vassilis; Katsamanis, Athanasios; Pavlakos, George; Theodorakis, Stavros On Shape Recognition and Language Inproceedings Breuß, Michael; Bruckstein, Alfred; Maragos, Petros; Wuhrer, Stefanie (Ed.): Perspectives in Shape Analysis, pp. 321–344, Springer International Publishing, Cham, 2016, ISBN: 978-3-319-24726-7. @inproceedings{10.1007/978-3-319-24726-7_15, title = {On Shape Recognition and Language}, author = {Petros Maragos and Vassilis Pitsikalis and Athanasios Katsamanis and George Pavlakos and Stavros Theodorakis}, editor = {Michael Breuß and Alfred Bruckstein and Petros Maragos and Stefanie Wuhrer}, booktitle = {Perspectives in Shape Analysis}, pages = {321--344}, publisher = {Springer International Publishing}, } Shapes convey meaning. Language is efficient in expressing and structuring meaning. The main thesis of this chapter is that by integrating shape with linguistic information, shape recognition can be improved in performance. It broadens the concept of shape to visual shapes that include both geometric and optical information and explores ways that additional linguistic information may help with shape recognition. Towards this goal, it briefly describes some shape categories which have the potential of better recognition via language, with emphasis on gestures and moving shapes of sign language, as well as on cross-modal relations between vision and language in videos. It also draws inspiration from psychological studies that explore connections between gestures and human languages. Afterwards, it focuses on the broad class of multimodal gestures that combine spatio-temporal visual shapes with audio information. In this area, an approach is reviewed that significantly improves multimodal gesture recognition by fusing 3D shape information from motion-position of gesturing hands/arms and spatio-temporal handshapes in color and depth visual channels with audio information in the form of acoustically recognized sequences of gesture words. |
Bampis, Christos G; Maragos, Petros; Bovik, Alan C Projective non-negative matrix factorization for unsupervised graph clustering Conference Proceedings - International Conference on Image Processing, ICIP, 2016-August , 2016, ISSN: 15224880. BibTeX | Tags: | Links: @conference{328, title = {Projective non-negative matrix factorization for unsupervised graph clustering}, author = { Christos G. Bampis and Petros Maragos and Alan C. Bovik}, booktitle = {Proceedings - International Conference on Image Processing, ICIP}, volume = {2016-August}, pages = {1255--1258}, } |
Panagiotaropoulou, Georgia; Koutras, Petros; Katsamanis, Athanasios; Maragos, Petros; Zlatintsi, Athanasia; Protopapas, Athanassios; Karavasilis, Efstratios; Smyrnis, Nikolaos FMRI-based perceptual validation of a computational model for visual and auditory saliency in videos Conference Proceedings - International Conference on Image Processing, ICIP, 2016-August, 2016, ISSN: 15224880. Abstract | BibTeX | Tags: AM-FM sound analysis, auditory saliency, FMRI, General Linear Model, Spatio-temporal Gabor energy filterbank, Visual saliency | Links: @conference{332, title = {FMRI-based perceptual validation of a computational model for visual and auditory saliency in videos}, author = {Georgia Panagiotaropoulou and Petros Koutras and Athanasios Katsamanis and Petros Maragos and Athanasia Zlatintsi and Athanassios Protopapas and Efstratios Karavasilis and Nikolaos Smyrnis}, booktitle = {Proceedings - International Conference on Image Processing, ICIP}, volume = {2016-August}, pages = {699--703}, } © 2016 IEEE. In this study, we make use of brain activation data to investigate the perceptual plausibility of a visual and an auditory model for visual and auditory saliency in video processing. These models have already been successfully employed in a number of applications. In addition, we experiment with parameters, modifications and suitable fusion schemes. As part of this work, fMRI data from complex video stimuli were collected, on which we base our analysis and results. The core part of the analysis involves the use of well-established methods for the manipulation of fMRI data and the examination of variability across brain responses of different individuals. Our results indicate a success in confirming the value of these saliency models in terms of perceptual plausibility. |
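For readers unfamiliar with the "General Linear Model" tag, the toy sketch below shows the basic GLM step assumed in such analyses: a voxel time series is regressed on a saliency regressor convolved with a canonical haemodynamic response function. The HRF shape, the TR and the synthetic data are illustrative assumptions, not the study's actual pipeline.

import numpy as np
from scipy.stats import gamma

TR = 2.0                                        # repetition time [s] (assumed)
t = np.arange(0, 30, TR)
# double-gamma canonical HRF (illustrative SPM-like parameters)
hrf = gamma.pdf(t, 6) - 0.35 * gamma.pdf(t, 16)
hrf /= hrf.sum()

def glm_beta(voxel_ts, saliency):
    """Least-squares beta of one voxel time series on an HRF-convolved
    saliency regressor plus an intercept."""
    reg = np.convolve(saliency, hrf)[:len(voxel_ts)]
    X = np.column_stack([reg, np.ones_like(reg)])       # design matrix
    beta, *_ = np.linalg.lstsq(X, voxel_ts, rcond=None)
    return beta[0]                                      # saliency effect size

rng = np.random.default_rng(0)
sal = rng.random(200)                                   # fake saliency curve
ts = 1.5 * np.convolve(sal, hrf)[:200] + rng.normal(0, 0.5, 200)
print(glm_beta(ts, sal))                                # close to 1.5 on this toy data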
Rodomagoulakis, I; Kardaris, N; Pitsikalis, V; Arvanitakis, A; Maragos, P A multimedia gesture dataset for human robot communication: Acquisition, tools and recognition results Conference Proceedings - International Conference on Image Processing, ICIP, 2016-August, 2016, ISSN: 15224880. Abstract | BibTeX | Tags: Audio commands, Human-robot communication, Multimedia gesture dataset, Visual gesture recognition | Links: @conference{334, title = {A multimedia gesture dataset for human robot communication: Acquisition, tools and recognition results}, author = {I. Rodomagoulakis and N. Kardaris and V. Pitsikalis and A. Arvanitakis and P. Maragos}, booktitle = {Proceedings - International Conference on Image Processing, ICIP}, volume = {2016-August}, pages = {3066--3070}, } Motivated by the recent advances in human-robot interaction, we present a new dataset, a suite of tools to handle it, and state-of-the-art work on visual gesture and audio command recognition. The dataset has been collected with an integrated annotation and acquisition web-interface that facilitates on-the-fly temporal ground-truth annotation for fast acquisition. The dataset includes gesture instances in which the subjects are not in strict setup positions, and contains multiple scenarios, not restricted to a single static configuration. We accompany it with a valuable suite of tools: a practical interface to acquire audio-visual data in the robot operating system, a state-of-the-art learning pipeline to train visual gesture and audio command models, and an online gesture recognition system. Finally, we include a rich evaluation of the dataset, providing insightful experimental recognition results. |
Karigiannis, John N; Tzafestas, Costas S Model-free learning on robot kinematic chains using a nested multi-agent topology Journal Article Journal of Experimental and Theoretical Artificial Intelligence, 28 (6), pp. 913–954, 2016, ISSN: 13623079. Abstract | BibTeX | Tags: multi-agent architecture; fuzzy reinforcement learning; developmental robotics; dexterous robotic manipulation | Links: @article{321, title = {Model-free learning on robot kinematic chains using a nested multi-agent topology}, author = {John N Karigiannis and Costas S Tzafestas}, journal = {Journal of Experimental and Theoretical Artificial Intelligence}, volume = {28}, number = {6}, pages = {913--954}, } This paper proposes a model-free learning scheme for the developmental acquisition of robot kinematic control and dexterous manipulation skills. The approach is based on a nested-hierarchical multi-agent architecture that intuitively encapsulates the topology of robot kinematic chains, where the activity of each independent degree-of-freedom (DOF) is finally mapped onto a distinct agent. Each one of those agents progressively evolves a local kinematic control strategy in a game-theoretic sense, that is, based on a partial (local) view of the whole system topology, which is incrementally updated through a recursive communication process according to the nested-hierarchical topology. Learning is thus approached not through demonstration and training but through an autonomous self-exploration process. A fuzzy reinforcement learning scheme is employed within each agent to enable efficient exploration in a continuous state–action domain. This paper constitutes in fact a proof of concept, demonstrating that glo... |
Karamanolakis, G; Iosif, E; Zlatintsi, A; Pikrakis, A; Potamianos, A Audio-Based Distributional Representations of Meaning Using a Fusion of Feature Encodings Conference 2016. Abstract | BibTeX | Tags: Audio representations, Bag-of-audio-words, Feature space fusion, Lexical semantic similarity | Links: @conference{KIZ+16, title = {Audio-Based Distributional Representations of Meaning Using a Fusion of Feature Encodings}, author = {G Karamanolakis and E Iosif and A Zlatintsi and A Pikrakis and A Potamianos}, } Recently a “Bag-of-Audio-Words” approach was proposed [1] for the combination of lexical features with audio clips in a multimodal semantic representation, i.e., an Audio Distributional Semantic Model (ADSM). An important step towards the creation of ADSMs is the estimation of the semantic distance between clips in the acoustic space, which is especially challenging given the diversity of audio collections. In this work, we investigate the use of different feature encodings in order to address this challenge following a two-step approach. First, an audio clip is categorized with respect to three classes, namely, music, speech and other. Next, the feature encodings are fused according to the posterior probabilities estimated in the previous step. Using a collection of audio clips annotated with tags we derive a mapping between words and audio clips. Based on this mapping and the proposed audio semantic distance, we construct an ADSM model in order to compute the distance between words (lexical semantic similarity task). The proposed model is shown to significantly outperform (23.6% relative improvement in correlation coefficient) the state-of-the-art results reported in the literature. |
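A minimal sketch of the posterior-weighted fusion step described above is given below. The class names and the assumption that per-class bag-of-audio-words histograms and class posteriors are already available are hypothetical simplifications of the paper's pipeline.

import numpy as np

def fuse_encodings(encodings, posteriors):
    """Fuse class-specific bag-of-audio-words encodings of one clip, weighted
    by the posterior probabilities of (music, speech, other).
    `encodings`: dict class -> 1-D histogram; `posteriors`: dict class -> prob.
    Both are assumed to come from earlier (hypothetical) stages."""
    classes = ["music", "speech", "other"]
    fused = sum(posteriors[c] * encodings[c] for c in classes)
    return fused / (np.linalg.norm(fused) + 1e-12)      # L2-normalise

def audio_semantic_distance(enc_a, enc_b):
    """Cosine distance between two fused clip encodings."""
    return 1.0 - float(enc_a @ enc_b)

The fused, normalised encodings can then be compared with the cosine distance above, which plays the role of the acoustic semantic distance between clips.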
2015 |
Papageorgiou, X S; Tzafestas, C S; Vartholomeos, P; Laschi, C; Lopez, R ICT-Supported Bath Robots: Design Concepts Conference C_ICSR, 2015. Abstract | BibTeX | Tags: | Links: @conference{ICSR2015_1, title = {ICT-Supported Bath Robots: Design Concepts}, author = {X S Papageorgiou and C S Tzafestas and P Vartholomeos and C Laschi and R Lopez}, booktitle = {C_ICSR}, } This paper presents the concept and the architecture of the I-SUPPORT service robotics system. The goal of the I-SUPPORT system is to support and enhance older adults' mobility, manipulation and force exertion abilities and assist them in successfully, safely and independently completing the entire sequence of showering tasks, such as properly washing their back, their upper parts, their lower limbs, their buttocks and groin, and effectively using the towel for drying purposes. Adaptation and integration of state-of-the-art, cost-effective, soft-robotic arms will provide the hardware constituents, which, together with advanced human-robot force/compliance control, will form the basis for a safe physical human-robot interaction that complies with the most up-to-date safety standards. Human behavioural, sociological, safety, ethical and acceptability aspects, as well as financial factors related to the proposed service robotics system will be thoroughly investigated and evaluated so that the I-SUPPORT end result is a close-to-market prototype, applicable to realistic living settings. |
Moustris, G; Papageorgiou, X S; Chalvatzaki, G; Pitsikalis, V; Dometios, A; Kardaris, N; Tzafestas, C S; Maragos, P User-Oriented Cognitive Interaction and Control for an Intelligent Robotic Walker Conference 17th International Conference on Social Robotics (ICSR 2015), 2015. Abstract | BibTeX | Tags: | Links: @conference{ICSR2015_2, title = {User-Oriented Cognitive Interaction and Control for an Intelligent Robotic Walker}, author = {G Moustris and X S Papageorgiou and G Chalvatzaki and V Pitsikalis and A Dometios and N Kardaris and C S Tzafestas and P Maragos}, booktitle = {17th International Conference on Social Robotics (ICSR 2015)}, } Mobility impairments are prevalent in the elderly population and constitute one of the main causes of difficulties in performing Activities of Daily Living (ADLs) and of the consequent reduction in quality of life. This paper reports current research work related to the control of an intelligent robotic rollator aiming to provide user-adaptive and context-aware walking assistance. To achieve such targets, a large spectrum of multimodal sensory processing and interactive control modules needs to be developed and seamlessly integrated; these modules must, on the one hand, track and analyse human motions and actions in order to detect pathological situations and estimate user needs, while at the same time predicting the user's (short-term or long-range) intentions so that robot control actions and supportive behaviours can be adapted accordingly. User-oriented human-robot interaction and control refers to the functionalities that couple the motions, the actions and, in more general terms, the behaviours of the assistive robotic device to the user in a non-physical interaction context. |
Chalvatzaki, G; Papageorgiou, X S; Tzafestas, C S Gait Modelling for a Context-Aware User-Adaptive Robotic Assistant Platform Conference 2015, ISSN: 978-88-97999-63-8. Abstract | BibTeX | Tags: | Links: @conference{CPT15, title = {Gait Modelling for a Context-Aware User-Adaptive Robotic Assistant Platform}, author = {G Chalvatzaki and X S Papageorgiou and C S Tzafestas}, pages = {132-141}, } For a context-aware robotic assistant platform that follows patients with moderate mobility impairment and adapts its motion to the patient's needs, the development of an efficient leg tracker and the recognition of pathological gait are very important. In this work, we present the basic concept for the robot control architecture and analyse three essential parts of the Adaptive Context-Aware Robot Control scheme: the detection and tracking of the subject's legs, the gait modelling and classification, and the computation of gait parameters for the impairment level assessment. We initially process raw laser data and estimate the legs' position and velocity with a Kalman Filter and then use this information as input for a Hidden Markov Model-based framework that detects specific gait patterns and classifies human gait into normal or pathological. We then compute gait parameters commonly used for medical diagnosis. The recognised gait patterns along with the gait parameters will be used for the impairment level assessment, which will activate certain assistive control actions regarding the pathological state of the patient. |
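As an illustration of the final step, the sketch below computes two basic temporal gait parameters (stride time and cadence) from per-leg heel-strike times such as those a gait-phase recogniser could output. The input format and the chosen parameters are assumptions for the example, not the paper's full parameter set.

import numpy as np

def gait_parameters(heel_strikes_left, heel_strikes_right):
    """Basic temporal gait parameters from per-leg heel-strike times [s],
    e.g. as output by a gait-phase recogniser. Inputs are assumed sorted."""
    strides_l = np.diff(heel_strikes_left)          # stride times, left leg
    strides_r = np.diff(heel_strikes_right)         # stride times, right leg
    stride_time = float(np.mean(np.concatenate([strides_l, strides_r])))
    cadence = 120.0 / stride_time                   # steps per minute (2 steps per stride)
    return {"stride_time_s": stride_time, "cadence_steps_per_min": cadence}

print(gait_parameters([0.0, 1.1, 2.2, 3.3], [0.55, 1.65, 2.75]))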
Papageorgiou, X S; Chalvatzaki, G; Tzafestas, C S; Maragos, P Hidden Markov modeling of human pathological gait using laser range finder for an assisted living intelligent robotic walker Conference IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015. Abstract | BibTeX | Tags: | Links: @conference{IROS2015, title = {Hidden Markov modeling of human pathological gait using laser range finder for an assisted living intelligent robotic walker}, author = {X S Papageorgiou and G Chalvatzaki and C S Tzafestas and P Maragos}, booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}, pages = {6342-6347}, } The precise analysis of a patient's or an elderly person's walking pattern is very important for an effective intelligent active mobility assistance robot. This walking pattern can be described by a cyclic motion, which can be modeled using the consecutive gait phases. In this paper, we present a completely non-invasive framework for analyzing and recognizing a pathological human walking gait pattern. Our framework utilizes a laser range finder sensor to detect and track the human legs, and an appropriately synthesized Hidden Markov Model (HMM) for state estimation and recognition of the gait patterns. We demonstrate the applicability of this setup using real data, collected from an ensemble of different elderly persons with a number of pathologies. The results presented in this paper demonstrate that the proposed human data analysis scheme has the potential to provide the necessary methodological (modeling, inference, and learning) framework for a cognitive behavior-based robot control system. More specifically, the proposed framework has the potential to be used for the classification of specific walking pathologies, which is needed for the development of a context-aware robot mobility assistant. |
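The gait-phase recognition step relies on standard HMM decoding; the following is a minimal Viterbi sketch that recovers the most likely phase sequence from per-frame log-likelihoods. The number of phases, the transition matrix and the interface are illustrative assumptions rather than the model synthesized in the paper.

import numpy as np

def viterbi(log_lik, log_A, log_pi):
    """Most likely hidden gait-phase sequence.
    log_lik: (T, K) per-frame log-likelihoods of K gait phases,
    log_A: (K, K) log transition matrix, log_pi: (K,) log initial probabilities.
    All inputs are assumed to come from a trained HMM (illustrative here)."""
    T, K = log_lik.shape
    delta = np.full((T, K), -np.inf)        # best log-probability per state
    psi = np.zeros((T, K), dtype=int)       # backpointers
    delta[0] = log_pi + log_lik[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A          # (K, K) candidate scores
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_lik[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):          # backtrack
        path[t] = psi[t + 1, path[t + 1]]
    return path                             # gait-phase index per frame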
Koutras, P; Zlatintsi, A; Iosif, E; Katsamanis, A; Maragos, P; Potamianos, A Predicting Audio-Visual Salient Events Based on Visual, Audio and Text Modalities for Movie Summarization Conference Proc. IEEE Int'l Conf. Acous., Speech, and Signal Processing, Quebec, Canada, 2015. Abstract | BibTeX | Tags: affective text analysis, audio-visual salient events, auditory saliency, Movie summarization, Visual saliency | Links: @conference{KZI+15, title = {Predicting Audio-Visual Salient Events Based on Visual, Audio and Text Modalities for Movie Summarization}, author = {P Koutras and A Zlatintsi and E Iosif and A Katsamanis and P Maragos and A Potamianos}, booktitle = {Proc. {IEEE} Int'l Conf. Acous., Speech, and Signal Processing}, } In this paper, we present a new and improved synergistic approach to the problem of audio-visual salient event detection and movie summarization based on visual, audio and text modalities. Spatio-temporal visual saliency is estimated through a perceptually inspired frontend based on 3D (space, time) Gabor filters and frame-wise features are extracted from the saliency volumes. For the auditory salient event detection we extract features based on Teager-Kaiser Energy Operator, while text analysis incorporates part-of-speech tagging and affective modeling of single words on the movie subtitles. For the evaluation of the proposed system, we employ an elementary and non-parametric classification technique like KNN. Detection results are reported on the MovSum database, using objective evaluations against ground-truth denoting the perceptually salient events, and human evaluations of the movie summaries. Our evaluation verifies the appropriateness of the proposed methods compared to our baseline system. Finally, our newly proposed summarization algorithm produces summaries that consist of salient and meaningful events, also improving the comprehension of the semantics. |
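The auditory features mentioned above build on the discrete Teager-Kaiser Energy Operator, Psi[x](n) = x(n)^2 - x(n-1)x(n+1); a minimal NumPy sketch (with a toy sanity check on a pure tone) is given below, as a generic illustration rather than the paper's feature extraction code.

import numpy as np

def teager_kaiser(x):
    """Discrete Teager-Kaiser Energy Operator:
    Psi[x](n) = x(n)^2 - x(n-1) * x(n+1), computed on the interior samples."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# Toy check: for a pure tone A*cos(w*n) the TKEO equals A^2 * sin(w)^2.
n = np.arange(1000)
tone = 0.5 * np.cos(0.1 * n)
print(teager_kaiser(tone).mean())    # about 0.25 * sin(0.1)**2, i.e. ~0.0025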
Zlatintsi, A; Iosif, E; Maragos, P; Potamianos, A Audio Salient Event Detection and Summarization using Audio and Text Modalities Conference Nice, France, 2015. Abstract | BibTeX | Tags: affective text analysis, Audio summarization, Audio-text salient events, Monomodal auditory saliency | Links: @conference{ZIM+15, title = {Audio Salient Event Detection and Summarization using Audio and Text Modalities}, author = {A Zlatintsi and E Iosif and P Maragos and A Potamianos}, } This paper investigates the problem of audio event detection and summarization, building on previous work [1, 2] on the detection of perceptually important audio events based on saliency models. We take a synergistic approach to audio summarization where saliency computation of audio streams is assisted by using the text modality as well. Auditory saliency is assessed by auditory and perceptual cues such as Teager energy, loudness and roughness; all known to correlate with attention and human hearing. Text analysis incorporates part-of-speech tagging and affective modeling. A computational method for the automatic correction of the boundaries of the selected audio events is applied creating summaries that consist not only of salient but also meaningful and semantically coherent events. A non-parametric classification technique is employed and results are reported on the MovSum movie database using objective evaluations against ground-truth designating the auditory and semantically salient events. |
Zlatintsi, A; Koutras, P; Efthymiou, N; Maragos, P; Potamianos, A; Pastra, K Quality Evaluation of Computational Models for Movie Summarization Conference Costa Navarino, Messinia, Greece, 2015. Abstract | BibTeX | Tags: | Links: @conference{ZKE+15, title = {Quality Evaluation of Computational Models for Movie Summarization}, author = {A Zlatintsi and P Koutras and N Efthymiou and P Maragos and A Potamianos and K Pastra}, } In this paper we present a movie summarization system and investigate what composes high-quality movie summaries in terms of user experience evaluation. We propose state-of-the-art audio, visual and text techniques for the detection of perceptually salient events from movies. The evaluation of such computational models is usually based on the comparison of the similarity between the system-detected events and some ground-truth data. For this reason, we have developed the MovSum movie database, which includes sensory and semantic saliency annotation as well as cross-media relations, for objective evaluations. The automatically produced movie summaries were qualitatively evaluated in an extensive human evaluation, in terms of informativeness and enjoyability, achieving very high ratings of up to 80% and 90%, respectively, which verifies the appropriateness of the proposed methods. |
Chaspari, Theodora; Soldatos, Constantin; Maragos, Petros The development of the Athens Emotional States Inventory (AESI): collection, validation and automatic processing of emotionally loaded sentences Journal Article The World Journal of Biological Psychiatry, 16 (5), pp. 312–322, 2015. BibTeX | Tags: | Links: @article{chaspari2015development, title = {The development of the Athens Emotional States Inventory (AESI): collection, validation and automatic processing of emotionally loaded sentences}, author = {Theodora Chaspari and Constantin Soldatos and Petros Maragos}, journal = {The World Journal of Biological Psychiatry}, volume = {16}, number = {5}, pages = {312--322}, publisher = {Taylor & Francis}, } |
Bampis, Christos G; Maragos, Petros Unifying the Random Walker Algorithm and the SIR Model for Graph Clustering and Image Segmentation Conference ICIP 2015, 2 (3), 2015, ISBN: 9781479983391. BibTeX | Tags: | Links: @conference{319, title = {Unifying the Random Walker Algorithm and the SIR Model for Graph Clustering and Image Segmentation}, author = {Christos G Bampis and Petros Maragos}, booktitle = {ICIP 2015}, volume = {2}, number = {3}, pages = {2265--2269}, } |
Giannoulis, Panagiotis; Brutti, Alessio; Matassoni, Marco; Abad, Alberto; Katsamanis, Athanasios; Matos, Miguel; Potamianos, Gerasimos; Maragos, Petros Multi-Room Speech Activity Detection Using a Distributed Microphone Network in Domestic Environments Conference Proc. European Signal Processing Conf. (EUSIPCO-2015), Nice, France, Sep. 2015, 2015, ISBN: 9780992862633. BibTeX | Tags: | Links: @conference{306, title = {Multi-Room Speech Activity Detection Using a Distributed Microphone Network in Domestic Environments}, author = {Panagiotis Giannoulis and Alessio Brutti and Marco Matassoni and Alberto Abad and Athanasios Katsamanis and Miguel Matos and Gerasimos Potamianos and Petros Maragos}, booktitle = {Proc. European Signal Processing Conf. (EUSIPCO-2015), Nice, France, Sep. 2015}, pages = {1281--1285}, } |
Koutras, P; Zlatintsi, A; Iosif, E; Katsamanis, A; Maragos, P; Potamianos, A Predicting audio-visual salient events based on visual, audio and text modalities for movie summarization Conference Proceedings - International Conference on Image Processing, ICIP, 2015-December , 2015, ISSN: 15224880. BibTeX | Tags: affective text analysis, audio-visual salient events, auditory saliency, Movie summarization, Visual saliency | Links: @conference{307, title = {Predicting audio-visual salient events based on visual, audio and text modalities for movie summarization}, author = { P. Koutras and A. Zlatintsi and E. Iosif and A. Katsamanis and P. Maragos and A. Potamianos}, booktitle = {Proceedings - International Conference on Image Processing, ICIP}, volume = {2015-December}, pages = {4361--4365}, } |
Koutras, Petros; Maragos, Petros Estimation of eye gaze direction angles based on active appearance models Conference 2015 IEEE International Conference on Image Processing (ICIP), 2015, ISBN: 978-1-4799-8339-1. BibTeX | Tags: | Links: @conference{308, title = {Estimation of eye gaze direction angles based on active appearance models}, author = { Petros Koutras and Petros Maragos}, booktitle = {2015 IEEE International Conference on Image Processing (ICIP)}, pages = {2424--2428}, } |
Copyright Notice:
Some material presented is available for download to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
The work already published by the IEEE is under its copyright. Personal use of such material is permitted. However, permission to reprint/republish the material for advertising or promotional purposes, or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of the work in other works must be obtained from the IEEE.