SP Grand Challenge e-Prevention


Update: We have publicly released the SPGC e-Prevention data:

 Track 1: https://robotics.ntua.gr/wp-content/uploads/SPGC/SPGC_challenge_track_1_release.zip

Track 2: https://robotics.ntua.gr/wp-content/uploads/SPGC/SPGC_challenge_track_2_release.zip

Labels (For both tracks): https://robotics.ntua.gr/wp-content/uploads/SPGC/labels.zip

For all .zip files the username/password are:

user: SPGC1
password: x9Xjb#t*$^4H7p

Scope of the challenge

The “e-Prevention: Person Identification and Relapse Detection from Continuous Recordings of Biosignals” challenge has been accepted as a Signal Processing Grand Challenge (SPGC) of the ICASSP 2023 conference. Please refer to https://2023.ieeeicassp.org/call-for-sp-grand-challenges/ for more details.

Abstract/Challenge Overview:

The challenge will concern the analysis and processing of long-term continuous recordings of biosignals recorded from wearable sensors embedded in smartwatches, in order to extract high-level representations of the wearer’s activity and behavior for two downstream tasks:

1) Identification of the wearer of the smartwatch, and

2) Detection of relapses in patients in the psychotic spectrum.

The tasks are of importance to the biomedical signal processing and psychiatry communities, since through the identification of digital phenotypes from wearable signals, useful insights on the distinctive behavioral patterns and relapse course of patients with psychiatric disorders can be derived, contributing to early symptom identification, and eventually better outcomes of the disorder.

Interested participants are invited to apply their approaches and methods on a large scale dataset acquired through the e-Prevention project (https://eprevention.gr/), including continuous measurements from accelerometers, gyroscopes and heart rate monitors, as well as information about the daily step count and sleep, collected from patients in the psychotic spectrum for a monitoring period of up to 2.5 years, and a control subgroup for a provisional period of 3 months.

Challenge duration: December 2022 – February 2023

Contacts

A. Zlatintsi, School of ECE (CVSP / IRAL Group), National Technical Univ. of Athens, nzlat@cs.ntua.gr

P. P. Filntisis, School of ECE (CVSP / IRAL Group), National Technical Univ. of Athens, filby@central.ntua.gr

Reference

Zlatintsi, A.; Filntisis, P.P.; Garoufis, C.; Efthymiou, N.; Maragos, P.; Menychtas, A.; Maglogiannis, I.; Tsanakas, P.; Sounapoglou, T.; Kalisperakis, E.; Karantinos, T.; Lazaridi, M.; Garyfalli, V.; Mantas, A.; Mantonakis, L.; Smyrnis, N. E-Prevention: Advanced Support System for Monitoring and Relapse Prevention in Patients with Psychotic Disorders Analyzing Long-Term Multimodal Data from Wearables and Video Captures. Sensors 2022, 22, 7544. https://doi.org/10.3390/s22197544

Funding: This research has been financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH–CREATE–INNOVATE (project acronym: e-Prevention, code: T1EDK-02890/MIS: 5032797).

Background & Task Overview:

Over the last 60 years, various studies of psychotic conditions (such as bipolar disorder and schizophrenia) have been conducted in neurobiology and neurophysiology; however, their causes still remain unknown. As a consequence, no effective biomarkers for either the diagnosis or the prediction of the course of psychotic symptomatology have yet been discovered; the use of such markers for the timely diagnosis and prevention of psychotic relapses is therefore one of the most prominent study areas in psychiatry.

Nowadays, the broad adoption of wearable products, such as smartwatches and fitness trackers, has led to the emergence of the interdisciplinary field of digital phenotyping, which encompasses the in situ quantification of human behavior and traits (the “phenotype”) by utilizing the sensors included in these devices. Such wearables collect multimodal data, usually using accelerometers, gyroscopes and heart rate monitors among others, to measure the user’s physical activity and kinetic activity, such as micro-movements and autonomic function.

However, the public availability of large user-diverse datasets of physiological signals is scarce, especially in conjunction with mental health indicators. As a result, through this challenge, researchers in the field will have the opportunity to work on and draw insights from a large-scale collection of raw biosignals from both a group of patients in the psychotic spectrum and a group of healthy controls, in two different tasks: Studying the correlation of the raw signals to user-specific behavioral patterns via person identification from the recorded signals, and using them as biomarkers of psychotic symptomatology through the detection of relapsing states in psychotic patients.

Data Collection / Setup

During the course of the e-Prevention project, a total of 60 people (37 patients in the psychotic spectrum and 23 healthy controls) were recruited at the University Mental Health, Neurosciences and Precision Medicine Research Institute “Costas Stefanis” (UMHRI) in Greece, and the protocol of the project was approved by the Ethics Committee of the Institution. All participants were provided with a Samsung Gear S3 smartwatch that monitored the user’s linear and angular acceleration (m/s² and deg/s², sampled at 20 Hz), heart rate variability and RR intervals (sampled at 5 Hz), sleeping schedule, and steps. This information was continuously collected from the patients for a monitoring period of up to 2.5 years, while the same data were collected from the control subgroup for a provisional period of 3 months. The collected data were anonymized, and each participant in the study was assigned a unique ID as an identifier. The clinicians annotated patients’ relapse periods according to their monthly assessments and communication with the attending physician or the family. Overall, the resulting dataset contains a total of approximately 20000 human-days of collected data spread among all participants.

Institutional Review Board Statement: All volunteer subjects, involved in the study, gave their written informed consent and permission for inclusion and use of their anonymized data before they participated in the study. The study was conducted in accordance with the provisions of the General Regulation (EU) 2016/679, and the protocol was approved by the Ethics Committee of the University Mental Health, Neurosciences and Precision Medicine Research Institute “Costas Stefanis” (UMHRI) in Athens, Greece. (Project identification acronym: e-Prevention, code: T1EDK-02890/MIS: 5032797).

Track Description:

The challenge contains two tracks, corresponding to two different downstream tasks. In particular:

First Track – Person Identification: The goal in this task is to identify the watch wearer by forming and classifying their digital phenotypes from the recorded biosignals.

Second Track – Relapse Detection: In this task, we want to detect the appearance of relapses in the patients, based on the smartwatch measurements.

Participating teams are allowed to compete on either, or both, tracks. More details about each track can be found in their respective subpage.

General Instructions

For the purposes of this challenge, we will provide two subsets of this dataset, one for each challenge track. For each track, the provided data will be split into training, validation and testing data; the testing set does not contain annotations, since the participants' algorithms will be evaluated on this set. The data are stored in folders, each containing the raw sensor signals and the sleeping and walking data corresponding to each patient for a particular time interval.

Person Identification Track

Data Format: We provide a stratified split of the complete dataset (both patients and controls), consisting of about two and a half months per person. The training and validation splits contain the raw sensor recordings, sleep and walking information, as well as the unique ID corresponding to the identity of the respective watch wearer as the ground truth. The testing data consist solely of the raw recordings and the sleep/walk information.

Data Structure:

We provide 2 folders, one named training_data and one named test_data. The directory structure is the following:

training_data # data for training
├── user_00 # contains recorded data of user 00
│   ├── train # data to be used for training
│   │   ├── 00 # one day of recorded data
│   │   │   ├── data.csv # biometric/behavior data recorded during the day
│   │   │   ├── step.csv # step data recorded during the day
│   │   ├── 01 # another day of recorded data
│   │   │   ├── data.csv
│   │   │   ├── step.csv
…..
│   ├── val # data to be used for validation – same structure as train

├── user_01 # contains recorded data of user 01
├── user_02 # contains recorded data of user 02

├── user_45 # contains recorded data of user 45

test_data # data for testing
├── 00 # one day of recorded data
│   ├── data.csv # biometric/behavior data recorded during the day
│   ├── step.csv # step data recorded during the day
…..
├── 520 # one day of recorded data
│   ├── data.csv # biometric/behavior data recorded during the day
│   ├── step.csv # step data recorded during the day
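The layout above can be traversed with a few lines of standard-library Python. The following is only a sketch under stated assumptions: the helper name list_training_days is hypothetical, and the demo builds a throwaway directory tree that mimics the structure shown, rather than reading the released archive.

```python
from pathlib import Path
import tempfile


def list_training_days(root):
    """Yield (user_id, split, day, day_dir) for every day folder under root.

    Assumes the layout described above: root/user_XX/{train,val}/<day>/.
    """
    for user_dir in sorted(Path(root).glob("user_*")):
        user_id = user_dir.name.split("_", 1)[1]
        for split in ("train", "val"):
            split_dir = user_dir / split
            if not split_dir.is_dir():
                continue
            for day_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
                yield user_id, split, day_dir.name, day_dir


# Demo on a temporary directory that imitates the documented structure.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "training_data"
    for rel in ("user_00/train/00", "user_00/train/01",
                "user_00/val/00", "user_01/train/00"):
        (root / rel).mkdir(parents=True)
    found = [(user, split, day) for user, split, day, _ in list_training_days(root)]

print(found)
```

Each yielded day_dir is expected to contain the data.csv and step.csv files described next.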


data.csv
acc_X, acc_Y, acc_Z # linear acceleration from the accelerometer (Valid range: [-19.6, 19.6])

gyr_X, gyr_Y, gyr_Z # angular velocity from the gyroscope (Valid range: [-573, 573])

heartRate, rRInterval # beats-per-minute and R-R interval from PPG. Valid values are > 0.

timecol # timestamp inside the day

sleeping # 0 if the user is awake, 1 if they are sleeping

Note that for the above data we provide the mean value per 5 seconds, i.e., the time column has increments of 5 seconds.

step.csv

start_time – end_time # duration while the user was walking

totalSteps # number of steps done

stepsWalking # number of walking steps

stepsRunning # number of running steps

distance # distance covered

calories # calories burned
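As an illustration of the valid ranges listed above, the sketch below flags rows of a data.csv file whose values fall outside them. It assumes the CSV header uses exactly the column names given above; the sample rows and the timecol format are made up for the demo, and the helper name is_valid is hypothetical.

```python
import csv
import io

ACC_RANGE = (-19.6, 19.6)    # valid accelerometer range stated above
GYR_RANGE = (-573.0, 573.0)  # valid gyroscope range stated above


def is_valid(row):
    """Return True if all sensor values of one CSV row are within the valid ranges."""
    acc_ok = all(ACC_RANGE[0] <= float(row[f"acc_{axis}"]) <= ACC_RANGE[1] for axis in "XYZ")
    gyr_ok = all(GYR_RANGE[0] <= float(row[f"gyr_{axis}"]) <= GYR_RANGE[1] for axis in "XYZ")
    heart_ok = float(row["heartRate"]) > 0 and float(row["rRInterval"]) > 0
    return acc_ok and gyr_ok and heart_ok


# Made-up sample; a real file would be opened with open(day_dir / "data.csv", newline="").
sample = io.StringIO(
    "acc_X,acc_Y,acc_Z,gyr_X,gyr_Y,gyr_Z,heartRate,rRInterval,timecol,sleeping\n"
    "0.1,9.7,0.3,1.0,-2.0,0.5,72,830,00:00:00,1\n"
    "0.2,25.0,0.3,1.0,-2.0,0.5,70,850,00:00:05,1\n"  # acc_Y out of range
    "0.1,9.6,0.2,0.0,0.0,0.0,0,0,00:00:10,1\n"       # heart rate / RR not > 0
)
rows = list(csv.DictReader(sample))
valid_flags = [is_valid(row) for row in rows]
print(valid_flags)
```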


Submission

You are required to send us a .csv file with two columns: the first is the number of the day in the test data and the second is the ID of the user, e.g.:

day, user
00, 05
01, 14

520, 34
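A file in this shape could be produced as sketched below. This is an assumption-laden example: the function name write_submission is hypothetical, the two-digit zero padding is inferred from the example rows, and the sketch writes plain comma separators without the space shown after each comma above.

```python
import csv
import io


def write_submission(predictions, fileobj):
    """Write {day_number: user_id} predictions as a two-column CSV."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(["day", "user"])
    for day in sorted(predictions):
        # Zero padding to two digits is an assumption based on the example rows.
        writer.writerow([f"{day:02d}", f"{predictions[day]:02d}"])


buffer = io.StringIO()  # a real submission would use open("submission.csv", "w", newline="")
write_submission({0: 5, 1: 14, 520: 34}, buffer)
submission_text = buffer.getvalue()
print(submission_text)
```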

You are allowed to send us multiple submissions (up to 5); however, we will evaluate only the last one sent by each team.

Each submission must be accompanied by a short (up to 1 page) description of the proposed system and methodology.

Submission Deadline: 1st February 2023 (11:59 PM GMT)

Evaluation Metrics: Accuracy Score (raw)

Baseline

For the baseline of the first track, we trained a deep 1D CNN with 5 convolutional layers, including Batch Normalization and ReLU activations. After the last BN layer, we use AdaptiveAvgPooling and a final fully connected layer to predict the logits for the 46 identities.

Data Preprocessing

We normalized the accelerometer and gyroscope values into the [0, 1] interval using the valid ranges given above. For the heart rate we used a maximum of 255, and for the RR intervals a maximum of 2000.
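One plausible reading of this preprocessing step is min-max scaling, sketched below. The exact formula used for the baseline is not stated here, so the min-max form, the clipping of out-of-range values, and the lower bound of 0 for heart rate and RR intervals are all assumptions; the function names are hypothetical.

```python
ACC_RANGE = (-19.6, 19.6)    # valid accelerometer range
GYR_RANGE = (-573.0, 573.0)  # valid gyroscope range
HEART_RATE_MAX = 255.0       # maximum used for the heart rate
RR_INTERVAL_MAX = 2000.0     # maximum used for the RR intervals


def min_max(value, low, high):
    """Scale value from [low, high] to [0, 1]; clipping is an assumption."""
    clipped = min(max(value, low), high)
    return (clipped - low) / (high - low)


def normalize_sample(acc, gyr, heart_rate, rr_interval):
    """Return the 8 normalized features of one 5-second sample."""
    features = [min_max(v, *ACC_RANGE) for v in acc]
    features += [min_max(v, *GYR_RANGE) for v in gyr]
    features.append(min_max(heart_rate, 0.0, HEART_RATE_MAX))
    features.append(min_max(rr_interval, 0.0, RR_INTERVAL_MAX))
    return features


normalized = normalize_sample((0.0, 19.6, -19.6), (0.0, 573.0, -573.0), 51.0, 1000.0)
print(normalized)
```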

Training

During training, we sample a random contiguous 3-hour segment of the day, provided that it includes at least 2.5 hours of valid data. We then impute any missing timestamps in the segment using the nearest neighbour, resulting in a sequence of 721 x 8 features (we did not use the sleeping/timecol or step information). We feed the segment through the network, predict the ID of the user, and use the cross-entropy loss for training. We train for 300 epochs with a batch size of 64, using Adam with an initial learning rate of 1e-4, which is reduced by a factor of 10 at epochs 150 and 225.
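The nearest-neighbour imputation of missing timestamps can be sketched for a single feature channel as follows. This is an illustrative version, not the baseline code: missing samples are represented as None, ties between two equally distant neighbours are resolved in favour of the earlier sample (an assumption), and the function names are hypothetical.

```python
def valid_fraction(values):
    """Fraction of samples in the segment that are not missing."""
    return sum(v is not None for v in values) / len(values)


def impute_nearest(values):
    """Replace None entries with the value at the nearest non-missing index."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("segment has no valid samples")
    filled = []
    for i, v in enumerate(values):
        if v is None:
            # Nearest index in time; ties go to the earlier sample (assumption).
            nearest = min(known, key=lambda k: (abs(k - i), k))
            v = values[nearest]
        filled.append(v)
    return filled


segment = [None, 1.0, None, None, None, 5.0, None]
imputed = impute_nearest(segment)
print(imputed, valid_fraction(segment))
```

A segment would be kept only if valid_fraction is high enough, for example at least 2.5/3 when sampling training segments as described above.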

During validation, we select all contiguous 3-hour segments of the day with at least 1 hour of valid data, impute them again with the nearest neighbour, and use voting over all segments to select the final predicted user ID for the day.
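The voting step can be illustrated with a simple plurality vote over the per-segment predictions. The type of voting is not specified further, so plurality voting and the tie-break towards the smallest user ID are assumptions, and vote_day is a hypothetical name.

```python
from collections import Counter


def vote_day(segment_predictions):
    """Return the user ID predicted most often across the segments of one day."""
    counts = Counter(segment_predictions)
    best = max(counts.values())
    # Tie-break towards the smallest ID (an assumption, chosen for determinism).
    return min(user for user, n in counts.items() if n == best)


day_prediction = vote_day([5, 14, 5, 5, 34, 14])
print(day_prediction)
```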

The final validation score using this method is 62%.


Leaderboard

TEAM          Accuracy
SRCB-LUL      95.00%
PeRCeiVe      93.85%
AI_Bezzie     91.36%
SAILers       82.15%
unipi_cmbl    75.43%
ADCADD        3.83%
ID-EPRE2      2.88%
eHust         2.88%
CogBCI        2.88%
uoi           2.68%
NWPU          2.3%

Relapse Detection Track

Data Format: We provide a subset of the complete e-Prevention dataset, consisting of about six months per person. The participants are expected to detect relapses as anomalies, i.e., for each patient we supply training data from their non-relapsed state only. For validation, we also supply a subset of days from both their normal and relapsed states, with each day labeled as relapse or non-relapse. Finally, the test set includes another subset of days from both relapse and non-relapse periods, without labels.

Data Structure:

We provide 2 folders, one named training_data and one named test_data. The directory structure is the following:

training_data # data for training
├── user_00 # contains all recorded data of user 00
│   ├── train # data to be used for training
│   │   ├── non-relapse # data when the patient was in remission
│   │   │   ├── 00 # one day of recorded data
│   │   │   │   ├── data.csv # biometric/behavior data recorded during the day
│   │   │   │   ├── step.csv # step data recorded during the day
│   ├── val # data to be used for validation – same structure as train
│   │   ├── non-relapse # data when the patient was in remission
│   │   │   ├── 00 # one day of recorded data
│   │   │   │   ├── data.csv # biometric/behavior data recorded during the day
│   │   │   │   ├── step.csv # step data recorded during the day
│   │   ├── relapse # data when the patient was in relapse – same structure as non-relapse
│   │   │   ├── 00 # one day of recorded data
│   │   │   │   ├── data.csv # biometric/behavior data recorded during the day
│   │   │   │   ├── step.csv # step data recorded during the day

├── user_09 # contains all recorded data of user 09

test_data # data for testing
├── user_00 # contains test data for user 00
│   ├── test
│   │   ├── 00 # one day of recorded data
│   │   │   ├── data.csv # biometric/behavior data recorded during the day
│   │   │   ├── step.csv # step data recorded during the day
│   │   ├── 01 # another day of recorded data
│   │   │   ├── data.csv # biometric/behavior data recorded during the day
│   │   │   ├── step.csv # step data recorded during the day

├── user_09 # contains test data of user 09


data.csv
acc_X, acc_Y, acc_Z # linear acceleration from the accelerometer (Valid range: [-19.6, 19.6])
gyr_X, gyr_Y, gyr_Z # angular velocity from the gyroscope (Valid range: [-573, 573])
heartRate, rRInterval # beats-per-minute and R-R interval from PPG. Valid values are > 0.

timecol # timestamp inside the day

sleeping # 0 if the user is awake, 1 if they are sleeping

Note that for the above data we provide the mean value per 5 seconds, i.e., the time column has increments of 5 seconds.

step.csv

start_time – end_time # duration while the user was walking

totalSteps # number of steps done

stepsWalking # number of walking steps

stepsRunning # number of running steps

distance # distance covered

calories # calories burned


Submission

You are required to send us a .csv file with three columns: the first is the user ID, the second is the number of the day in the test data of the user, and the third is your prediction score on whether the user is in non-relapse (remission) or in relapse (values closer to 0 denote non-relapse), e.g.:

user, day, status
00, 00, 0.23
00, 01, 0.49
00, 02, 0.55
….
09, 00, 0.72
09, 01, 0.38
09, 02, 0.02

You are allowed to send us multiple submissions (up to 5); however, we will evaluate only the last one sent by each team.

Each submission must be accompanied by a short (up to 1 page) description of the proposed system and methodology.

Submission Deadline: 1st February 2023 (11:59 PM GMT)

Evaluation Metrics: The PR-AUC and ROC-AUC scores over the daily predictions will be utilized as the evaluation metrics. To obtain a unified metric, the harmonic mean of the PR-AUC and ROC-AUC scores will be used.
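For reference, the two scores and their harmonic mean can be sketched in plain Python as below. This is not the organizers' evaluation script: ROC-AUC is computed with the pairwise-ranking formulation, and PR-AUC is approximated by average precision, which is only one common way of summarizing the precision-recall curve. The function names and the toy labels and scores are illustrative.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (relapse, non-relapse) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision over distinct score thresholds (a PR-AUC estimate)."""
    total_pos = sum(labels)
    ranked = sorted(zip(scores, labels), reverse=True)
    tp = 0
    ap = prev_recall = 0.0
    i = 0
    while i < len(ranked):
        j = i
        # Consume all days sharing the same score so ties form one threshold.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            j += 1
        recall = tp / total_pos
        ap += (recall - prev_recall) * (tp / j)
        prev_recall = recall
        i = j
    return ap


def harmonic_mean(a, b):
    return 2 * a * b / (a + b)


labels = [0, 0, 1, 1]            # 1 = relapse day, 0 = non-relapse day
scores = [0.1, 0.4, 0.35, 0.8]   # higher score = more likely relapse
roc = roc_auc(labels, scores)
pr = average_precision(labels, scores)
total = harmonic_mean(roc, pr)
print(roc, pr, total)
```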

Note: You are allowed to use the validation data only in a label-agnostic way. We will only accept unsupervised methods (with regard to the patient condition, non-relapse/relapse).


Baseline
The baseline provided for the second track is based on a 1-layer linear autoencoder. A total of 10 features were extracted from 5-minute slices of the original data. In more detail, the mean norms of the linear (accelerometer) and radial (gyroscope) measurements were computed to quantify movements and micro-movements, while cardiac behavior was estimated by the mean heart rate (bpm), the mean RR interval (msec), the major axis of the Poincaré ellipse, and the normalized low- and high-frequency powers of the Lomb-Scargle periodogram, as computed from the NNI series. Finally, daily sinusoidal encoding (sine and cosine values) was used for the temporal encoding of the timestamps, while the percentage of valid 5-sec measurements was also calculated. Missing features were handled by median interpolation for intervals of up to 3 hours; larger intervals of missing values were discarded. These features were then stacked into 2D tensors of size 48×10 (with each row representing a 5-minute slice and each column a feature), thus covering 4 hours each, using a stride of 1 hour.
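The temporal encoding and the stacking into 4-hour tensors can be sketched as follows. The 24-hour period of the sinusoidal encoding is an assumption based on the word "daily", the handling of windows at day boundaries is not specified and is ignored here, and the function names and placeholder feature values are hypothetical.

```python
import math

SLICES_PER_WINDOW = 48  # 48 five-minute slices = 4 hours
STRIDE_SLICES = 12      # 12 five-minute slices = 1 hour


def time_encoding(seconds_since_midnight):
    """Sine/cosine encoding of the time of day (24-hour period assumed)."""
    angle = 2 * math.pi * seconds_since_midnight / 86400
    return math.sin(angle), math.cos(angle)


def sliding_windows(slices):
    """Stack consecutive 5-minute feature slices into 48-row windows, stride 1 hour."""
    last_start = len(slices) - SLICES_PER_WINDOW
    return [slices[start:start + SLICES_PER_WINDOW]
            for start in range(0, last_start + 1, STRIDE_SLICES)]


# Placeholder features: 288 five-minute slices (one day), 10 features each.
day_slices = [[float(i)] * 10 for i in range(288)]
windows = sliding_windows(day_slices)
sin_6am, cos_6am = time_encoding(6 * 3600)
print(len(windows), len(windows[0]), len(windows[0][0]))
```

Each resulting window has 48 rows and 10 columns, so flattening it gives the 480-dimensional input mentioned for the autoencoder.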

Then, an autoencoder with a bottleneck dimension of N=60 and a LeakyReLU activation was trained, using as input the feature representations extracted above, after they had been standardized per patient and flattened into a 480-D vector. The post-normalization statistical properties (i.e., the mean and covariance matrix) of each (5-min) feature slice in the training set were used to fit a multivariate normal distribution that the feature vectors are assumed to follow. During inference, input tensors corresponding to 4-hour intervals are standardized according to the precomputed per-patient transform and fed into the autoencoder. The per-feature mean of the autoencoder output is then computed, and its distance to the assumed feature distribution is used as an anomaly score; input tensors corresponding to relapsing periods are expected to record higher anomaly scores. Since the evaluation is carried out on a per-day basis, the median anomaly score over all the 4-hour tensors corresponding to each day is computed to obtain a single anomaly score for each day. Applying the above methodology to the provided validation set yields a PR-AUC score of 0.635 and a ROC-AUC score of 0.578.
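The scoring and per-day aggregation can be illustrated with the sketch below. The distance measure is not named in the description; a Mahalanobis-style distance is assumed here, and it is further simplified to a diagonal covariance to keep the example dependency-free, so it should not be read as the baseline's exact computation. The function names and toy numbers are hypothetical.

```python
import math
from statistics import median


def diagonal_mahalanobis(vector, mean, variance):
    """Mahalanobis distance under a diagonal covariance (simplifying assumption)."""
    return math.sqrt(sum((x - m) ** 2 / v for x, m, v in zip(vector, mean, variance)))


def daily_scores(window_scores):
    """Map (day, score) pairs of 4-hour windows to one median score per day."""
    per_day = {}
    for day, score in window_scores:
        per_day.setdefault(day, []).append(score)
    return {day: median(scores) for day, scores in per_day.items()}


distance = diagonal_mahalanobis([1.0, 2.0], [0.0, 0.0], [1.0, 4.0])
scores = daily_scores([("00", 0.2), ("00", 0.9), ("00", 0.4), ("01", 1.5), ("01", 0.5)])
print(distance, scores)
```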


Leaderboard

TEAM        Test ROC-AUC   Test PR-AUC   Total
PeRCeiVe    0.6469         0.6509        0.6489
Emotion     0.6072         0.6347        0.6209
SAILers     0.5839         0.6263        0.6051
SmartBCI    0.5435         0.5863        0.5604
YDH@HEU     0.5215         0.5587        0.5401
GIPS@HEU    0.5117         0.5480        0.5229

The challenge starts on December 8, 2022 and will end at 11:59 PM GMT on February 1st, 2023.

Program timeline

  • November 28th, 2022: Registration opens

  • December 8th, 2022: Dataset release and starting date

  • February 1st, 2023: Deadline for participants to submit their results

  • February 6th, 2023: Notification of the final results

  • February 20th, 2023: Deadline for invited paper submission

  • March 7th, 2023: ICASSP 2023 SPGC acceptance notification

  • March 14th, 2023: ICASSP 2023 SPGC camera-ready papers due and challenge report submission

Organization

Registration procedure

To register for the challenge, participants are required to send an e-mail to the contacts below with the team name and the names, e-mail addresses and affiliations of the team members.

Contacts

A. Zlatintsi, School of ECE (CVSP / IRAL Group), National Technical Univ. of Athens, nzlat@cs.ntua.gr

P. P. Filntisis, School of ECE (CVSP / IRAL Group), National Technical Univ. of Athens, filby@central.ntua.gr

The goal of the challenge is to foster research on machine learning for biosignals. All participants should adhere to the following rules to be eligible for the challenge:

  • All participants must submit the obtained results for at least one of the 2 tasks, accompanied by a short (up to 1 page) description of their proposed system and methodology.

  • Participating teams are allowed to update their submissions and their scores multiple (up to 5) times during the evaluation phase.

  • Each individual participant cannot be included in multiple participating teams.

  • After the completion of the challenge, the top-scoring teams for each track will be declared the winners of their respective track. Furthermore, the top-5 performing teams (which will be chosen from the 2 tracks depending on the distribution of the participants in each track) will be invited, and required, to provide a synopsis of their proposed methodology and results in a two-page paper and to present it in person at the Special Session dedicated to this challenge at the ICASSP 2023 conference. The submitted papers should follow the format of ICASSP regular papers and should be submitted before the camera-ready deadline (see Timeline for more details).

  • Participants can only publish their own results regarding the two Challenge Tracks. A summary of the challenge results will also be prepared by the organizers.

  • There are no restrictions on the proposed methodologies, as well as the usage of external datasets. However, in case of a tie, the Challenge Committee will take into account the novelty and originality of the proposed approach.

  • The intellectual property (IP) of all shared/submitted code remains with the participants and is not transferred to the challenge organizers. If the code developed by the participants is made publicly available, an appropriate license should be added.

Permission is granted to use the data given that you agree:

1. To include a reference to the e-Prevention 2022 Dataset in any work that makes use of the dataset. For research papers, cite our preferred publication as listed below and our challenge overview paper (released later); for other media cite our preferred publication as listed on our website.

2. That you do not distribute this dataset or modified versions.

3. That you may not use the dataset or any derivative work for commercial purposes, such as licensing or selling the data, or using the data for the purpose of commercial gain.

4. That all rights not expressly granted to you are reserved by the e-Prevention SP Grand Challenge 2022 organizers.

Preferred publications

Full e-Prevention System description

Zlatintsi, A., Filntisis, P. P., Garoufis, C., Efthymiou, N., Maragos, P., Menychtas, A., Maglogiannis, I., Tsanakas, P., Sounapoglou, T., Kalisperakis, E., Karantinos, T., Lazaridi, M., Garyfali, V., Mantas, A., Mantonakis L. and Smyrnis, N. E-Prevention: Advanced Support System for Monitoring and Relapse Prevention in Patients with Psychotic Disorders Analyzing Long-Term Multimodal Data from Wearables and Video Captures. Sensors, 22(19), 7544, 2022.

Please cite this paper:

@article{zlatintsi2022prevention,
  Title = {E-Prevention: Advanced Support System for Monitoring and Relapse Prevention in Patients with Psychotic Disorders Analyzing Long-Term Multimodal Data from Wearables and Video Captures},
  Author = {Zlatintsi, Athanasia and Filntisis, Panagiotis P and Garoufis, Christos and Efthymiou, Niki and Maragos, Petros and Menychtas, Andreas and Maglogiannis, Ilias and Tsanakas, Panayiotis and Sounapoglou, Thomas and Kalisperakis, Emmanouil and others},
  Journal = {Sensors},
  Volume = {22},
  Number = {19},
  Pages = {7544},
  Year = {2022},
  Publisher = {MDPI}
}

Task 1

[2] Retsinas, G., Filntisis, P. P., Efthymiou, N., Theodosis, E., Zlatintsi, A., & Maragos, P. Person identification using deep convolutional neural networks on short-term signals from wearable sensors. In Proc. ICASSP 2020, online, 2020.

Please cite this paper:

@inproceedings{retsinas2020person,
  Title = {Person identification using deep convolutional neural networks on short-term signals from wearable sensors},
  Author = {Retsinas, George and Filntisis, Panayiotis Paraskevas and Efthymiou, Niki and Theodosis, Emmanouil and Zlatintsi, Athanasia and Maragos, Petros},
  Booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  Pages = {3657--3661},
  Year = {2020},
  Organization = {IEEE}
}

Task 2

[3] Panagiotou, M.; Zlatintsi, A.; Filntisis, P.P.; Roumeliotis, A.J.; Efthymiou, N.; Maragos, P. A comparative study of autoencoder architectures for mental health analysis using wearable sensors data. In Proc. EUSIPCO, Belgrade, Serbia, 2022.

Please cite this paper:

@inproceedings{panagiotou2022comparative,
  Title = {A comparative study of autoencoder architectures for mental health analysis using wearable sensors data},
  Author = {Panagiotou, M and Zlatintsi, A and Filntisis, PP and Roumeliotis, AJ and Efthymiou, N and Maragos, P},
  Booktitle = {30th European Signal Processing Conference (EUSIPCO)},
  Pages = {1258--1262},
  Year = {2022},
  Organization = {IEEE}
}

A. Zlatintsi¹, P. P. Filntisis¹, N. Efthymiou¹, C. Garoufis¹, G. Retsinas¹, T. Sounapoglou², I. Maglogiannis⁴, P. Tsanakas¹, N. Smyrnis³, and P. Maragos¹

¹ School of ECE (CVSP / IRAL Group), National Technical University of Athens, Athens, Greece

² BLOCKACHAIN PC, Thessaloniki, Greece

³ National & Kapodistrian University of Athens, Medical School, Athens, Greece

⁴ Department of Digital Systems, University of Piraeus, 185 34 Pireas, Greece

2023-09-05T14:41:29+00:00