Dataset Description
To obtain the data, please follow the instructions under this link. After approval of your request, you will be granted access to the Data Download page to download the data.
Dataset Structure
hecktor2025_training/
├── imagesTr
│   ├── CHUM-001__CT.nii.gz
│   ├── CHUM-001__PT.nii.gz
│   ├── CHUM-001__dosimetry_CT.nii.gz*
│   ├── CHUM-001__radiotherapy_dosemaps.nii.gz*
│   └── ...
├── labelsTr
│   ├── CHUM_001.nii.gz
│   └── ...
├── hecktor2025_clinical_info_training.csv
└── hecktor2025_endpoint_training.csv
*Radiotherapy planning dose map and Dosimetry CT will be available for a subset of the dataset only.
All PET/CT images are gathered in the imagesTr folder. The naming convention is CenterName_PatientID__Modality.nii.gz. The primary tumor (GTVp) and lymph node (GTVn) segmentations are in the labelsTr folder, stored in one .nii.gz file per patient. Label 1 denotes the GTVp and label 2 denotes the GTVn.
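As a minimal sketch of how the naming convention can be used, the snippet below groups image files by patient. It assumes file names of the form Center, separator, ID, double underscore, Modality, and accepts either "-" or "_" as the separator because both forms appear in the listing above. The function name `group_by_patient` is hypothetical and not part of any challenge tooling.

```python
import re
from collections import defaultdict

# Assumed pattern: <Center><sep><ID>__<Modality>.nii.gz, where <sep> is "-" or "_"
# (both appear in the dataset listing).
IMAGE_NAME = re.compile(
    r"^(?P<center>[A-Za-z]+)[-_](?P<pid>\d+)__(?P<modality>\w+)\.nii\.gz$"
)

def group_by_patient(filenames):
    """Map (center, patient id) -> {modality: filename}."""
    patients = defaultdict(dict)
    for name in filenames:
        match = IMAGE_NAME.match(name)
        if match is None:
            continue  # skip files that do not follow the convention
        key = (match["center"], match["pid"])
        patients[key][match["modality"]] = name
    return dict(patients)

files = [
    "CHUM-001__CT.nii.gz",
    "CHUM-001__PT.nii.gz",
    "CHUM-001__radiotherapy_dosemaps.nii.gz",
    "README.txt",
]
grouped = group_by_patient(files)
print(sorted(grouped[("CHUM", "001")]))
```

Grouping by patient first makes it easy to check which cases have both CT and PT, and which also carry the optional dosimetry files.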
The new folder doseTr contains the dose maps and associated dosimetry CT for a subset of patients (approximately 650 cases). These can be used for Task 2 (RFS prediction).
The clinical information for each patient is contained in hecktor2025_clinical_info_training.csv. It includes center, gender, age, weight, tobacco and alcohol consumption, performance status (Zubrod), HPV status, and treatment (surgery and/or chemotherapy in addition to the radiotherapy that all patients underwent). Some information may be missing for some patients, although an effort has been made to update and complete it as much as possible for the 2025 edition. Survival events, and the time in days from the end of radiotherapy to the event or last follow-up, are provided in hecktor2025_patient_endpoint_training.csv.
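The two CSV files can be combined per patient before modelling. The sketch below is illustrative only: the column names (`PatientID`, `Age`, `HPV`, `Relapse`, `RFS`) and all values are assumptions made up for the example, since the actual headers are not listed here, and `join_on_patient` is a hypothetical helper.

```python
import csv
import io

def join_on_patient(clinical_rows, endpoint_rows, key="PatientID"):
    """Inner-join two lists of dict rows on a shared patient identifier."""
    endpoints = {row[key]: row for row in endpoint_rows}
    merged = []
    for row in clinical_rows:
        outcome = endpoints.get(row[key])
        if outcome is None:
            continue  # no endpoint recorded for this patient
        merged.append({**row, **outcome})
    return merged

# Tiny in-memory stand-ins for the two CSV files.
# All column names and values here are invented for illustration.
clinical_csv = "PatientID,Age,HPV\nCHUM-001,62,1\nCHUM-002,55,\n"
endpoint_csv = "PatientID,Relapse,RFS\nCHUM-001,0,1704\n"

clinical = list(csv.DictReader(io.StringIO(clinical_csv)))
endpoint = list(csv.DictReader(io.StringIO(endpoint_csv)))
table = join_on_patient(clinical, endpoint)
print(table)
```

With `csv.DictReader`, a missing value is read as an empty string, so missing clinical fields need explicit handling before any numeric conversion.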
For Task 3, the HPV status is included in the clinical information file.
Validation and Testing Process
For the HECKTOR 2025 challenge, we have implemented a new evaluation approach. No test data will be shared directly with participants. Instead, evaluation will be conducted exclusively through Docker container submissions on the Grand Challenge platform.
Dataset Description
Patients with histologically proven oropharyngeal H&N cancer who underwent radiotherapy and/or chemotherapy treatment planning were considered.
The data originates from FDG-PET and low-dose non-contrast-enhanced CT images (acquired with combined PET/CT scanners) of the H&N region.
Data were collected from 13 centers:
| Center | Acronym | PET/CT scanner | Included in HECKTOR 2022 |
|---|---|---|---|
| Hôpital général juif, Montréal, CA | HGJ | Discovery ST, GE Healthcare | Yes |
| Centre hospitalier universitaire de Sherbrooke, Sherbrooke, CA | CHUS | GeminiGXL 16, Philips | Yes |
| Hôpital Maisonneuve-Rosemont, Montréal, CA | HMR | Discovery STE, GE Healthcare | Yes |
| Centre hospitalier de l’Université de Montréal, Montréal, CA | CHUM | Discovery STE, GE Healthcare | Yes |
| Centre Hospitalier Universitaire Vaudois, CH | CHUV | Discovery D690 TOF, GE Healthcare | Yes |
| Centre Hospitalier Universitaire de Poitiers, FR | CHUP | Biograph mCT 40 ToF, Siemens | Yes |
| MD Anderson Cancer Center, Houston, Texas, USA | MDA | Discovery HR, Discovery RX, Discovery ST, Discovery STE (GE Healthcare) | Yes |
| UniversitätsSpital Zürich, CH | USZ | Discovery HR, Discovery RX, Discovery STE, Discovery LS, Discovery 690 (GE Healthcare) | Yes |
| Centre Henri Becquerel, Rouen, FR | CHB | GE710, GE Healthcare | Yes |
| Hôpitaux Universitaires de Genève, CH | HUG | Siemens Biograph 64 True Point scanner | No |
| Centre Hospitalier Universitaire de Brest, FR | CHUB | Philips GEMINI, Siemens Biograph, Siemens Biograph Vision | No |
| Centre Hospitalier Universitaire de Nantes, FR | CHUN | Siemens mCT 64 vision | No |
| Groupe d'Oncologie Radiothérapie Tête Et Cou, FR | GORTEC | Multiple hybrid PET/CT scanner devices | No |
Image metadata include the clinical center, scanner information, and DICOM metadata such as acquisition parameters and reconstruction algorithms. For Task 2, additional metadata for the dosimetry CT and dose maps are provided.
Patient information includes center, age, gender, tobacco and alcohol consumption, performance status, HPV status, treatment (radiotherapy only or additional chemotherapy and/or surgery), and M stage. T and N stage are not provided because this information reveals lymph node status, which is part of the goal of Task 1. HPV status is provided for the training set but not for the test set, since it is the ground truth of Task 3. Values may be missing for some patients, although an effort has been made to update this information as much as possible.
Each training and test case consists of one 3D FDG-PET volume registered with a 3D CT volume of the head and neck region. For Task 1, contours of the annotated ground-truth lesions are provided (available to participating teams for training cases only). The labels take three values: 0 for background, 1 for the primary Gross Tumor Volume (GTVp), and 2 for nodal Gross Tumor Volumes (GTVn); when a patient has several lymph nodes, they all share the same label. For Task 2, the cases also include the RFS outcome (time-to-event in days and censoring; available to participating teams for training cases only), as well as the dosimetry CT and the corresponding radiotherapy dose map for some of the patients.
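To illustrate the label encoding (0 for background, 1 for GTVp, 2 for GTVn), the sketch below computes the volume of each structure from a flattened label map. It uses only the standard library; reading the actual NIfTI files would require an imaging library and is not shown. The helper name `structure_volumes` and the `voxel_volume_ml` parameter are hypothetical, and the toy values are invented.

```python
from collections import Counter

# Label encoding as described in the dataset documentation.
LABEL_NAMES = {0: "background", 1: "GTVp", 2: "GTVn"}

def structure_volumes(labels, voxel_volume_ml):
    """Return the volume in mL of each labelled structure.

    `labels` is any iterable of integer voxel labels (a flattened label map).
    """
    counts = Counter(labels)
    unknown = set(counts) - set(LABEL_NAMES)
    if unknown:
        raise ValueError(f"unexpected label values: {sorted(unknown)}")
    return {
        name: counts.get(value, 0) * voxel_volume_ml
        for value, name in LABEL_NAMES.items()
        if value != 0  # background is not a structure of interest
    }

toy = [0, 0, 1, 1, 1, 2, 0, 2]  # invented 8-voxel label map
volumes = structure_volumes(toy, voxel_volume_ml=0.5)
print(volumes)
```

Because all nodal lesions share label 2, the GTVn value is the combined volume of every annotated lymph node in the case.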
The total number of cases is more than 1500 from at least 13 centers. The total number of training cases is approximately 883 from 9 different centers. The test cases of the 2022 challenge were moved to the 2025 training set. The total number of test cases is approximately 400 from at least 3 centers, consisting of new and previously unseen cases. The test set is estimated to consist of 80% HPV-positive cases and 20% HPV-negative cases.
Training and test cohorts are representative of the distribution of the real-world population of patients accepted for initial staging of oropharyngeal cancer.
The preprocessing of PET/CT images involves (for both the training and test cases): (i) computation of the Standardized Uptake Value (SUV) for the PET images and (ii) conversion of the DICOM file format to NIfTI format.
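For orientation, the sketch below shows one common way to compute SUV, the body-weight definition with decay correction of the injected dose. This is an assumption: the exact SUV variant and decay handling used by the organizers are not stated here. The function names are hypothetical, and the numeric inputs are invented.

```python
F18_HALF_LIFE_S = 6586.2  # physical half-life of fluorine-18, about 109.77 min

def decayed_dose_bq(injected_dose_bq, elapsed_s, half_life_s=F18_HALF_LIFE_S):
    """Dose remaining after radioactive decay between injection and scan."""
    return injected_dose_bq * 2.0 ** (-elapsed_s / half_life_s)

def suv_bw(activity_bq_per_ml, weight_kg, injected_dose_bq, elapsed_s):
    """Body-weight SUV: tissue activity divided by dose per gram of body weight."""
    dose = decayed_dose_bq(injected_dose_bq, elapsed_s)
    return activity_bq_per_ml * (weight_kg * 1000.0) / dose

# Invented example: 5 kBq/mL uptake, 70 kg patient, 350 MBq injected.
suv_at_injection = suv_bw(5000.0, 70.0, 350e6, 0.0)
suv_one_half_life = suv_bw(5000.0, 70.0, 350e6, F18_HALF_LIFE_S)
print(round(suv_at_injection, 3), round(suv_one_half_life, 3))
```

In this definition the same measured activity yields a higher SUV as the delay between injection and scan grows, because the reference dose has decayed further.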