Trainers

Specialized trainers for end-to-end Datamint workflows.

class datamint.lightning.trainers.BaseTrainer(dataset=None, project=None, *, dataset_kwargs=None, model=None, loss_fn=None, batch_size=16, num_workers=4, train_transform=None, eval_transform=None, split_as_of_timestamp=None, max_epochs=1, early_stopping_patience=10, mlflow_experiment_name=None, model_name=None, auto_deploy_adapter=True, trainer_kwargs=None, **kwargs)

Bases: ABC

Abstract base trainer encapsulating an end-to-end training workflow.

Subclasses provide task-specific defaults for model architecture, transforms, loss, and metrics by overriding the _build_* / _default_* hooks. Users typically only need to specify a project (or dataset) and optionally override a few settings.
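The hook mechanism described above follows the template-method pattern. The sketch below is a standalone illustration, not Datamint's implementation: `SketchTrainer`, `SketchClassifier`, `_default_loss`, `resolve_model`, and the dictionary returned by `_build_model` are hypothetical stand-ins, and the hook signature is simplified.

```python
from abc import ABC, abstractmethod


class SketchTrainer(ABC):
    """Hypothetical stand-in for a base trainer; not the Datamint class."""

    def __init__(self, model=None, loss_fn=None):
        self._user_model = model
        self._user_loss = loss_fn

    @abstractmethod
    def _build_model(self, loss_fn):
        """Hook: subclasses return a task-specific default model."""

    def _default_loss(self):
        """Hook (hypothetical name): subclasses may override the default loss."""
        return "generic-loss"

    def resolve_model(self):
        # A user-provided model wins; it owns its loss, so loss_fn is ignored.
        if self._user_model is not None:
            return self._user_model
        if self._user_loss is not None:
            loss = self._user_loss
        else:
            loss = self._default_loss()
        return self._build_model(loss)


class SketchClassifier(SketchTrainer):
    """Task-specific subclass that only fills in the hooks."""

    def _default_loss(self):
        return "cross-entropy"

    def _build_model(self, loss_fn):
        return {"arch": "resnet34", "loss": loss_fn}


default_model = SketchClassifier().resolve_model()
custom_loss_model = SketchClassifier(loss_fn="focal").resolve_model()
user_model = SketchClassifier(model="my-module", loss_fn="focal").resolve_model()
```

The base class fixes the order of operations while subclasses supply only the task-specific pieces, which is why most users can get by with a project name.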

Parameters:

dataset (DatamintBaseDataset | None) – A pre-built DatamintBaseDataset. Mutually exclusive with project.
project (str | Project | None) – Project name or Project object used to auto-build a dataset when dataset is None.
model (LightningModule | type[LightningModule] | None) – A user-provided LightningModule. When None the trainer builds a default one via _build_model().
loss_fn (Module | None) – Custom loss function forwarded to the default model. Ignored when model is provided (the user’s module owns its own loss).
batch_size (int) – Training batch size.
num_workers (int) – DataLoader workers.
train_transform (BaseCompose | None) – Albumentations transform for training. When None the trainer uses _train_transform().
eval_transform (BaseCompose | None) – Albumentations transform for val/test. When None the trainer uses _eval_transform().
split_as_of_timestamp (str | None) – Historical timestamp used to resolve project-scoped dataset splits during training. When omitted, the resolved project split datasets capture the current UTC timestamp and training lineage logs it via MLflow.
max_epochs (int) – Maximum number of training epochs.
early_stopping_patience (int | None) – Epochs without improvement before stopping. Set to None to disable early stopping.
mlflow_experiment_name (str | None) – MLflow experiment name. Auto-generated from the project name when None.
model_name (str | None) – Name for the model in the registry. Auto-generated when None.
auto_deploy_adapter (bool) – When True, auto-generate a DatamintModel adapter after training.
trainer_kwargs (dict[str, Any] | None) – Extra keyword arguments forwarded to lightning.Trainer.
dataset_kwargs (dict[str, Any] | None)
kwargs (Any)
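To make the meaning of early_stopping_patience concrete, here is a simplified model of patience-based stopping. It is a sketch under stated assumptions rather than the callback the trainer actually installs: `early_stop_epoch` is a hypothetical helper, and real callbacks usually offer extra options such as a minimum improvement delta.

```python
def early_stop_epoch(scores, patience, mode="max"):
    """Return the 0-based epoch at which training would stop, or None.

    Training stops once the monitored score has failed to improve for
    `patience` consecutive epochs. patience=None disables early stopping.
    """
    if patience is None:
        return None
    best = None
    stale_epochs = 0
    for epoch, score in enumerate(scores):
        if best is None:
            improved = True
        elif mode == "max":
            improved = score > best
        else:
            improved = score < best
        if improved:
            best = score
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                return epoch
    return None


# Made-up validation accuracies, one per epoch.
val_accuracy = [0.60, 0.70, 0.69, 0.68, 0.71, 0.70, 0.70, 0.70]
stop_at = early_stop_epoch(val_accuracy, patience=3)
never = early_stop_epoch(val_accuracy, patience=None)
```

With a patience of 3 the run above continues past the dip at epochs 2 and 3 because epoch 4 sets a new best, and stops only after three further epochs without improvement.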

property datamodule: DatamintDataModule

property dataset: DatamintBaseDataset

property experiment_name: str

fit()

Run the full training pipeline.

Return type: dict[str, Any]
Returns: Dictionary with keys 'trainer', 'model', 'test_results', and 'adapter' (when auto_deploy_adapter is enabled).
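Because the 'adapter' key is present only when auto_deploy_adapter is enabled, code that consumes the result should not assume it exists. A minimal sketch follows; `summarize_fit_results` is a hypothetical helper, and `fake_results` (including the 'test/accuracy' metric name) is a made-up stand-in for a real return value.

```python
from typing import Any


def summarize_fit_results(results: dict[str, Any]) -> dict[str, Any]:
    """Pull out the documented keys, tolerating a missing 'adapter'."""
    return {
        "has_adapter": results.get("adapter") is not None,
        "test_results": results["test_results"],
        "model_type": type(results["model"]).__name__,
    }


# Stand-in dictionary shaped like the documented return value,
# as if auto_deploy_adapter had been disabled.
fake_results = {
    "trainer": object(),
    "model": object(),
    "test_results": [{"test/accuracy": 0.9}],
}
summary = summarize_fit_results(fake_results)
```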

property model: LightningModule

test(register_model=True)

Run evaluation on the test split in a fresh run.

Parameters: register_model (bool) – When True, run a zero-epoch fit first so the checkpoint callback saves the current model to MLflow and registers it after test metrics are logged.
Return type: list[Mapping[str, float]]

class datamint.lightning.trainers.ClassificationTrainer(dataset=None, project=None, *, dataset_kwargs=None, model=None, loss_fn=None, batch_size=16, num_workers=4, train_transform=None, eval_transform=None, split_as_of_timestamp=None, max_epochs=1, early_stopping_patience=10, mlflow_experiment_name=None, model_name=None, auto_deploy_adapter=True, trainer_kwargs=None, **kwargs)

Bases: BaseTrainer

Abstract trainer for classification tasks.

Provides shared defaults:

Loss – CrossEntropyLoss.
Metrics – Multiclass Accuracy and macro F1 (torchmetrics).
Monitor – val/accuracy (maximise).
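For readers unfamiliar with the two default metrics, the sketch below computes accuracy and macro-averaged F1 in plain Python. It illustrates the definitions only; the trainer uses torchmetrics, whose handling of edge cases (for example, classes that never appear) may differ from the zero fallback assumed here.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for cls in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denominator = 2 * tp + fp + fn
        # Assumption: a class with no true or predicted samples scores 0.
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / num_classes


# Made-up labels for a three-class problem.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, num_classes=3)
```

Macro averaging gives every class equal weight, so it penalises poor performance on rare classes more than accuracy does.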

Parameters:

dataset (DatamintBaseDataset | None)
project (str | Project | None)
dataset_kwargs (dict[str, Any] | None)
model (LightningModule | type[LightningModule] | None)
loss_fn (Module | None)
batch_size (int)
num_workers (int)
train_transform (BaseCompose | None)
eval_transform (BaseCompose | None)
split_as_of_timestamp (str | None)
max_epochs (int)
early_stopping_patience (int | None)
mlflow_experiment_name (str | None)
model_name (str | None)
auto_deploy_adapter (bool)
trainer_kwargs (dict[str, Any] | None)
kwargs (Any)

class datamint.lightning.trainers.DeepLabV3PlusTrainer(dataset=None, project=None, *, image_size=None, slice_axis=None, model=None, in_channels=3, loss_fn=None, batch_size=16, num_workers=4, train_transform=None, eval_transform=None, split_as_of_timestamp=None, max_epochs=1, early_stopping_patience=10, mlflow_experiment_name=None, model_name=None, auto_deploy_adapter=True, trainer_kwargs=None, dataset_kwargs=None, encoder_name='resnet34', decoder_atrous_rates=(12, 24, 36), **kwargs)

Bases: SemanticSegmentation2DTrainer

Convenience trainer pre-configured for DeepLab v3+.

Uses the ASPP-based DeepLab v3+ architecture from segmentation_models_pytorch. The decoder_atrous_rates parameter controls the dilation rates of the Atrous Spatial Pyramid Pooling module, which is DeepLab v3+’s core multi-scale context mechanism.
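To give a feel for what the atrous rates mean, the sketch below computes the spatial extent covered by a dilated kernel. It assumes the 3×3 kernels that ASPP branches conventionally use; `effective_kernel_size` is a hypothetical helper and not part of segmentation_models_pytorch.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent (in pixels) covered by a dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


default_rates = (12, 24, 36)
# Extent of each ASPP branch on the encoder's feature map.
extents = [effective_kernel_size(3, rate) for rate in default_rates]
```

With the default rates the three branches span 25, 49, and 73 feature-map positions, so larger rates capture wider context at the same parameter count.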

Example:

trainer = DeepLabV3PlusTrainer(
    project='BUS_Segmentation',
    encoder_name='resnet50',
)
results = trainer.fit()

Parameters:

dataset (DatamintBaseDataset | None)
project (str | Project | None)
image_size (int | tuple[int, int] | None)
slice_axis (Literal['axial', 'sagittal', 'coronal'] | int | None)
model (LightningModule | type[LightningModule] | None)
in_channels (int)
loss_fn (Module | None)
batch_size (int)
num_workers (int)
train_transform (BaseCompose | None)
eval_transform (BaseCompose | None)
split_as_of_timestamp (str | None)
max_epochs (int)
early_stopping_patience (int | None)
mlflow_experiment_name (str | None)
model_name (str | None)
auto_deploy_adapter (bool)
trainer_kwargs (dict[str, Any] | None)
dataset_kwargs (dict[str, Any] | None)
encoder_name (str)
decoder_atrous_rates (tuple[int, int, int])
kwargs (Any)

class datamint.lightning.trainers.DetectionTrainer(dataset=None, project=None, *, dataset_kwargs=None, model=None, loss_fn=None, batch_size=16, num_workers=4, train_transform=None, eval_transform=None, split_as_of_timestamp=None, max_epochs=1, early_stopping_patience=10, mlflow_experiment_name=None, model_name=None, auto_deploy_adapter=True, trainer_kwargs=None, **kwargs)

Bases: BaseTrainer

Abstract trainer for object detection tasks.

Provides shared defaults for all detection models:

Dataset – ImageDataset with return_boxes=True
Collate – detection_collate_fn() (variable-length boxes)
Metrics – Mean Average Precision (torchmetrics)
Monitor – val/map (maximise)

Subclasses must implement _build_model(), _train_transform(), and _eval_transform().
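Detection batches need a custom collate step because each image can carry a different number of boxes, so targets cannot be stacked into one array. The sketch below shows the idea in plain Python; it is not the implementation of detection_collate_fn(), and the `(image, target)` sample layout with 'boxes' and 'labels' keys is an assumption made for illustration.

```python
def collate_variable_boxes(batch):
    """Group images and targets without padding the box lists.

    Each sample is an (image, target) pair where target['boxes'] may hold
    any number of [x_min, y_min, x_max, y_max] entries.
    """
    images = [image for image, _ in batch]
    targets = [target for _, target in batch]
    return images, targets


# Placeholder strings stand in for image tensors.
batch = [
    ("image_a", {"boxes": [[10, 10, 50, 60]], "labels": [1]}),
    ("image_b", {"boxes": [], "labels": []}),
    ("image_c", {"boxes": [[0, 0, 20, 20], [30, 30, 90, 80]], "labels": [2, 1]}),
]
images, targets = collate_variable_boxes(batch)
boxes_per_image = [len(target["boxes"]) for target in targets]
```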

Parameters:

dataset (DatamintBaseDataset | None)
project (str | Project | None)
dataset_kwargs (dict[str, Any] | None)
model (LightningModule | type[LightningModule] | None)
loss_fn (Module | None)
batch_size (int)
num_workers (int)
train_transform (BaseCompose | None)
eval_transform (BaseCompose | None)
split_as_of_timestamp (str | None)
max_epochs (int)
early_stopping_patience (int | None)
mlflow_experiment_name (str | None)
model_name (str | None)
auto_deploy_adapter (bool)
trainer_kwargs (dict[str, Any] | None)
kwargs (Any)

class datamint.lightning.trainers.EfficientNetV2Trainer(*, architecture='efficientnetv2_s', image_size=384, **kwargs)

Bases: ImageClassificationTrainer

Trainer pre-configured for EfficientNetV2.

Default model: EfficientNetV2-S pretrained on ImageNet at 384×384.

Parameters:

architecture (str) – timm model name. Defaults to 'efficientnetv2_s'. Other valid choices: 'efficientnetv2_m', 'efficientnetv2_l', 'efficientnetv2_xl', 'efficientnetv2_rw_t'.
image_size (int | tuple[int, int]) – Target image size (H, W) or a single int for square images. Defaults to 384, the resolution EfficientNetV2-S was trained at.
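The image_size argument accepts either form. A minimal sketch of how such a value can be normalised to an (H, W) pair is shown below; `normalize_image_size` is a hypothetical helper, not a Datamint function.

```python
def normalize_image_size(image_size):
    """Return an (H, W) tuple from an int or a 2-tuple."""
    if isinstance(image_size, int):
        # A single int means a square image.
        return (image_size, image_size)
    height, width = image_size
    return (int(height), int(width))


square = normalize_image_size(384)
rectangular = normalize_image_size((480, 640))
```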

Example:

trainer = EfficientNetV2Trainer(project='ChestXray')
results = trainer.fit()

Parameters: kwargs (Any)

class datamint.lightning.trainers.ImageClassificationTrainer(*, architecture='resnet34', pretrained=True, image_size=None, **kwargs)

Bases: ClassificationTrainer

Trainer for image classification tasks.

Default model: ResNet-34 (via timm) pretrained on ImageNet.

Parameters:

architecture (str) – timm model name. Defaults to 'resnet34'.
pretrained (bool) – Use pretrained weights. Defaults to True.
image_size (int | tuple[int, int] | None) – Optional target image size (H, W) or a single int for square images. When omitted, the trainer keeps the original image size instead of forcing a resize.

Example:

trainer = ImageClassificationTrainer(project='ChestXray')
results = trainer.fit()

Parameters: kwargs (Any)

class datamint.lightning.trainers.NNUNetTrainer(dataset=None, project=None, *, configuration='3d_fullres', fold=0, dataset_id=None, nnunet_work_dir=None, continue_training=False, channel_names=None, num_processes_preprocessing=None, max_epochs=1000, **kwargs)

Bases: BaseTrainer

Datamint trainer that runs nnUNet v2 as the training backend.

Exports the project data to nnUNet Task format, runs fingerprinting, planning, preprocessing, and training via _DatamintNNUNetTrainer, then imports predictions back as Datamint annotations.

Parameters:

project (str | Project | None) – Datamint project name or object.
configuration (str) – nnUNet configuration — '2d', '3d_fullres' (default), '3d_lowres', or '3d_cascade_fullres'.
fold (int | str) – Cross-validation fold index (0–4) or 'all'.
dataset_id (int | None) – Fixed nnUNet dataset ID (1–999). When None the ID is auto-assigned from the registry.
nnunet_work_dir (Path | str | None) – Root directory for all nnUNet I/O (nnUNet_raw, nnUNet_preprocessed, nnUNet_results are created beneath it). Defaults to ~/.cache/datamint/nnunet/.
continue_training (bool) – Resume from an existing checkpoint.
channel_names (dict[str, str] | None) – Modality index → name mapping passed to dataset.json, e.g. {'0': 'CT'}.
num_processes_preprocessing (int | None) – Workers for nnUNet’s preprocessing step. None lets nnUNet choose.
max_epochs (int) – Training epochs forwarded to nnUNetTrainer.
kwargs (Any)
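The constraints on fold and dataset_id can be checked before a long run starts. The sketch below is illustrative: both helpers are hypothetical, the `DatasetXXX_Name` folder pattern follows the nnUNet v2 naming convention, and the name suffix Datamint actually uses is not documented here.

```python
def nnunet_dataset_folder(dataset_id: int, name: str) -> str:
    """Build an nnUNet v2 style folder name with a zero-padded 3-digit ID."""
    if not 1 <= dataset_id <= 999:
        raise ValueError(f"dataset_id must be in 1..999, got {dataset_id}")
    return f"Dataset{dataset_id:03d}_{name}"


def validate_fold(fold):
    """Accept a fold index 0..4 or the string 'all'."""
    if fold == "all":
        return fold
    if isinstance(fold, int) and 0 <= fold <= 4:
        return fold
    raise ValueError(f"fold must be 0..4 or 'all', got {fold!r}")


folder = nnunet_dataset_folder(7, "Liver")  # "Liver" is a made-up name
fold = validate_fold("all")
```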

fit()

Run the full nnUNet training pipeline.

Steps in order:

Assign a stable nnUNet dataset ID for this project.
Set the three nnUNet_* environment variables.
Export all project resources to nnUNet Task format on disk.
Run dataset fingerprinting and experiment planning.
Run preprocessing for the configured nnUNet configuration.
Start an MLflow run.
Instantiate _DatamintNNUNetTrainer and call run_training().
Run inference on the test split (_run_prediction).
Import predictions as Datamint annotations (_import_predictions).

Return type: dict
Returns: {'bridge': bridge}, where bridge is the trained bridge instance.
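The environment-variable step in the pipeline above can be reproduced by hand when debugging. The following sketch assumes the layout described for nnunet_work_dir (three folders beneath one root); `configure_nnunet_dirs` is a hypothetical helper, and a temporary directory is used so the example leaves nothing behind.

```python
import os
import tempfile
from pathlib import Path

NNUNET_ENV_VARS = ("nnUNet_raw", "nnUNet_preprocessed", "nnUNet_results")


def configure_nnunet_dirs(work_dir):
    """Create the three nnUNet folders under work_dir and export their paths."""
    root = Path(work_dir).expanduser()
    paths = {}
    for name in NNUNET_ENV_VARS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        # nnUNet v2 reads these variables to locate its data.
        os.environ[name] = str(folder)
        paths[name] = folder
    return paths


with tempfile.TemporaryDirectory() as tmp:
    created = configure_nnunet_dirs(tmp)
    all_exist = all(path.is_dir() for path in created.values())
    env_matches = all(
        os.environ[name] == str(path) for name, path in created.items()
    )
```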

class datamint.lightning.trainers.SegmentationTrainer(dataset=None, project=None, *, dataset_kwargs=None, model=None, loss_fn=None, batch_size=16, num_workers=4, train_transform=None, eval_transform=None, split_as_of_timestamp=None, max_epochs=1, early_stopping_patience=10, mlflow_experiment_name=None, model_name=None, auto_deploy_adapter=True, trainer_kwargs=None, **kwargs)

Bases: BaseTrainer

Abstract trainer for segmentation tasks.

Provides shared defaults:

Loss – combined BCE + Dice (_BCEDiceLoss).
Metrics – Mean IoU and Generalised Dice Score (torchmetrics).
Monitor – val/iou (maximise).
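The sketch below shows one common way to combine BCE and Dice, along with IoU, for a flattened binary mask. It is an assumption-laden illustration: the equal weighting, the smoothing constant, and the function names are not taken from _BCEDiceLoss, which may differ.

```python
import math


def bce_dice_loss(probs, targets, eps=1e-7):
    """Equal-weight sum of binary cross-entropy and (1 - soft Dice)."""
    # Clip probabilities so log() never sees exactly 0 or 1.
    clipped = [min(max(p, eps), 1 - eps) for p in probs]
    bce = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for p, t in zip(clipped, targets)
    ) / len(targets)
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = (2 * intersection + eps) / (sum(probs) + sum(targets) + eps)
    return bce + (1 - dice)


def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks."""
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, true_mask) if p or t)
    # Assumption: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


targets = [1, 1, 0, 0]
perfect_loss = bce_dice_loss([1.0, 1.0, 0.0, 0.0], targets)
uncertain_loss = bce_dice_loss([0.5, 0.5, 0.5, 0.5], targets)
half_overlap = iou([1, 1, 0, 0], [1, 0, 0, 0])
```

BCE scores each pixel independently while Dice scores region overlap, so the combination stays informative when the foreground is a small fraction of the image.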

Parameters:

dataset (DatamintBaseDataset | None)
project (str | Project | None)
dataset_kwargs (dict[str, Any] | None)
model (LightningModule | type[LightningModule] | None)
loss_fn (Module | None)
batch_size (int)
num_workers (int)
train_transform (BaseCompose | None)
eval_transform (BaseCompose | None)
split_as_of_timestamp (str | None)
max_epochs (int)
early_stopping_patience (int | None)
mlflow_experiment_name (str | None)
model_name (str | None)
auto_deploy_adapter (bool)
trainer_kwargs (dict[str, Any] | None)
kwargs (Any)

class datamint.lightning.trainers.SemanticSegmentation2DTrainer(*, image_size=None, slice_axis=None, model=None, in_channels=3, trainer_kwargs=None, **kwargs)

Bases: SegmentationTrainer

Trainer for 2-D semantic segmentation.

Default model: UNet++ (segmentation_models_pytorch) with a resnet34 encoder pretrained on ImageNet.

When pointed at a project made of 3-D volumes, the trainer automatically converts it to a SlicedVolumeDataset and trains on 2-D slices instead.

Parameters:

slice_axis (Literal['axial', 'sagittal', 'coronal'] | int | None) – Slice axis override for 3-D volume projects. When omitted, the trainer tries to infer the most sensible anatomical plane and falls back to 'axial'.
image_size (int | tuple[int, int] | None) – Target image size (H, W) or a single int for square images. Forwarded to default transforms. When None a sensible default is chosen.
in_channels (int) – Number of input image channels. Defaults to 3.

All remaining keyword arguments are forwarded to BaseTrainer.

Example:

trainer = SemanticSegmentation2DTrainer(project='BUS_Segmentation')
results = trainer.fit()

Parameters:

model (LightningModule | type[LightningModule] | None)
trainer_kwargs (dict[str, Any] | None)
kwargs (Any)

class datamint.lightning.trainers.SemanticSegmentation3DTrainer(*, slice_axis='axial', encoder_name='resnet34', in_channels=3, image_size=None, **kwargs)

Bases: SegmentationTrainer

Trainer for 3-D semantic segmentation via per-slice 2-D training.

Builds a VolumeDataset, slices it along the chosen axis, and trains a 2-D segmentation model on individual slices.

Parameters:

slice_axis (str | int) – Slicing axis — 'axial', 'sagittal', 'coronal', or an integer axis index.
encoder_name (str) – SMP encoder backbone.
in_channels (int) – Number of input channels.
image_size (int | tuple[int, int] | None) – Optional target image size (H, W) or a single int for square images. When omitted, training keeps the original slice size.
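The effect of slice_axis on the training set can be seen from the volume shape alone. In the sketch below, `AXIS_BY_NAME` is a hypothetical name-to-index mapping (the real one depends on how volumes are oriented in memory) and `slice_layout` is an illustrative helper.

```python
# Hypothetical mapping; the actual axis order depends on volume orientation.
AXIS_BY_NAME = {"axial": 0, "coronal": 1, "sagittal": 2}


def slice_layout(volume_shape, slice_axis):
    """Return (number_of_slices, slice_shape) for slicing along slice_axis."""
    if isinstance(slice_axis, str):
        axis = AXIS_BY_NAME[slice_axis]
    else:
        axis = slice_axis
    num_slices = volume_shape[axis]
    slice_shape = tuple(
        size for index, size in enumerate(volume_shape) if index != axis
    )
    return num_slices, slice_shape


ct_shape = (120, 512, 512)  # made-up volume shape
axial = slice_layout(ct_shape, "axial")
by_index = slice_layout(ct_shape, 2)
```

Under these assumptions, one such volume contributes 120 training samples of 512×512 when sliced axially, and 512 samples of 120×512 when sliced along the last axis.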

Example:

trainer = SemanticSegmentation3DTrainer(
    project='CT_Liver',
    slice_axis='axial',
)
results = trainer.fit()

Parameters: kwargs (Any)

class datamint.lightning.trainers.TransUNetTrainer(dataset=None, project=None, *, image_size=None, slice_axis=None, model=None, in_channels=3, loss_fn=None, batch_size=16, num_workers=4, train_transform=None, eval_transform=None, split_as_of_timestamp=None, max_epochs=1, early_stopping_patience=10, mlflow_experiment_name=None, model_name=None, auto_deploy_adapter=True, trainer_kwargs=None, dataset_kwargs=None, variant='R50-ViT-B_16', pretrained=True, **kwargs)

Bases: SemanticSegmentation2DTrainer

Convenience trainer pre-configured for TransUNet.

Uses the R50-ViT-B/16 hybrid encoder with a Cascaded UPsampler (CUP) decoder from Chen et al. (2021). The backbone is timm’s vit_base_r50_s16_224, which is a drop-in match for the architecture described in the paper.

Example:

trainer = TransUNetTrainer(
    project='BUS_Segmentation',
)
results = trainer.fit()

Parameters:

dataset (DatamintBaseDataset | None)
project (str | Project | None)
image_size (int | tuple[int, int] | None)
slice_axis (Literal['axial', 'sagittal', 'coronal'] | int | None)
model (LightningModule | type[LightningModule] | None)
in_channels (int)
loss_fn (Module | None)
batch_size (int)
num_workers (int)
train_transform (BaseCompose | None)
eval_transform (BaseCompose | None)
split_as_of_timestamp (str | None)
max_epochs (int)
early_stopping_patience (int | None)
mlflow_experiment_name (str | None)
model_name (str | None)
auto_deploy_adapter (bool)
trainer_kwargs (dict[str, Any] | None)
dataset_kwargs (dict[str, Any] | None)
variant (str)
pretrained (bool)
kwargs (Any)

REQUIRED_IMAGE_SIZE: tuple[int, int] = (224, 224)
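A likely reason for the fixed input size, though not confirmed on this page, is that the pretrained ViT position embeddings are tied to a specific token grid. The sketch below computes that grid from the stride of 16 implied by the backbone name; `token_grid` is a hypothetical helper.

```python
REQUIRED_IMAGE_SIZE = (224, 224)
BACKBONE_STRIDE = 16  # the "s16" in the timm backbone name


def token_grid(image_size, stride=BACKBONE_STRIDE):
    """Return the (rows, cols) token grid produced by the hybrid encoder."""
    height, width = image_size
    if height % stride or width % stride:
        raise ValueError("image size must be a multiple of the stride")
    return (height // stride, width // stride)


grid = token_grid(REQUIRED_IMAGE_SIZE)
num_tokens = grid[0] * grid[1]
```

At 224×224 the encoder sees a 14×14 grid, that is, 196 tokens.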

class datamint.lightning.trainers.UNETRPPTrainer(dataset=None, project=None, *, patch_crop_size=(128, 128, 128), feature_size=16, num_heads=4, depths=None, sw_overlap=0.5, in_channels=1, model=None, loss_fn=None, batch_size=1, num_workers=4, split_as_of_timestamp=None, max_epochs=1, early_stopping_patience=10, mlflow_experiment_name=None, model_name=None, auto_deploy_adapter=True, trainer_kwargs=None, dataset_kwargs=None, **kwargs)

Bases: VolumeSegmentationTrainer

Convenience trainer pre-configured for UNETR++.

UNETR++ is a true 3-D segmentation model built on a hierarchical transformer encoder with Efficient Paired Attention (EPA) and a CNN decoder with skip connections.

Reference: Shaker et al., “UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation”, IEEE TMI 2024.

Example:

trainer = UNETRPPTrainer(project='CT_Liver')
results = trainer.fit()

Parameters:

dataset (DatamintBaseDataset | None) – A pre-built DatamintBaseDataset. Mutually exclusive with project.
project (str | Project | None) – Project name or Project object.
patch_crop_size (tuple[int, int, int]) – (D, H, W) patch randomly cropped from each volume during training and used as the sliding-window patch size at eval. Must be divisible by 32 in each dimension. Default (128, 128, 128).
feature_size (int) – Base channel width F. Encoder stage dims are [2F, 4F, 8F, 16F]. Default 16 matches the original paper.
num_heads (int) – Number of attention heads in EPA. Default 4.
depths (list[int] | None) – Number of transformer blocks per encoder stage (list of 4 ints). Default [3, 3, 3, 3] from the paper’s Synapse config.
sw_overlap (float) – Overlap ratio for sliding-window inference. Higher values reduce tiling artefacts at the cost of more compute. Default 0.5.
in_channels (int) – Number of input image channels (e.g. 1 for CT, 4 for multi-modal MRI). Default 1.
batch_size (int) – Training batch size (number of volumes). Defaults to 1 because 3-D volumes typically differ in spatial size.
loss_fn (Module | None) – Custom loss. Defaults to combined BCE + Dice.
max_epochs (int) – Maximum training epochs.
early_stopping_patience (int | None) – Epochs without validation improvement before stopping.
mlflow_experiment_name (str | None) – MLflow experiment name (auto-generated if None).
model_name (str | None) – Name for the model in the registry (auto-generated if None).
auto_deploy_adapter (bool) – Auto-generate a deployment adapter after training.
trainer_kwargs (dict[str, Any] | None) – Extra kwargs forwarded to lightning.Trainer.
dataset_kwargs (dict[str, Any] | None) – Extra kwargs forwarded to VolumeDataset.
model (LightningModule | type[LightningModule] | None)
num_workers (int)
split_as_of_timestamp (str | None)
kwargs (Any)
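Several of the numeric constraints above can be checked up front. The sketch below validates patch_crop_size, derives the encoder stage widths from feature_size, and lays out sliding windows along one axis. All three helpers are hypothetical, and the window layout is one common scheme, assumed here rather than taken from the Datamint implementation.

```python
import math


def validate_patch_crop_size(patch):
    """Reject patches whose dimensions are not multiples of 32."""
    if any(dim % 32 != 0 for dim in patch):
        raise ValueError(f"patch dimensions must be divisible by 32, got {patch}")
    return tuple(patch)


def encoder_stage_dims(feature_size):
    """Channel widths [2F, 4F, 8F, 16F] of the four encoder stages."""
    return [feature_size * multiplier for multiplier in (2, 4, 8, 16)]


def window_starts(length, patch, overlap):
    """Start offsets of sliding windows along one axis (assumed layout)."""
    if length <= patch:
        return [0]
    step = max(1, int(patch * (1 - overlap)))
    count = math.ceil((length - patch) / step) + 1
    # Clamp the last window so it ends exactly at the volume border.
    return [min(index * step, length - patch) for index in range(count)]


patch = validate_patch_crop_size((128, 128, 128))
dims = encoder_stage_dims(16)
starts = window_starts(256, 128, overlap=0.5)
```

A higher sw_overlap shrinks the step between windows, which is where the extra compute at evaluation time comes from.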

class datamint.lightning.trainers.UNetPPTrainer(dataset=None, project=None, *, image_size=None, slice_axis=None, model=None, in_channels=3, loss_fn=None, batch_size=16, num_workers=4, train_transform=None, eval_transform=None, split_as_of_timestamp=None, max_epochs=1, early_stopping_patience=10, mlflow_experiment_name=None, model_name=None, auto_deploy_adapter=True, trainer_kwargs=None, dataset_kwargs=None, encoder_name='resnet34', **kwargs)

Bases: SemanticSegmentation2DTrainer

Convenience trainer pre-configured for UNet++ with stronger augmentations.

Adds elastic transform and grid distortion to the default training pipeline — augmentations that are particularly effective for medical image segmentation.

Example:

trainer = UNetPPTrainer(
    project='BUS_Segmentation',
    encoder_name='resnet34',
)
results = trainer.fit()

Parameters:

dataset (DatamintBaseDataset | None)
project (str | Project | None)
image_size (int | tuple[int, int] | None)
slice_axis (Literal['axial', 'sagittal', 'coronal'] | int | None)
model (LightningModule | type[LightningModule] | None)
in_channels (int)
loss_fn (Module | None)
batch_size (int)
num_workers (int)
train_transform (BaseCompose | None)
eval_transform (BaseCompose | None)
split_as_of_timestamp (str | None)
max_epochs (int)
early_stopping_patience (int | None)
mlflow_experiment_name (str | None)
model_name (str | None)
auto_deploy_adapter (bool)
trainer_kwargs (dict[str, Any] | None)
dataset_kwargs (dict[str, Any] | None)
encoder_name (str)
kwargs (Any)

class datamint.lightning.trainers.VolumeSegmentationTrainer(*, patch_crop_size=(128, 128, 128), batch_size=1, **kwargs)

Bases: SegmentationTrainer

Abstract trainer for true 3-D volumetric segmentation.

Uses VolumeDataset directly (no slicing). Subclasses must implement _build_model().

Training uses the full volume (or a random patch cropped inside the Lightning module’s training_step). Eval uses one full volume at a time (eval_batch_size=1); sliding-window inference is handled by the Lightning module.
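The two ideas in this paragraph, random patch cropping and the reason batches default to one volume, can be sketched in plain Python. Both helpers below are hypothetical illustrations and do not reproduce what the Lightning module does internally.

```python
import random


def can_stack(shapes):
    """Volumes can share a batch only if every spatial shape is identical."""
    return len(set(shapes)) <= 1


def random_crop_origin(volume_shape, patch_size, rng):
    """Pick a random (d, h, w) corner so the patch fits inside the volume."""
    origin = []
    for size, patch in zip(volume_shape, patch_size):
        if patch > size:
            raise ValueError("patch larger than volume; padding would be needed")
        origin.append(rng.randint(0, size - patch))
    return tuple(origin)


rng = random.Random(0)  # seeded for repeatability
volume_shape = (160, 256, 256)  # made-up (D, H, W)
patch_size = (128, 128, 128)
origin = random_crop_origin(volume_shape, patch_size, rng)
mixed_batch_ok = can_stack([(160, 256, 256), (140, 256, 256)])
```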

Parameters:

patch_crop_size (tuple[int, int, int]) – (D, H, W) patch size used during training. Forwarded to the Lightning module for random cropping.
batch_size (int) – Number of volumes per training batch. Defaults to 1 because 3-D volumes cannot be stacked unless they are all the same spatial size — use 1 unless you are certain all your volumes match.

All remaining keyword arguments are forwarded to BaseTrainer.

Example:

class MyTrainer(VolumeSegmentationTrainer):
    def _build_model(self, loss_fn, metrics):
        return MyLightningModule(...)

trainer = MyTrainer(project='CT_Liver')
results = trainer.fit()

Parameters: kwargs (Any)

class datamint.lightning.trainers.YOLOXTrainer(dataset=None, project=None, *, model_size='s', conf_thre=0.25, nms_thre=0.45, image_size=640, model=None, loss_fn=None, batch_size=8, num_workers=4, train_transform=None, eval_transform=None, split_as_of_timestamp=None, max_epochs=50, early_stopping_patience=10, mlflow_experiment_name=None, model_name=None, auto_deploy_adapter=True, trainer_kwargs=None, dataset_kwargs=None, **kwargs)

Bases: DetectionTrainer

One-liner trainer for anchor-free object detection using YOLOX.

Wraps YOLOX (Apache 2.0) with sensible defaults so detection training requires only a project name and, optionally, a model size:

trainer = YOLOXTrainer(project='thyroid_nodules', model_size='s')
results = trainer.fit()

Parameters:

dataset (DatamintBaseDataset | None) – A pre-built ImageDataset. Mutually exclusive with project.
project (str | Project | None) – Project name or Project object. An ImageDataset is created automatically.
model_size (str) – YOLOX size variant — 'nano', 'tiny', 's', 'm', 'l', or 'x'. Defaults to 's', which balances speed and accuracy for most medical imaging tasks.
conf_thre (float) – Combined objectness × class-confidence threshold applied during NMS at inference time.
nms_thre (float) – IoU threshold for non-maximum suppression.
image_size (int | tuple[int, int]) – Target (H, W) size or a single int for square images. Both training and evaluation images are resized to this before being fed to YOLOX.
batch_size (int) – Training batch size.
num_workers (int) – DataLoader workers.
train_transform (BaseCompose | None) – Custom albumentations transform for training. When None the trainer uses its built-in augmentation pipeline.
eval_transform (BaseCompose | None) – Custom albumentations transform for val/test. When None the trainer uses resize + normalize.
max_epochs (int) – Maximum training epochs.
early_stopping_patience (int | None) – Patience for early stopping on val/map. Set to None to disable.
mlflow_experiment_name (str | None) – MLflow experiment name.
model_name (str | None) – Name for the model in the registry.
split_as_of_timestamp (str | None) – Historical timestamp for reproducible splits.
auto_deploy_adapter (bool) – Auto-log a deploy adapter after training.
trainer_kwargs (dict[str, Any] | None) – Extra kwargs forwarded to lightning.Trainer.
dataset_kwargs (dict[str, Any] | None) – Extra kwargs forwarded to ImageDataset.
model (LightningModule | type[LightningModule] | None)
loss_fn (Module | None)
kwargs (Any)
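To show how conf_thre and nms_thre interact at inference time, the sketch below scores each detection as objectness × class confidence, drops low scores, and then applies greedy non-maximum suppression. It is a simplified, class-agnostic illustration with hypothetical helper names and made-up detections, not the post-processing code shipped with YOLOX.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    x_min = max(box_a[0], box_b[0])
    y_min = max(box_a[1], box_b[1])
    x_max = min(box_a[2], box_b[2])
    y_max = min(box_a[3], box_b[3])
    intersection = max(0.0, x_max - x_min) * max(0.0, y_max - y_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def filter_detections(detections, conf_thre=0.25, nms_thre=0.45):
    """Keep confident, non-overlapping boxes.

    detections: list of (box, objectness, class_confidence) triples.
    """
    scored = [(box, obj * cls) for box, obj, cls in detections]
    candidates = sorted(
        (item for item in scored if item[1] >= conf_thre),
        key=lambda item: item[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Suppress a box if it overlaps a higher-scoring kept box too much.
        if all(box_iou(box, kept_box) <= nms_thre for kept_box, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((10, 10, 110, 110), 0.9, 0.9),    # strongest detection
    ((12, 12, 112, 112), 0.8, 0.8),    # near-duplicate of the first
    ((200, 200, 260, 260), 0.7, 0.6),  # separate object
    ((300, 300, 340, 340), 0.4, 0.5),  # combined score below conf_thre
]
kept = filter_detections(detections)
kept_boxes = [box for box, _ in kept]
```

Raising conf_thre trades recall for fewer false positives, while lowering nms_thre merges nearby detections more aggressively.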