Dataset Classes
The datamint.dataset module provides specialised PyTorch-compatible dataset
classes for different medical imaging modalities. Import them directly:
from datamint.dataset import ImageDataset, VolumeDataset, VideoDataset
Dataset Classes Overview
Split Modes
All dataset classes inherit split(), which supports three split modes:
- Local random splitting with ratio kwargs such as train=0.7.
- Project-scoped split assignments resolved through api.projects.get_splits().
- Legacy split:* resource tags, which remain available for backwards compatibility but are deprecated.
When you call split() without an explicit mode, the client chooses the mode automatically:
- If ratio kwargs are provided, a local random split is used.
- If no ratios are provided and the dataset was loaded from a project, project-scoped splits are used.
- Otherwise, legacy split:* resource tags are used.
from datamint.dataset import ImageDataset
dataset = ImageDataset(project="my-project", include_unannotated=True)
# Project-backed datasets prefer project-scoped assignments.
project_parts = dataset.split()
# Persist and replay the exact historical snapshot later.
snapshot = project_parts["train"].split_as_of_timestamp
replayed_parts = dataset.split(as_of_timestamp=snapshot)
# Force an ad hoc local split instead.
local_parts = dataset.split(train=0.8, val=0.2, seed=42)
To override the automatic selection, pass use_project_splits=True or
use_server_splits=True explicitly. use_server_splits is deprecated and
exists only for compatibility with older tag-based workflows.
Project-scoped splits require the dataset to be loaded from a project and must
not be combined with ratio kwargs. Each resolved subset records
split_name, split_source, and, when applicable,
split_as_of_timestamp so downstream training and MLflow lineage can reuse
the same split snapshot.
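The selection rules in this section can be summarised as a small decision function. This is only a sketch: choose_split_mode and its return values are hypothetical names invented for illustration, not part of the datamint API, and the handling of conflicting flags is an assumption based on the constraints stated above.

```python
from typing import Optional


def choose_split_mode(
    ratios: dict[str, float],
    loaded_from_project: bool,
    use_project_splits: Optional[bool] = None,
    use_server_splits: Optional[bool] = None,
) -> str:
    """Return 'local', 'project', or 'tags' following the rules described above.

    Hypothetical helper for illustration; not part of the datamint API.
    """
    # Assumption: asking for both explicit modes at once is a conflict.
    if use_project_splits and use_server_splits:
        raise ValueError("Choose at most one explicit split mode.")
    if (use_project_splits or use_server_splits) and ratios:
        raise ValueError("Ratio kwargs cannot be combined with an explicit split mode.")
    if use_project_splits:
        if not loaded_from_project:
            raise ValueError("Project-scoped splits require a project-backed dataset.")
        return "project"
    if use_server_splits:
        return "tags"  # deprecated legacy mode
    # No explicit mode requested: fall back to automatic selection.
    if ratios:
        return "local"
    if loaded_from_project:
        return "project"
    return "tags"
```

For example, choose_split_mode({}, loaded_from_project=True) yields 'project', while passing train=0.8 and val=0.2 as ratios yields 'local' regardless of where the dataset came from.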
Base Classes
DatamintBaseDataset - Abstract base class for all Datamint datasets.
Provides the PyTorch Dataset interface with transform support and annotation filtering, while delegating data management to DatamintProjectManager.
- class datamint.dataset.base.DatamintBaseDataset(project=None, resources=None, auto_update=True, api_key=None, server_url=None, return_metainfo=True, return_segmentations=True, return_as_semantic_segmentation=False, semantic_seg_merge_strategy=None, alb_transform=None, include_unannotated=True, include_annotators=None, exclude_annotators=None, include_segmentation_names=None, exclude_segmentation_names=None, include_image_label_names=None, exclude_image_label_names=None, include_frame_label_names=None, exclude_frame_label_names=None, allow_external_annotations=False, image_labels_merge_strategy=None, image_categories_merge_strategy=None, worklists='all')
Bases:
ABC, Dataset
Abstract base class for Datamint datasets.
This class provides the PyTorch Dataset interface with:
- Transform hooks (albumentations)
- Annotation filtering
- Data loading utilities
Subclasses must implement _get_raw_item() to define how data is loaded.
- Parameters:
project (str | Project | None) – Project name, Project object, or None. Mutually exclusive with resources.
resources (Sequence[Resource] | None) – List of Resource objects/IDs, or None. Mutually exclusive with project.
auto_update (bool) – If True, sync with server on init.
api_key (str | None) – API key for authentication.
server_url (str | None) – Datamint server URL.
all_annotations – If True, include unpublished annotations.
return_metainfo (bool) – If True, include metadata in output.
return_segmentations (bool) – If True, process and return segmentations.
return_as_semantic_segmentation (bool) – If True, convert to semantic format.
semantic_seg_merge_strategy (Literal['union', 'intersection', 'mode'] | None) – Strategy for merging multi-annotator segs.
alb_transform (Callable | BaseCompose | None) – Albumentations transform.
include_unannotated (bool) – If True, include resources without annotations.
include_annotators (list[str] | None) – Whitelist of annotators.
exclude_annotators (list[str] | None) – Blacklist of annotators.
include_segmentation_names (list[str] | None) – Whitelist of segmentation labels.
exclude_segmentation_names (list[str] | None) – Blacklist of segmentation labels.
include_image_label_names (list[str] | None) – Whitelist of image labels.
exclude_image_label_names (list[str] | None) – Blacklist of image labels.
include_frame_label_names (list[str] | None) – Whitelist of frame labels.
exclude_frame_label_names (list[str] | None) – Blacklist of frame labels.
allow_external_annotations (bool) – If True, allow and automatically include annotation labels that are not part of the project’s official schema (e.g., labels from other projects or legacy annotations). If False, these annotations will be filtered out.
image_labels_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
image_categories_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
worklists (Sequence[AnnotationWorklist] | Literal['all'] | None)
- __add__(other)
Concatenate datasets.
- Parameters:
other (DatamintBaseDataset)
- Return type:
ConcatDataset
- __getitem__(index)
Get item with full processing.
- Parameters:
index (int)
- Return type:
dict[str, Any]
- __iter__()
Iterate over dataset.
- Return type:
Iterator[dict[str,Any]]
- __len__()
Dataset length.
- Return type:
int
- add_transform(alb_transform)
- Parameters:
alb_transform (BaseCompose)
- Return type:
None
- abstractmethod apply_alb_transform(img, segmentations)
Apply albumentations transform to image and masks.
- Parameters:
img (ndarray)
segmentations (dict[str, ndarray])
- Return type:
dict[str, Any]
- Returns:
Dict with transformed 'image' and 'segmentations' (dict). It is recommended that 'image' has shape (C, depth, H, W) and each segmentation in 'segmentations' has shape (num_instances, depth, H, W), so that common downstream processing can be applied. If not, override _process_segmentations() accordingly.
- build_mlflow_dataset()
Create a DatamintMLflowDataset for this dataset.
- Return type:
DatamintMLflowDataset
- Returns:
An MLflow dataset wrapper for the current dataset.
- filter(*, tags=None, filename_pattern=None, has_annotations=None, annotation_names=None, custom_fn=None)
Return a new dataset containing only resources that match all specified criteria.
This method is chainable: the returned dataset supports the same interface, so you can write:
filtered = dataset.filter(tags=['busi']).filter(has_annotations=True)
or combine with split():
parts = dataset.filter(tags=['ultrasound']).split(train=0.8, test=0.2)
- Parameters:
tags (list[str] | None) – Keep resources whose tags contain any of the given values.
filename_pattern (str | None) – Keep resources whose filename matches this pattern (interpreted as a glob pattern, using fnmatch() internally).
has_annotations (bool | None) – If True, keep only resources with at least one annotation. If False, keep only those without annotations.
annotation_names (list[str] | None) – Keep resources that have at least one annotation whose identifier is in this list.
custom_fn (Callable[[Resource, Sequence[Annotation]], bool] | None) – Arbitrary predicate receiving (resource, annotations) and returning True to keep the resource.
- Return type:
DatamintBaseDataset
- Returns:
A new DatamintBaseDataset containing only the matching resources.
- Raises:
ValueError – If no filter criteria are specified.
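To make the "match all specified criteria" semantics concrete, the following sketch applies the same rules to plain dictionaries. The matches function and the dict-based records are hypothetical stand-ins for Resource and Annotation objects; only the use of fnmatch for filename patterns comes from the description above.

```python
from fnmatch import fnmatch
from typing import Callable, Optional, Sequence


def matches(
    resource: dict,
    annotations: Sequence[dict],
    *,
    tags: Optional[list[str]] = None,
    filename_pattern: Optional[str] = None,
    has_annotations: Optional[bool] = None,
    annotation_names: Optional[list[str]] = None,
    custom_fn: Optional[Callable[[dict, Sequence[dict]], bool]] = None,
) -> bool:
    """Return True if a resource passes every criterion that was supplied."""
    criteria = (tags, filename_pattern, has_annotations, annotation_names, custom_fn)
    if all(c is None for c in criteria):
        raise ValueError("At least one filter criterion must be specified.")
    # tags: keep the resource if it carries any of the requested tags.
    if tags is not None and not set(tags) & set(resource.get("tags", [])):
        return False
    # filename_pattern: glob-style match on the filename.
    if filename_pattern is not None and not fnmatch(resource["filename"], filename_pattern):
        return False
    # has_annotations: True keeps annotated resources, False keeps unannotated ones.
    if has_annotations is not None and bool(annotations) != has_annotations:
        return False
    # annotation_names: at least one annotation identifier must be in the list.
    if annotation_names is not None and not any(
        a["identifier"] in annotation_names for a in annotations
    ):
        return False
    if custom_fn is not None and not custom_fn(resource, annotations):
        return False
    return True
```

Chaining two filter() calls is then equivalent to requiring that a resource satisfies both sets of criteria.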
- property frame_labels_set: list[str]
Frame-level label names.
- get_collate_fn()
Get collate function for DataLoader.
- Return type:
Callable[[list[dict]],dict]
- get_dataloader(*args, **kwargs)
Get DataLoader with proper collate function.
- Return type:
DataLoader
- get_resource(index)
Get the Resource object for a given index.
- Parameters:
index (int)
- Return type:
Resource
- property image_categories_set: list[tuple[str, str]]
Image-level classification category names/values.
- property image_labels_set: list[str]
Image-level label names.
- prefetch(*, include_annotations=False)
Download and cache dataset files eagerly.
Ensures that resource file bytes are present in the local cache before training begins, so __getitem__ calls are served from disk rather than triggering on-demand network requests. When include_annotations is enabled, segmentation annotation payloads are cached too so DataLoader workers do not need to fetch them from the API.
Calls _prepare() implicitly if the dataset has not been initialised yet.
- Parameters:
include_annotations (bool) – Whether to also prefetch segmentation annotation payloads.
- Return type:
None
- resource_annotations: list[Sequence[Annotation]]
- property segmentation_labels_set: list[str]
Segmentation label names.
- set_transform(alb_transform=None)
Set transforms after initialization.
- Parameters:
alb_transform (BaseCompose | None)
- Return type:
None
- split(*, seed=None, use_server_splits=None, use_project_splits=None, as_of_timestamp=None, **splits)
Split the dataset into multiple named subsets.
The mode is selected automatically when no explicit split mode is given:
- If ratio kwargs are provided (e.g. train=0.7), local splitting is used.
- If no ratio kwargs are provided and the dataset was loaded from a project, project-scoped split assignments are used.
- Otherwise, server-side split:* tags on resources are used.
Examples:
# Local split
parts = dataset.split(train=0.7, val=0.15, test=0.15, seed=42)
train_ds = parts['train']
# Project-scoped split, inferred for project-backed datasets
parts = dataset.split()
# Explicit override
parts = dataset.split(use_project_splits=True)
- Parameters:
seed (int | None) – Random seed for reproducible local splitting.
use_project_splits (bool | None) – If True, read split assignments from the project splits API. If None (default), project-backed datasets prefer this mode when no ratios are provided.
as_of_timestamp (str | None) – Historical timestamp to resolve project-scoped splits against. When omitted for project-scoped splits, the current UTC timestamp is captured and stored on the resolved split datasets for later reuse.
use_server_splits (bool | None) – (DEPRECATED in favor of use_project_splits)
**splits (float) – Named split ratios (e.g. train=0.7, test=0.3). Must sum to 1.0 (±0.01 tolerance). Must be empty when use_server_splits or use_project_splits is True.
- Return type:
dict[str, DatamintBaseDataset]
- Returns:
Dictionary mapping split names to new dataset instances.
- Raises:
ValueError – If ratios are invalid or arguments conflict.
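A seeded local split with the ratio check described above might look roughly like the sketch below. local_split is a hypothetical helper operating on plain indices; the rounding rule and the choice to give the remainder to the last split are assumptions, not a description of how the library partitions resources.

```python
import random
from typing import Optional


def local_split(n_items: int, seed: Optional[int] = None, **splits: float) -> dict[str, list[int]]:
    """Partition range(n_items) into named index lists according to ratios."""
    if not splits:
        raise ValueError("Provide at least one ratio, e.g. train=0.7.")
    if any(ratio <= 0 for ratio in splits.values()):
        raise ValueError("Ratios must be positive.")
    if abs(sum(splits.values()) - 1.0) > 0.01:
        raise ValueError("Ratios must sum to 1.0 (within 0.01).")

    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded shuffle makes the split reproducible

    parts: dict[str, list[int]] = {}
    names = list(splits)
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = n_items  # last split takes the remainder so no item is dropped
        else:
            end = min(start + int(round(splits[name] * n_items)), n_items)
        parts[name] = indices[start:end]
        start = end
    return parts
```

Calling local_split(100, seed=42, train=0.7, val=0.15, test=0.15) twice returns identical index lists, which is the property the seed parameter is meant to provide.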
- subset(indices)
Create a dataset subset by slicing resources and annotations.
- Parameters:
indices (list[int])
- Return type:
- exception datamint.dataset.base.DatamintDatasetException
Bases:
DatamintException
Exception raised for dataset errors.
MultiFrameDataset - Abstract base for datasets with multiple frames per resource.
Shared logic for VolumeDataset (3D medical volumes) and VideoDataset (temporal video sequences). Both handle data with shape (C, N, H, W) where N is the number of frames/slices.
- class datamint.dataset.multiframe_dataset.MultiFrameDataset(project=None, resources=None, auto_update=True, api_key=None, server_url=None, return_metainfo=True, return_segmentations=True, return_as_semantic_segmentation=False, semantic_seg_merge_strategy=None, alb_transform=None, include_unannotated=True, include_annotators=None, exclude_annotators=None, include_segmentation_names=None, exclude_segmentation_names=None, include_image_label_names=None, exclude_image_label_names=None, include_frame_label_names=None, exclude_frame_label_names=None, allow_external_annotations=False, image_labels_merge_strategy=None, image_categories_merge_strategy=None, worklists='all')
Bases:
DatamintBaseDataset
Abstract base for multi-frame datasets.
Handles loading and augmenting data with shape (C, N, H, W) where N is the number of frames (temporal for video) or slices (spatial for volumes).
Subclasses add modality-specific features:
- VolumeDataset: anatomical slicing via .slice()
- VideoDataset: frame-by-frame iteration via .frame_by_frame()
- Parameters:
project (str | Project | None)
resources (Sequence[Resource] | None)
auto_update (bool)
api_key (str | None)
server_url (str | None)
return_metainfo (bool)
return_segmentations (bool)
return_as_semantic_segmentation (bool)
semantic_seg_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
alb_transform (Callable | BaseCompose | None)
include_unannotated (bool)
include_annotators (list[str] | None)
exclude_annotators (list[str] | None)
include_segmentation_names (list[str] | None)
exclude_segmentation_names (list[str] | None)
include_image_label_names (list[str] | None)
exclude_image_label_names (list[str] | None)
include_frame_label_names (list[str] | None)
exclude_frame_label_names (list[str] | None)
allow_external_annotations (bool)
image_labels_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
image_categories_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
worklists (Sequence[AnnotationWorklist] | Literal['all'] | None)
- apply_alb_transform(img, segmentations)
Apply albumentations transform to 4D image and masks.
- Parameters:
img (ndarray) – Image array of shape (C, depth, H, W).
segmentations (dict[str, ndarray]) – Dict of author -> mask arrays of shape (#instances, depth, H, W).
- Return type:
dict[str, Any]
- Returns:
Dict with transformed 'image' and 'segmentations'.
Specialised Datasets
ImageDataset
ImageDataset - Dataset for 2D images.
Handles standard 2D medical images like X-rays, pathology patches, single-frame DICOM, PNG, JPEG, etc.
- class datamint.dataset.image_dataset.ImageDataset(project=None, resources=None, auto_update=True, api_key=None, server_url=None, return_metainfo=True, return_segmentations=True, return_as_semantic_segmentation=False, semantic_seg_merge_strategy=None, alb_transform=None, include_unannotated=True, include_annotators=None, exclude_annotators=None, include_segmentation_names=None, exclude_segmentation_names=None, include_image_label_names=None, exclude_image_label_names=None, include_frame_label_names=None, exclude_frame_label_names=None, allow_external_annotations=False, image_labels_merge_strategy=None, image_categories_merge_strategy=None, worklists='all')
Bases:
VolumeDataset
Dataset for 2D medical images.
- Parameters:
project (str | Project | None)
resources (Sequence[Resource] | None)
auto_update (bool)
api_key (str | None)
server_url (str | None)
return_metainfo (bool)
return_segmentations (bool)
return_as_semantic_segmentation (bool)
semantic_seg_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
alb_transform (Callable | BaseCompose | None)
include_unannotated (bool)
include_annotators (list[str] | None)
exclude_annotators (list[str] | None)
include_segmentation_names (list[str] | None)
exclude_segmentation_names (list[str] | None)
include_image_label_names (list[str] | None)
exclude_image_label_names (list[str] | None)
include_frame_label_names (list[str] | None)
exclude_frame_label_names (list[str] | None)
allow_external_annotations (bool)
image_labels_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
image_categories_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
worklists (Sequence[AnnotationWorklist] | Literal['all'] | None)
- apply_alb_transform(img, segmentations)
Apply albumentations transform to 4D image and masks.
- Parameters:
img (ndarray) – Image array of shape (C, depth, H, W).
segmentations (dict[str, ndarray]) – Dict of author -> mask arrays of shape (#instances, depth, H, W).
- Return type:
dict[str, Any]
- Returns:
Dict with transformed 'image' and 'segmentations'.
VolumeDataset
VolumeDataset - Dataset for 3D medical volumes.
Handles NIfTI volumes, DICOM series, and other 3D medical imaging data with support for different slice orientations and affine preservation.
- class datamint.dataset.volume_dataset.VolumeDataset(project=None, resources=None, auto_update=True, api_key=None, server_url=None, return_metainfo=True, return_segmentations=True, return_as_semantic_segmentation=False, semantic_seg_merge_strategy=None, alb_transform=None, include_unannotated=True, include_annotators=None, exclude_annotators=None, include_segmentation_names=None, exclude_segmentation_names=None, include_image_label_names=None, exclude_image_label_names=None, include_frame_label_names=None, exclude_frame_label_names=None, allow_external_annotations=False, image_labels_merge_strategy=None, image_categories_merge_strategy=None, worklists='all')
Bases:
MultiFrameDataset
Dataset for 3D medical volumes.
Handles NIfTI (3D/4D), DICOM series, and other volumetric data. Inherits multi-frame loading and augmentation from MultiFrameDataset.
- Parameters:
project (str | Project | None)
resources (Sequence[Resource] | None)
auto_update (bool)
api_key (str | None)
server_url (str | None)
return_metainfo (bool)
return_segmentations (bool)
return_as_semantic_segmentation (bool)
semantic_seg_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
alb_transform (Callable | BaseCompose | None)
include_unannotated (bool)
include_annotators (list[str] | None)
exclude_annotators (list[str] | None)
include_segmentation_names (list[str] | None)
exclude_segmentation_names (list[str] | None)
include_image_label_names (list[str] | None)
exclude_image_label_names (list[str] | None)
include_frame_label_names (list[str] | None)
exclude_frame_label_names (list[str] | None)
allow_external_annotations (bool)
image_labels_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
image_categories_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
worklists (Sequence[AnnotationWorklist] | Literal['all'] | None)
- slice(axis='axial')
Create a 2D dataset by slicing this volume along an axis.
Each 3D volume is expanded into multiple 2D slices, one per depth index along the given axis. The returned dataset yields 2D items with shape (C, H, W) instead of (C, D, H, W).
Parsed volumes are cached to disk as gzip-compressed .npy.gz files. A shared in-memory LRU cache also keeps recently used full volumes to avoid repeated decompression when iterating neighboring slices.
- Parameters:
axis (str | int) – Slice orientation. One of 'axial' (depth), 'coronal' (height), 'sagittal' (width), or an integer axis index (0–2).
- Return type:
SlicedVolumeDataset
- Returns:
A SlicedVolumeDataset that iterates over individual 2D slices.
Example:
vol_ds = VolumeDataset(project='my_ct_project')
sliced = vol_ds.slice(axis='axial')
print(len(sliced))  # total number of axial slices across all volumes
item = sliced[0]
print(item['image'].shape)  # (C, H, W)
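Since the length of the sliced dataset is the total slice count across all volumes, a flat index has to be translated into a (volume, slice) pair. One common way to do this is with cumulative counts and a binary search, sketched below. AXIS_TO_DIM, build_slice_index, and locate are hypothetical names; the mapping of integer axis 0 to the depth dimension is an assumption, and the library's actual indexing scheme is not documented here.

```python
from bisect import bisect_right
from itertools import accumulate

# Assumed mapping from orientation name to array axis for (C, D, H, W) data,
# following the parameter description: axial=depth, coronal=height, sagittal=width.
AXIS_TO_DIM = {"axial": 1, "coronal": 2, "sagittal": 3}


def build_slice_index(volume_shapes, axis="axial"):
    """Return cumulative slice counts for a list of (C, D, H, W) shapes."""
    # Integer axes 0-2 are assumed to index the spatial dims, so skip the channel axis.
    dim = AXIS_TO_DIM[axis] if isinstance(axis, str) else axis + 1
    return list(accumulate(shape[dim] for shape in volume_shapes))


def locate(flat_index, cumulative):
    """Map a flat slice index to (volume_index, slice_index_within_volume)."""
    if not 0 <= flat_index < cumulative[-1]:
        raise IndexError(flat_index)
    vol = bisect_right(cumulative, flat_index)
    offset = cumulative[vol - 1] if vol else 0
    return vol, flat_index - offset
```

With two volumes of 30 and 45 axial slices, flat index 30 resolves to the first slice of the second volume, and the total length is 75.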
VideoDataset
VideoDataset - Dataset for video medical data.
Handles video files (MP4, AVI, etc.) and multi-frame DICOM data from modalities like ultrasound (US), angiography (XA), and fluoroscopy (RF).
- class datamint.dataset.video_dataset.VideoDataset(project=None, resources=None, auto_update=True, api_key=None, server_url=None, return_metainfo=True, return_segmentations=True, return_as_semantic_segmentation=False, semantic_seg_merge_strategy=None, alb_transform=None, include_unannotated=True, include_annotators=None, exclude_annotators=None, include_segmentation_names=None, exclude_segmentation_names=None, include_image_label_names=None, exclude_image_label_names=None, include_frame_label_names=None, exclude_frame_label_names=None, allow_external_annotations=False, image_labels_merge_strategy=None, image_categories_merge_strategy=None, worklists='all')
Bases:
MultiFrameDataset
Dataset for video medical data.
Each item is a full video with shape (C, N, H, W) where N is the number of frames. Inherits multi-frame loading and augmentation from MultiFrameDataset.
Supports video files (MP4, AVI, MOV) and multi-frame DICOM from temporal modalities (ultrasound, angiography, fluoroscopy).
Example:
ds = VideoDataset(project='my_ultrasound_project')
item = ds[0]
print(item['image'].shape)  # (C, N, H, W)
# Iterate frame-by-frame
frame_ds = ds.frame_by_frame()
print(frame_ds[0]['image'].shape)  # (C, H, W)
- Parameters:
project (str | Project | None)
resources (Sequence[Resource] | None)
auto_update (bool)
api_key (str | None)
server_url (str | None)
return_metainfo (bool)
return_segmentations (bool)
return_as_semantic_segmentation (bool)
semantic_seg_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
alb_transform (Callable | BaseCompose | None)
include_unannotated (bool)
include_annotators (list[str] | None)
exclude_annotators (list[str] | None)
include_segmentation_names (list[str] | None)
exclude_segmentation_names (list[str] | None)
include_image_label_names (list[str] | None)
exclude_image_label_names (list[str] | None)
include_frame_label_names (list[str] | None)
exclude_frame_label_names (list[str] | None)
allow_external_annotations (bool)
image_labels_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
image_categories_merge_strategy (Literal['union', 'intersection', 'mode'] | None)
worklists (Sequence[AnnotationWorklist] | Literal['all'] | None)
- frame_by_frame()
Create a 2D dataset iterating over individual video frames.
Each video is expanded into N individual frames. The returned dataset yields 2D items with shape (C, H, W) instead of (C, N, H, W).
Parsed frames are cached to disk as gzip-compressed .npy.gz files.
- Return type:
SlicedVideoDataset
- Returns:
A SlicedVideoDataset that iterates over individual frames.
Example:
vid_ds = VideoDataset(project='my_ultrasound_project')
frame_ds = vid_ds.frame_by_frame()
print(len(frame_ds))  # total number of frames across all videos
item = frame_ds[0]
print(item['image'].shape)  # (C, H, W)
Sliced Datasets
SlicedVolumeDataset
SlicedVolumeDataset - 2D dataset created by slicing a VolumeDataset along an axis.
Provides a way to iterate over individual 2D slices from 3D volume data, enabling training of 2D models on volumetric medical imaging data.
- class datamint.dataset.sliced_dataset.SlicedVolumeDataset(*, slice_axis='axial', parent_dataset=None, **kwargs)
Bases:
DatamintBaseDataset
2D dataset created by slicing a VolumeDataset along an axis.
Each item corresponds to a single 2D slice from a 3D volume. The __getitem__ returns arrays with shape (C, H, W) for images and (num_instances, H, W) or (num_labels+1, H, W) for segmentations.
Can be instantiated directly with all the same parameters as DatamintBaseDataset plus slice_axis, or created from an already-loaded dataset via the from_dataset() factory classmethod (which avoids additional server calls).
- Parameters:
project – Project name, Project object, or None. Mutually exclusive with resources.
resources – List of Resource objects, or None. Mutually exclusive with project.
slice_axis (Literal['axial', 'sagittal', 'coronal'] | int) – Slice orientation. One of 'axial' (depth), 'coronal' (height), 'sagittal' (width), or an integer axis index (0–2).
parent_dataset (DatamintBaseDataset | None)
See DatamintBaseDataset for all remaining parameters.
- __getitem__(index)
Get a 2D slice item with full processing.
Returns dict with:
- 'image': np.ndarray or Tensor of shape (C, H, W).
- 'segmentations' (if enabled): segmentation masks with depth dimension removed.
- 'image_labels': dict of annotator -> label tensor.
- Parameters:
index (int)
- Return type:
dict[str, Any]
- apply_alb_transform(img, segmentations)
Apply 2D albumentations transform to a single-slice image and masks.
Uses the same approach as ImageDataset: treats the data as 2D.
- Parameters:
img (ndarray) – Image array of shape (C, 1, H, W) or (C, H, W).
segmentations (dict[str, ndarray]) – Dict of author -> mask arrays of shape (#instances, 1, H, W) or (#instances, H, W).
- Return type:
dict[str, Any]
- Returns:
Dict with transformed 'image' and 'segmentations'.
- classmethod from_dataset(parent_dataset, slice_axis='axial')
Create a SlicedVolumeDataset from an existing dataset without additional server calls.
Copies all configuration, label mappings, and already-loaded resources from parent_dataset, then expands them into per-slice proxy resources. Use this factory when you already have a loaded dataset and want to obtain 2D slices without triggering new API requests.
- Parameters:
parent_dataset (DatamintBaseDataset) – The source DatamintBaseDataset (e.g. VolumeDataset) providing resources, annotations, and configuration.
slice_axis (Literal['axial', 'sagittal', 'coronal'] | int) – Slice orientation. One of 'axial' (depth), 'coronal' (height), 'sagittal' (width), or an integer axis index (0–2).
- Return type:
SlicedVolumeDataset
- Returns:
A new SlicedVolumeDataset instance.
SlicedVideoDataset
SlicedVideoDataset - 2D dataset created by iterating over frames of a VideoDataset.
Provides a way to iterate over individual 2D frames from video data, enabling training of 2D models on temporal medical imaging data.
- class datamint.dataset.sliced_video_dataset.SlicedVideoDataset(*args, **kwargs)
Bases:
DatamintBaseDataset
2D dataset created by iterating over frames of a video.
Each item corresponds to a single frame from a video. The __getitem__ returns arrays with shape (C, H, W) for images and (num_instances, H, W) or (num_labels+1, H, W) for segmentations.
Can be instantiated directly with all the same parameters as DatamintBaseDataset, or created from an already-loaded dataset via the from_dataset() factory classmethod (which avoids additional server calls).
- __getitem__(index)
Get a single frame item with full processing.
Returns dict with:
- 'image': np.ndarray or Tensor of shape (C, H, W).
- 'segmentations' (if enabled): segmentation masks of shape (num_instances, H, W) or (num_labels+1, H, W).
- 'image_labels': dict of annotator -> label tensor.
- Parameters:
index (int)
- Return type:
dict[str, Any]
- apply_alb_transform(img, segmentations)
Apply 2D albumentations transform to a single frame and masks.
- Parameters:
img (ndarray) – Image array of shape (C, H, W).
segmentations (dict[str, ndarray]) – Dict of author -> mask arrays of shape (#instances, 1, H, W) or (#instances, H, W).
- Return type:
dict[str, Any]
- Returns:
Dict with transformed 'image' and 'segmentations'.
- classmethod from_dataset(parent_dataset)
Create a SlicedVideoDataset from an existing dataset without additional server calls.
Copies all configuration, label mappings, and already-loaded resources from parent_dataset, then expands them into per-frame proxy resources.
- Parameters:
parent_dataset (DatamintBaseDataset) – The source dataset (e.g. VideoDataset).
- Return type:
SlicedVideoDataset
- Returns:
A new SlicedVideoDataset instance.
Annotation Processing
- class datamint.dataset.annotation.Annotation(id, identifier, scope, annotation_type, resource_id, created_by, annotation_worklist_id=None, status=None, frame_index=None, text_value=None, numeric_value=None, units=None, geometry=<factory>, created_at=None, approved_at=None, approved_by=None, associated_file=None, file=None, deleted=False, deleted_at=None, deleted_by=None, created_by_model=None, old_geometry=None, set_name=None, resource_filename=None, resource_modality=None, annotation_worklist_name=None, user_info=None, values=None)
Class representing an annotation from the Datamint API.
This class stores annotation data and provides methods for loading and saving annotations through the API handler.
- Parameters:
id (str) – Unique identifier for the annotation
identifier (str) – The annotation identifier/label name
scope (str) – Whether annotation applies to 'frame' or 'image'
annotation_type (str) – Type of annotation ('segmentation', 'label', 'category', etc.)
resource_id (str) – ID of the resource this annotation belongs to
annotation_worklist_id (str | None) – ID of the annotation worklist
created_by (str) – Email of the user who created the annotation
status (str | None) – Status of the annotation ('published', 'new', etc.)
frame_index (int | None) – Frame index for frame-scoped annotations
text_value (str | None) – Text value for category annotations
numeric_value (float | None) – Numeric value for numeric annotations
units (str | None) – Units for numeric annotations
geometry (list[Any]) – Geometry data for geometric annotations
created_at (str | None) – When the annotation was created
approved_at (str | None) – When the annotation was approved
approved_by (str | None) – Who approved the annotation
associated_file (str | None) – Path to associated file (for segmentations)
deleted (bool) – Whether the annotation is deleted
deleted_at (str | None) – When the annotation was deleted
deleted_by (str | None) – Who deleted the annotation
created_by_model (str | None) – Model ID if created by AI
old_geometry (Any | None) – Previous geometry data
set_name (str | None) – Set name for grouped annotations
resource_filename (str | None) – Filename of the associated resource
resource_modality (str | None) – Modality of the associated resource
annotation_worklist_name (str | None) – Name of the annotation worklist
user_info (dict[str, str] | None) – Information about the user who created the annotation
values (Any | None) – Additional values
file (str | None)
- __repr__()
String representation of the annotation.
- Return type:
str
- property added_by: str
Get the creator email (alias for created_by).
- annotation_type: str
- annotation_worklist_id: str | None = None
- annotation_worklist_name: str | None = None
- approved_at: str | None = None
- approved_by: str | None = None
- associated_file: str | None = None
- created_at: str | None = None
- created_by: str
- created_by_model: str | None = None
- deleted: bool = False
- deleted_at: str | None = None
- deleted_by: str | None = None
- file: str | None = None
- frame_index: int | None = None
- classmethod from_dict(data)
Create an Annotation instance from a dictionary.
- Parameters:
data (dict[str, Any]) – Dictionary containing annotation data from API
- Return type:
Annotation
- Returns:
Annotation instance
- geometry: list[Any]
- get_created_datetime()
Get the creation datetime as a datetime object.
- Return type:
datetime | None
- Returns:
datetime object or None if created_at is not set
- id: str
- identifier: str
- property index: int | None
Get the frame index (alias for frame_index).
- is_category()
Check if this is a category annotation.
- Return type:
bool
- is_frame_scoped()
Check if this annotation is frame-scoped.
- Return type:
bool
- is_image_scoped()
Check if this annotation is image-scoped.
- Return type:
bool
- is_label()
Check if this is a label annotation.
- Return type:
bool
- is_segmentation()
Check if this is a segmentation annotation.
- Return type:
bool
- property name: str
Get the annotation name (alias for identifier).
- numeric_value: float | None = None
- old_geometry: Any | None = None
- resource_filename: str | None = None
- resource_id: str
- resource_modality: str | None = None
- scope: str
- set_name: str | None = None
- status: str | None = None
- text_value: str | None = None
- to_dict()
Convert the annotation to a dictionary format.
- Return type:
dict[str, Any]
- Returns:
Dictionary representation of the annotation
- property type: str
Get the annotation type.
- units: str | None = None
- user_info: dict[str, str] | None = None
- property value: str | None
Get the annotation value (for category annotations).
- values: Any | None = None
AnnotationProcessor - Handles segmentation and label processing.
This module provides annotation processing classes for different dataset types:
- BaseAnnotationProcessor: Generic processor with shared logic for all dataset types
- ImageAnnotationProcessor: Processor for simple 2D images (no frame/slot concept)
- SequenceAnnotationProcessor: Extended processor for multi-frame/multi-slot data (videos, volumes)
The class hierarchy ensures that the base class contains only generic logic that works for any dataset type, while specialized logic is in subclasses.
- class datamint.dataset.annotation_processor.AnnotationProcessor(seglabel2code, image_labels_set, image_lcodes, allow_external_annotations=False)
Base processor for annotations - contains only generic shared logic.
This class provides generic annotation processing that works for any dataset type:
- Loading segmentation data from annotations (raw load, no frame handling)
- Generic merging strategies for semantic segmentations
- Label name conversion utilities
- Annotation filtering utilities
Subclasses (ImageAnnotationProcessor, SequenceAnnotationProcessor) handle dataset-specific logic like frame/slot assignment and dimension handling.
- Parameters:
seglabel2code (dict[str, int]) – Mapping from label name to code.
image_labels_set (list[str]) – List of image-level label names.
image_lcodes (dict[str, dict[str, int]]) – Mapping for image labels.
allow_external_annotations (bool)
- apply_merge_strategy(segmentations, strategy)
- Overloads:
self, segmentations (dict[str, Tensor]), strategy (MergeStrategy) → Tensor
self, segmentations (dict[str, np.ndarray]), strategy (MergeStrategy) → np.ndarray
Merge semantic segmentations from multiple annotators.
- Parameters:
segmentations (dict[str, Tensor] | dict[str, ndarray]) – Dict of author -> semantic segmentation tensor.
strategy (Literal['union', 'intersection', 'mode']) – Merge strategy (‘union’, ‘intersection’, ‘mode’).
- Returns:
Merged tensor if strategy is specified, otherwise original dict.
- Return type:
Tensor | ndarray
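The three merge strategies can be illustrated on plain binary masks. The following is a minimal sketch, not the library's implementation: `merge_masks` is a hypothetical helper, and treating 'mode' as a strict majority vote (ties resolve to background) is an assumption.

```python
import numpy as np


def merge_masks(masks_by_author: dict[str, np.ndarray], strategy: str) -> np.ndarray:
    """Merge same-shaped binary masks from several authors (illustrative only)."""
    stacked = np.stack(list(masks_by_author.values())).astype(bool)
    if strategy == "union":
        merged = stacked.any(axis=0)          # marked by at least one author
    elif strategy == "intersection":
        merged = stacked.all(axis=0)          # marked by every author
    elif strategy == "mode":
        # Assumed tie rule: a pixel is kept only if strictly more than half agree.
        merged = stacked.sum(axis=0) * 2 > stacked.shape[0]
    else:
        raise ValueError(f"Unknown strategy: {strategy!r}")
    return merged.astype(np.uint8)


masks = {
    "alice": np.array([[1, 1], [0, 0]]),
    "bob": np.array([[1, 0], [0, 0]]),
    "carol": np.array([[1, 1], [1, 0]]),
}
union = merge_masks(masks, "union")
intersection = merge_masks(masks, "intersection")
majority = merge_masks(masks, "mode")
```

Union is the most permissive choice and intersection the most conservative; a majority vote sits between the two when three or more annotators are available.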
- collate_frame_segmentations(fr_anns, depth=None)
- Parameters:
fr_anns (Sequence[Annotation])
depth (int | None)
- Return type:
tuple[ndarray|None,int]
- convert_image_categories(annotations)
Convert image-level category annotations to class index tensors.
For multiclass classification, we expect exclusively one valid category per image (per user), representing the target class index in CrossEntropyLoss. If multiple categories exist, the first one encountered is used.
- Parameters:
annotations (Sequence[Annotation]) – List of category annotations (image-scoped).
- Return type:
dict[str, Tensor]
- Returns:
Dict of annotator_id -> 0-D long tensor containing the class index.
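The "first category wins" rule can be sketched without any library objects. In this sketch `CategoryAnn` is a hypothetical stand-in for Annotation, and `class_index` is an assumed mapping from (identifier, value) pairs to class indices; plain ints stand in for the 0-D tensors.

```python
from dataclasses import dataclass


@dataclass
class CategoryAnn:
    """Hypothetical stand-in for a category Annotation."""
    author: str
    identifier: str
    value: str


def categories_to_indices(
    anns: list[CategoryAnn],
    class_index: dict[tuple[str, str], int],
) -> dict[str, int]:
    """Return one class index per author, keeping the first category seen."""
    result: dict[str, int] = {}
    for ann in anns:
        # setdefault leaves an existing entry untouched, so later
        # categories from the same author are ignored.
        result.setdefault(ann.author, class_index[(ann.identifier, ann.value)])
    return result


class_index = {("view", "frontal"): 0, ("view", "lateral"): 1}
anns = [
    CategoryAnn("alice", "view", "lateral"),
    CategoryAnn("alice", "view", "frontal"),  # ignored: alice already has one
    CategoryAnn("bob", "view", "frontal"),
]
indices = categories_to_indices(anns, class_index)
```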
- convert_image_labels(annotations)
Convert image-level label annotations to one-hot tensors.
- Parameters:
annotations (Sequence[Annotation]) – List of label annotations (image-scoped).
- Return type:
dict[str, Tensor]
- Returns:
Dict of annotator_id -> one-hot tensor of shape (num_labels,).
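As a rough illustration of the encoding, the sketch below turns one annotator's label names into a vector of length num_labels. `encode_labels` is a hypothetical helper, and a plain list stands in for the tensor.

```python
def encode_labels(label_names: list[str], labels_set: list[str]) -> list[int]:
    """Encode label names as a 0/1 vector aligned with labels_set."""
    position = {name: i for i, name in enumerate(labels_set)}
    vector = [0] * len(labels_set)
    for name in label_names:
        vector[position[name]] = 1  # several labels may be active at once
    return vector


labels_set = ["edema", "fracture", "nodule"]
encoded = encode_labels(["fracture", "nodule"], labels_set)
```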
- static filter_annotations(annotations, type='all', scope='all')
Filter annotations by type and scope.
- Parameters:
annotations (Sequence[Annotation]) – List of annotations.
type (Literal['label', 'category', 'segmentation', 'all']) – Filter by annotation type.
scope (Literal['frame', 'image', 'all']) – Filter by scope (frame/image).
- Return type:
list[Annotation]
- Returns:
Filtered list of annotations.
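The filtering semantics, where 'all' acts as a wildcard on either axis, can be sketched as follows. `Ann` and `filter_anns` are hypothetical names used only for illustration; the real method operates on Annotation objects.

```python
from dataclasses import dataclass


@dataclass
class Ann:
    """Hypothetical stand-in exposing only the two filtered fields."""
    type: str
    scope: str


def filter_anns(anns: list[Ann], type: str = "all", scope: str = "all") -> list[Ann]:
    """Keep annotations matching both filters; 'all' matches anything."""
    return [
        ann for ann in anns
        if type in ("all", ann.type) and scope in ("all", ann.scope)
    ]


anns = [
    Ann("segmentation", "frame"),
    Ann("segmentation", "image"),
    Ann("label", "image"),
]
frame_segs = filter_anns(anns, type="segmentation", scope="frame")
image_scoped = filter_anns(anns, scope="image")
```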
- static get_author(ann)
Return a consistent author key for an annotation.
Prefers created_by, falls back to created_by_model, then "unknown".
- Parameters:
ann (Annotation)
- Return type:
str
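The fallback chain can be sketched as below. `author_key` is a hypothetical helper, and treating empty strings the same as missing values is an assumption of this sketch.

```python
from types import SimpleNamespace


def author_key(ann) -> str:
    """Pick created_by, then created_by_model, then the literal 'unknown'."""
    return (
        getattr(ann, "created_by", None)
        or getattr(ann, "created_by_model", None)
        or "unknown"
    )


human = SimpleNamespace(created_by="alice@example.com", created_by_model=None)
model = SimpleNamespace(created_by=None, created_by_model="unet-v1")
orphan = SimpleNamespace()
```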
- group_annotations(annotations, by_author=False, by_identifier=False)
Group annotations by author and/or identifier.
- Parameters:
annotations (Iterable[Annotation]) – Iterable of Annotation objects.
by_author (bool) – If True, group by author.
by_identifier (bool) – If True, group by identifier.
- Return type:
dict[tuple, list[Annotation]]
- Returns:
Dict mapping grouping keys to lists of annotations.
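A grouping of this kind can be sketched with a defaultdict keyed by tuples. The exact key layout used by the library is not documented here, so the (author, identifier) ordering below is an assumption, and `group_anns` is a hypothetical helper.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_anns(anns, by_author: bool = False, by_identifier: bool = False) -> dict:
    """Group annotations under tuple keys built from the enabled fields."""
    groups: dict[tuple, list] = defaultdict(list)
    for ann in anns:
        key = []
        if by_author:
            key.append(ann.author)
        if by_identifier:
            key.append(ann.identifier)
        groups[tuple(key)].append(ann)
    return dict(groups)


anns = [
    SimpleNamespace(author="alice", identifier="tumor"),
    SimpleNamespace(author="alice", identifier="edema"),
    SimpleNamespace(author="bob", identifier="tumor"),
]
by_author = group_anns(anns, by_author=True)
by_both = group_anns(anns, by_author=True, by_identifier=True)
```

With both flags disabled, every annotation falls under the empty tuple key in this sketch.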
- instance_to_semantic_segmentation(segmentations, seg_labels, num_labels)
- Overloads:
self, segmentations (None), seg_labels (Tensor | np.ndarray), num_labels (int) → None
self, segmentations (Tensor), seg_labels (Tensor), num_labels (int) → Tensor
self, segmentations (np.ndarray), seg_labels (np.ndarray), num_labels (int) → np.ndarray
Convert instance segmentation to semantic segmentation for a sequence.
- Parameters:
segmentations (Tensor | ndarray | None) – Tensor/array of shape (num_instances, depth, H, W).
seg_labels (Tensor | ndarray) – Tensor/array of shape (num_instances,).
num_labels (int)
- Returns:
None if segmentations is None; otherwise a Tensor/array of shape (num_labels+1, depth, H, W), matching the input type.
- Return type:
Tensor | ndarray | None
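The shape change from instance to semantic layout can be sketched with NumPy. This is an illustration under stated assumptions, not the library's code: `instances_to_semantic` is a hypothetical helper, label codes are assumed to run from 1 to num_labels, and channel 0 is assumed to hold the background.

```python
import numpy as np


def instances_to_semantic(instances, seg_labels, num_labels: int):
    """Collapse (num_instances, depth, H, W) masks into (num_labels+1, depth, H, W)."""
    if instances is None:
        return None
    _, depth, height, width = instances.shape
    semantic = np.zeros((num_labels + 1, depth, height, width), dtype=bool)
    for mask, code in zip(instances, seg_labels):
        # Instances sharing a label code are OR-ed into the same channel.
        semantic[code] |= mask.astype(bool)
    # Assumed convention: channel 0 marks pixels covered by no label.
    semantic[0] = ~semantic[1:].any(axis=0)
    return semantic.astype(np.uint8)


instances = np.zeros((2, 1, 2, 2), dtype=np.uint8)
instances[0, 0, 0, 0] = 1  # first instance, label code 1
instances[1, 0, 0, 1] = 1  # second instance, label code 2
semantic = instances_to_semantic(instances, np.array([1, 2]), num_labels=2)
```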
- load_frame_segmentations(annotations)
Load frame-level segmentations.
- Parameters:
annotations (Iterable[Annotation]) – Iterable of Annotation objects (segmentation type).
- Returns:
segmentations: dict[author -> list of np.ndarray of shape (#num_instances, #frames, H, W)]
seg_labels: dict[author -> list of int codes]
seg_anns: dict[author -> list of list of Annotation objects]
- Return type:
tuple[dict[str,list],dict[str,list],dict[str,list]]
- load_image_segmentations(annotations)
Load segmentations defined at image scope.
- Parameters:
annotations (Iterable[Annotation]) – Iterable of Annotation objects (segmentation type).
- Returns:
segmentations: dict[author -> list of mask arrays of shape (#slices, H, W)]
seg_labels: dict[author -> list of int codes]
seg_anns: dict[author -> list of Annotation objects]
- Return type:
tuple[dict[str,list],dict[str,list],dict[str,list]]
- load_segmentation_data(ann, auto_convert_gray=True)
Load segmentation data from an annotation.
- Parameters:
ann (Annotation) – The annotation to load data from.
auto_convert_gray (bool) – If True, convert multi-channel grayscale to single channel.
- Returns:
Binary segmentation array with shape (N, H, W). For image-level annotations, N is the number of frames, slices, or the depth; for frame-level annotations, N=1.
- Return type:
ndarray
- load_segmentations(annotations)
Load segmentations for multi-slot data (videos, volumes).
- Parameters:
annotations (Iterable[Annotation]) – Iterable of Annotation objects (segmentation type).
- Returns:
segmentations: dict[author -> np.ndarray of shape (#num_instances, depth or #slices or #frames, H, W)]
seg_labels: dict[author -> np.ndarray of #num_instances ints]
seg_metainfos: dict[author -> list of Annotation objects]
- Return type:
tuple[dict[str,ndarray],dict[str,ndarray],dict[str,list]]
- static merge_image_categories(categories_by_user, strategy, num_categories)
Merge per-annotator category tensors into a single tensor.
For 'mode', returns a scalar long tensor with the majority class index (-1 if empty). For 'union' and 'intersection', returns a multi-hot int tensor of shape (num_categories,).
- Parameters:
categories_by_user (dict[str, Tensor]) – Dict of annotator_id -> scalar long tensor (class index).
strategy (Literal['union', 'intersection', 'mode']) – One of ‘union’, ‘intersection’, or ‘mode’.
num_categories (int) – Total number of (identifier, value) category classes.
- Return type:
Tensor
- Returns:
Scalar long tensor for ‘mode’; multi-hot int tensor for ‘union’/‘intersection’.
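The difference between the scalar result for 'mode' and the multi-hot result for the other two strategies can be sketched with plain Python values. `merge_categories` is a hypothetical helper; ints and lists stand in for tensors, and the intersection rule (a class is set only when every annotator chose it) is an assumption of this sketch.

```python
from collections import Counter


def merge_categories(categories_by_user: dict[str, int], strategy: str, num_categories: int):
    """Return a class index for 'mode', or a multi-hot list otherwise."""
    votes = list(categories_by_user.values())
    if strategy == "mode":
        # -1 signals that no annotator provided a category.
        return Counter(votes).most_common(1)[0][0] if votes else -1
    if strategy == "union":
        chosen = set(votes)
    elif strategy == "intersection":
        # Each annotator picks one class, so the intersection is non-empty
        # only when everyone picked the same one.
        chosen = set(votes) if len(set(votes)) == 1 else set()
    else:
        raise ValueError(f"Unknown strategy: {strategy!r}")
    return [1 if i in chosen else 0 for i in range(num_categories)]


votes = {"alice": 2, "bob": 2, "carol": 0}
majority_class = merge_categories(votes, "mode", num_categories=3)
any_vote = merge_categories(votes, "union", num_categories=3)
all_agree = merge_categories(votes, "intersection", num_categories=3)
```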
- merge_image_labels(labels_by_user, strategy)
Merge per-annotator label tensors into a single binary tensor.
- Parameters:
labels_by_user (dict[str, Tensor]) – Dict of annotator_id -> binary label tensor of shape (num_labels,).
strategy (Literal['union', 'intersection', 'mode']) – One of ‘union’, ‘intersection’, or ‘mode’.
- Return type:
Tensor
- Returns:
Merged label tensor of shape (num_labels,), dtype int32.
- resolve_seg_code(identifier)
Resolve a segmentation label name to its integer code.
If the label is unknown and allow_external_annotations is True, a new code is assigned and stored in seglabel2code. Otherwise raises ValueError.
- Parameters:
identifier (str) – Segmentation label name.
- Return type:
int
- Returns:
Integer code for the label.
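The lookup-or-register behaviour can be sketched as follows. `resolve_code` is a hypothetical free function, and assigning the next code as the current maximum plus one is an assumption; the library may choose new codes differently.

```python
def resolve_code(seglabel2code: dict[str, int], identifier: str,
                 allow_external: bool = False) -> int:
    """Look up a label code, optionally registering unknown labels."""
    if identifier in seglabel2code:
        return seglabel2code[identifier]
    if not allow_external:
        raise ValueError(f"Unknown segmentation label: {identifier!r}")
    # Assumed numbering scheme: one past the highest code in use.
    new_code = max(seglabel2code.values(), default=0) + 1
    seglabel2code[identifier] = new_code
    return new_code


codes = {"tumor": 1, "edema": 2}
known = resolve_code(codes, "tumor")
added = resolve_code(codes, "necrosis", allow_external=True)

try:
    resolve_code(codes, "cyst")
    rejected = False
except ValueError:
    rejected = True
```

Because the mapping is mutated in place, a label registered once resolves to the same code on later calls.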
Legacy Classes (Deprecated)
Deprecated: The classes below are kept for backwards compatibility and may be removed in a future release. Use ImageDataset or VolumeDataset instead.
- class datamint.dataset.dataset.DatamintDataset(project_name, root=None, auto_update=True, api_key=None, server_url=None, return_dicom=False, return_metainfo=True, return_frame_by_frame=False, return_annotations=True, return_segmentations=True, return_as_semantic_segmentation=False, image_transform=None, mask_transform=None, alb_transform=None, semantic_seg_merge_strategy=None, include_unannotated=True, include_annotators=None, exclude_annotators=None, include_segmentation_names=None, exclude_segmentation_names=None, include_image_label_names=None, exclude_image_label_names=None, include_frame_label_names=None, exclude_frame_label_names=None, all_annotations=False)
Bases: DatamintBaseDataset
This Dataset class extends the DatamintBaseDataset class so that it can be used easily with PyTorch. It also adds functionality for processing annotations and segmentations.
Note
Import using from datamint import Dataset.
- Parameters:
root (str | None) – Root directory of dataset where data already exists or will be downloaded.
project_name (str) – Name of the project to download.
auto_update (bool) – If True, the dataset will be checked for updates and downloaded if necessary.
api_key (str | None) – API key to access the Datamint API. If not provided, it will look for the environment variable ‘DATAMINT_API_KEY’. Not necessary if you don’t want to download/update the dataset.
return_dicom (bool) – If True, the DICOM object will be returned, if the image is a DICOM file.
return_metainfo (bool) – If True, the metainfo of the image will be returned.
return_annotations (bool) – If True, the annotations of the image will be returned.
return_frame_by_frame (bool) – If True, each frame of a video/DICOM/3d-image will be returned separately.
include_unannotated (bool) – If True, images without annotations will be included. If False, images without annotations will be discarded.
all_annotations (bool) – If True, all annotations will be downloaded, including the ones that are not set as closed/done.
server_url (str | None) – URL of the Datamint server. If not provided, it will use the default server.
return_segmentations (bool) – If True (default), the segmentations of the image will be returned in the ‘segmentations’ key.
return_as_semantic_segmentation (bool) – If True, the segmentations will be returned as semantic segmentation.
image_transform (Callable[[Tensor], Any] | None) – A function to transform the image.
mask_transform (Callable[[Tensor], Any] | None) – A function to transform the mask.
semantic_seg_merge_strategy (Literal['union', 'intersection', 'mode'] | None) – If not None, the segmentations will be merged using this strategy. Possible values are ‘union’, ‘intersection’, ‘mode’.
include_annotators (list[str] | None) – List of annotators to include. If None, all annotators will be included. See parameter exclude_annotators.
exclude_annotators (list[str] | None) – List of annotators to exclude. If None, no annotators will be excluded. See parameter include_annotators.
include_segmentation_names (list[str] | None) – List of segmentation names to include. If None, all segmentations will be included.
exclude_segmentation_names (list[str] | None) – List of segmentation names to exclude. If None, no segmentations will be excluded.
include_image_label_names (list[str] | None) – List of image label names to include. If None, all image labels will be included.
exclude_image_label_names (list[str] | None) – List of image label names to exclude. If None, no image labels will be excluded.
include_frame_label_names (list[str] | None) – List of frame label names to include. If None, all frame labels will be included.
exclude_frame_label_names (list[str] | None) – List of frame label names to exclude. If None, no frame labels will be excluded.
alb_transform (BasicTransform | None)
- __getitem__(index)
Get the item at the given index.
- Parameters:
index (int) – Index of the item to return.
- Returns:
A dictionary with the following keys:
‘image’ (Tensor): Tensor of shape (C, H, W) or (N, C, H, W), depending on self.return_frame_by_frame. If self.return_as_semantic_segmentation is True, the image is a tensor of shape (N, L, H, W) or (L, H, W), where L is the number of segmentation labels + 1 (background): L=len(self.segmentation_labels_set)+1.
‘metainfo’ (dict): Dictionary with metadata information.
‘segmentations’ (dict[str, list[Tensor]] or dict[str, Tensor] or Tensor): Segmentation masks, depending on the configuration of parameters self.return_segmentations, self.return_as_semantic_segmentation, self.return_frame_by_frame, self.semantic_seg_merge_strategy.
‘seg_labels’ (dict[str, list[Tensor]] or Tensor): Segmentation labels with the same length as segmentations.
‘frame_labels’ (dict[str, Tensor]): Frame-level labels.
‘image_labels’ (dict[str, Tensor]): Image-level labels.
- Return type:
dict[str,Any]
- apply_semantic_seg_merge_strategy(segmentations, nframes, h, w)
- Parameters:
segmentations (dict[str, Tensor])
nframes (int)
- Return type:
Tensor|dict[str,Tensor]
- class datamint.dataset.base_dataset.DatamintBaseDataset(project_name, root=None, auto_update=True, api_key=None, server_url=None, return_dicom=False, return_metainfo=True, return_annotations=True, return_frame_by_frame=False, include_unannotated=True, all_annotations=False, include_annotators=None, exclude_annotators=None, include_segmentation_names=None, exclude_segmentation_names=None, include_image_label_names=None, exclude_image_label_names=None, include_frame_label_names=None, exclude_frame_label_names=None)
Class to download and load datasets from the Datamint API.
- Parameters:
project_name (str) – Name of the project to download.
root (str | None) – Root directory of dataset where data already exists or will be downloaded.
auto_update (bool) – If True, the dataset will be checked for updates and downloaded if necessary.
api_key (str | None) – API key to access the Datamint API. If not provided, it will look for the environment variable ‘DATAMINT_API_KEY’. Not necessary if you don’t want to download/update the dataset.
return_dicom (bool) – If True, the DICOM object will be returned, if the image is a DICOM file.
return_metainfo (bool) – If True, the metainfo of the image will be returned.
return_annotations (bool) – If True, the annotations of the image will be returned.
return_frame_by_frame (bool) – If True, each frame of a video/DICOM/3d-image will be returned separately.
include_unannotated (bool) – If True, images without annotations will be included.
all_annotations (bool) – If True, all annotations will be downloaded, including the ones that are not set as closed/done.
server_url (str | None) – URL of the Datamint server. If not provided, it will use the default server.
include_annotators (list[str] | None) – List of annotators to include. If None, all annotators will be included.
exclude_annotators (list[str] | None) – List of annotators to exclude. If None, no annotators will be excluded.
include_segmentation_names (list[str] | None) – List of segmentation names to include. If None, all segmentations will be included.
exclude_segmentation_names (list[str] | None) – List of segmentation names to exclude. If None, no segmentations will be excluded.
include_image_label_names (list[str] | None) – List of image label names to include. If None, all image labels will be included.
exclude_image_label_names (list[str] | None) – List of image label names to exclude. If None, no image labels will be excluded.
include_frame_label_names (list[str] | None) – List of frame label names to include. If None, all frame labels will be included.
exclude_frame_label_names (list[str] | None) – List of frame label names to exclude. If None, no frame labels will be excluded.
- DATAMINT_DATASETS_DIR = 'datasets'
- __add__(other)
Concatenate datasets.
- __getitem__(index)
Get item at index.
- Parameters:
index (int) – Index of the item to return.
- Return type:
dict[str, Tensor | FileDataset | dict | list]
- Returns:
A dictionary containing ‘image’, ‘metainfo’ and ‘annotations’ keys.
- __iter__()
Iterate over dataset items.
- __len__()
Return dataset length.
- Return type:
int
- __repr__()
String representation of the dataset.
- Return type:
str
- property frame_categories_set: list[tuple[str, str]]
Returns the set of categories in the dataset (multi-class tasks).
- property frame_labels_set: list[str]
Returns the set of independent labels in the dataset (multi-label tasks).
- get_annotations(index, type='all', scope='all')
Returns the annotations of the image at the given index.
- Parameters:
index (int) – Index of the image.
type (Literal['label', 'category', 'segmentation', 'all']) – The type of the annotations. Can be ‘label’, ‘category’, ‘segmentation’ or ‘all’.
scope (Literal['frame', 'image', 'all']) – The scope of the annotations. Can be ‘frame’, ‘image’ or ‘all’.
- Return type:
list[Annotation]
- Returns:
The annotations of the image.
- get_collate_fn()
Get collate function for DataLoader.
- Return type:
Callable
- get_dataloader(*args, **kwargs)
Returns a DataLoader for the dataset with proper collate function.
- Parameters:
*args – Positional arguments for the DataLoader.
**kwargs – Keyword arguments for the DataLoader.
- Return type:
DataLoader
- Returns:
DataLoader instance with custom collate function.
- get_framelabel_distribution(normalize=False)
Returns the distribution of frame labels in the dataset.
- Parameters:
normalize (bool)
- Return type:
dict[str, float]
- get_info()
Get project information from API.
- Return type:
dict
- get_resources_ids()
Get list of resource IDs.
- Return type:
list[str]
- get_segmentationlabel_distribution(normalize=False)
Returns the distribution of segmentation labels in the dataset.
- Parameters:
normalize (bool)
- Return type:
dict[str, float]
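The effect of the normalize flag can be sketched on a flat list of label names. `label_distribution` is a hypothetical helper; how the library counts labels (per annotation, per frame, or per resource) is not specified here, so per-occurrence counting is an assumption.

```python
from collections import Counter


def label_distribution(labels: list[str], normalize: bool = False) -> dict[str, float]:
    """Count label occurrences, optionally scaled so the values sum to 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    if normalize and total > 0:
        return {name: count / total for name, count in counts.items()}
    return {name: float(count) for name, count in counts.items()}


labels = ["tumor", "tumor", "edema", "tumor"]
raw = label_distribution(labels)
fractions = label_distribution(labels, normalize=True)
```

Normalized values make class imbalance easier to compare across datasets of different sizes.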
- property image_categories_set: list[tuple[str, str]]
Returns the set of categories in the dataset (multi-class tasks).
- property image_labels_set: list[str]
Returns the set of independent labels in the dataset (multi-label tasks).
- static read_number_of_frames(filepath)
Read the number of frames in a file.
- Parameters:
filepath (str)
- Return type:
int
- property segmentation_labels_set: list[str]
Returns the set of segmentation labels in the dataset.
- subset(indices)
Create a subset of the dataset.
- Parameters:
indices (list[int]) – List of indices to include in the subset.
- Returns:
Self with updated subset indices.
- exception datamint.dataset.base_dataset.DatamintDatasetException