Entities

The datamint.entities module provides the core data structures that represent various objects within the DataMint ecosystem. These entities are built using Pydantic models, ensuring robust data validation, type safety, and seamless serialization/deserialization when interacting with the DataMint API.

In most user code, prefer working with entity instances instead of passing raw IDs between endpoint calls. Project, Resource, and Annotation can usually be passed directly back into API methods, and they expose convenience helpers for common workflows.

Entity-first Workflows

Project objects

from datamint import Api

api = Api()
project = api.projects.get_by_name("Liver Review")

resources = project.fetch_resources()
project.cache_resources(progress_bar=False)

if resources:
    project.set_work_status(resources[0], "annotated")

specs = project.get_annotations_specs()
print([spec.identifier for spec in specs])

Resource objects

resource = api.resources.get_list(project_name="Liver Review")[0]

image = resource.fetch_file_data(auto_convert=True, use_cache=True)
annotations = resource.fetch_annotations(annotation_type="segmentation")

print(resource.filename, len(annotations))

Annotation objects

annotation = resource.fetch_annotations()[0]

annotation_data = annotation.fetch_file_data(use_cache=True)
source_resource = annotation.resource

print(annotation.name, source_resource.filename)

Reference

DataMint entities package.

class datamint.entities.Annotation(*, name, scope, annotation_type, confiability=1.0, id=None, frame_index=None, text_value=None, numeric_value=None, units=None, geometry=None, created_at=None, created_by=None, annotation_worklist_id=None, imported_from=None, import_author=None, status=None, approved_at=None, approved_by=None, resource_id=None, associated_file=None, deleted=False, deleted_at=None, deleted_by=None, created_by_model=None, is_model=None, model_id=None, set_name=None, resource_filename=None, resource_modality=None, annotation_worklist_name=None, user_info=None, values='MISSING_FIELD', file=None, **data)

Bases: AnnotationBase

Pydantic Model representing a DataMint annotation.

id

Unique identifier for the annotation.

identifier

User-friendly identifier or label for the annotation.

scope

Scope of the annotation (e.g., “frame”, “image”).

frame_index

Index of the frame if scope is frame-based.

annotation_type

Type of annotation (e.g., “segmentation”, “bbox”, “label”).

text_value

Optional text value associated with the annotation.

numeric_value

Optional numeric value associated with the annotation.

units

Optional units for numeric_value.

geometry

Optional geometry payload (e.g., polygons) as a list.

created_at

ISO timestamp for when the annotation was created.

created_by

Email or identifier of the creating user.

annotation_worklist_id

Optional worklist ID associated with the annotation.

status

Lifecycle status of the annotation (e.g., “new”, “approved”).

approved_at

Optional ISO timestamp for approval time.

approved_by

Optional identifier of the approver.

resource_id

ID of the resource this annotation belongs to.

associated_file

Path or identifier of any associated file artifact.

deleted

Whether the annotation is marked as deleted.

deleted_at

Optional ISO timestamp for deletion time.

deleted_by

Optional identifier of the user who deleted the annotation.

created_by_model

Optional identifier of the model that created this annotation.

old_geometry

Optional previous geometry payload for change tracking.

set_name

Optional set name this annotation belongs to.

resource_filename

Optional filename of the resource.

resource_modality

Optional modality of the resource (e.g., CT, MR).

annotation_worklist_name

Optional worklist name associated with the annotation.

user_info

Optional user information with keys like firstname and lastname.

values

Optional extra values payload for flexible schemas.

property added_by: str

Get the creator email (alias for created_by).

annotation_worklist_id: str | None
annotation_worklist_name: str | None
approved_at: str | None
approved_by: str | None
associated_file: str | None
created_at: str | None
created_by: str | None
created_by_model: str | None
deleted: bool
deleted_at: str | None
deleted_by: str | None
fetch_file_data(auto_convert=True, save_path=None, use_cache=False)
Overloads:
  • self, auto_convert (Literal[True]), save_path (str | None), use_cache (CacheMode) → ImagingData

  • self, auto_convert (Literal[False]), save_path (str | None), use_cache (CacheMode) → bytes

Get the file data for this annotation.

Parameters:
  • save_path (str | None) – Optional path to save the file locally. If use_cache=True, the file is saved to save_path and cache metadata points to that location (no duplication - only one file on disk).

  • auto_convert (bool) – If True, automatically converts the data to an appropriate format.

  • use_cache (bool | Literal['loadonly']) – Cache behavior for this call. Use False to bypass cache entirely, True to read from and save to cache, or "loadonly" to read from cache without saving cache misses.

Returns:

File data (format depends on auto_convert and file type)

Return type:

bytes | ImagingData

Example

>>> annotation = api.annotations.get_list(limit=1)[0]
>>> data = annotation.fetch_file_data(use_cache=True)
>>> data = annotation.fetch_file_data(use_cache="loadonly")
>>> annotation.fetch_file_data(save_path="annotation_file")

file: str | None
frame_index: int | None
classmethod from_dict(data)

Create an Annotation instance from a dictionary.

Parameters:

data (dict[str, Any]) – Dictionary containing annotation data from API

Return type:

Annotation

Returns:

Annotation instance

geometry: list | dict | None
get_created_datetime()

Get the creation datetime as a datetime object.

Return type:

datetime | None

Returns:

datetime object or None if created_at is not set
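The conversion can be sketched in isolation; parse_created below is an illustrative stand-in (not the datamint method), assuming created_at holds an ISO-8601 string:

```python
# get_created_datetime() turns the ISO created_at string into a datetime,
# returning None when the field is unset; a stand-in sketch:
from datetime import datetime


def parse_created(created_at):
    if created_at is None:
        return None  # created_at was never set
    return datetime.fromisoformat(created_at)


dt = parse_created("2024-05-01T12:30:00+00:00")
```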

id: str | None
identifier: str
import_author: str | None
imported_from: str | None
property index: int | None

Get the frame index (alias for frame_index).

invalidate_cache()

Invalidate all cached data for this annotation.

Return type:

None

is_frame_scoped()

Check if this annotation is frame-scoped.

Return type:

bool

is_image_scoped()

Check if this annotation is image-scoped.

Return type:

bool
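These two checks presumably compare the scope field against the values listed above ("frame", "image"); a stand-in sketch of that logic:

```python
# Illustrative stand-ins for the scope checks (not the datamint methods):
def is_frame_scoped(scope):
    return scope == "frame"


def is_image_scoped(scope):
    return scope == "image"
```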

is_model: bool | None
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'populate_by_name': True, 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64', 'validate_by_alias': True, 'validate_by_name': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_id: str | None
model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

numeric_value: float | int | None
property resource: Resource

Lazily load and cache the associated Resource entity.

Example

>>> annotation = api.annotations.get_list(limit=1)[0]
>>> annotation.resource.filename

resource_filename: str | None
resource_id: str | None
resource_modality: str | None
scope: str
set_name: str | None
status: str | None
text_value: str | None
property type: str

Alias for annotation_type.

units: str | None
user_info: dict | None
property value: str | None

Get the annotation value (for category annotations).

values: list | None
class datamint.entities.AnnotationSpec(**data)

Bases: BaseModel

Base class for annotation specifications. Used by the API to define the expected structure of annotations for a given project.

Parameters:

data (Any)

asdict()

Convert the entity to a dictionary, including unknown fields.

asjson()

Convert the entity to a JSON string, including unknown fields.

Return type:

str

classmethod create(**kwargs)

Factory method to create the appropriate AnnotationSpec subclass based on type.

Return type:

AnnotationSpec

identifier: str
model_config: ClassVar[ConfigDict] = {'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

required: bool
scope: str
type: AnnotationType
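The create() factory dispatches to a subclass based on the annotation type. A minimal sketch of that pattern, using hypothetical names (Spec, SegmentationSpec, spec_type) rather than the actual datamint classes:

```python
# Type-keyed factory: subclasses register under a type string, and
# create() instantiates the matching class, falling back to the base.
class Spec:
    _registry: dict = {}

    def __init_subclass__(cls, *, type_name, **kwargs):
        super().__init_subclass__(**kwargs)
        Spec._registry[type_name] = cls

    def __init__(self, identifier):
        self.identifier = identifier

    @classmethod
    def create(cls, *, spec_type, **kwargs):
        subclass = cls._registry.get(spec_type, cls)
        return subclass(**kwargs)


class SegmentationSpec(Spec, type_name="segmentation"):
    pass


spec = Spec.create(spec_type="segmentation", identifier="liver")
```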
class datamint.entities.BaseEntity(**data)

Bases: BaseModel

Base class for all entities in the Datamint system.

This class provides common functionality for all entities, such as serialization and deserialization from dictionaries, as well as handling unknown fields gracefully.

The API client is automatically injected by the Api class when entities are created through API endpoints.

asdict()

Convert the entity to a dictionary, including unknown fields.

Return type:

dict[str, Any]

asjson()

Convert the entity to a JSON string, including unknown fields.

Return type:

str

has_missing_attrs()

Check if the entity has any attributes that are MISSING_FIELD.

Return type:

bool

Returns:

True if any attribute is MISSING_FIELD, False otherwise

is_attr_missing(attr_name)

Check if a value is the MISSING_FIELD sentinel.

Parameters:

attr_name (str)

Return type:

bool

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None
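has_missing_attrs() and is_attr_missing() rely on a MISSING_FIELD sentinel to tell "the API never sent this field" apart from an explicit None. A minimal sketch of that pattern; Entity and MISSING_FIELD here are stand-ins, not the datamint objects:

```python
# A unique object() sentinel marks fields the API response omitted.
MISSING_FIELD = object()


class Entity:
    def __init__(self, **fields):
        self.__dict__.update(fields)

    def is_attr_missing(self, attr_name):
        # Identity check: only the sentinel itself matches.
        return getattr(self, attr_name, None) is MISSING_FIELD

    def has_missing_attrs(self):
        return any(v is MISSING_FIELD for v in self.__dict__.values())


e = Entity(id="abc", description=None, annotators=MISSING_FIELD)
```

Note that description is not "missing" here: it was explicitly set to None, while annotators was never populated.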

class datamint.entities.CacheManager(entity_type, cache_root=None, enable_memory_cache=False, memory_cache_maxsize=2)

Bases: Generic[T]

Manages local caching of entity data with versioning support.

This class handles storing and retrieving cached data with automatic validation against server versions to ensure data consistency.

The cache uses the following directory structure:

cache_root/
  resources/
    {resource_id}/
      image_data.pkl
      metadata.json
  annotations/
    {annotation_id}/
      segmentation_data.pkl
      metadata.json

cache_root

Root directory for cache storage

entity_type

Type of entity being cached (e.g., ‘resources’, ‘annotations’)

Parameters:
  • entity_type (str)

  • cache_root (Path | str | None)

  • enable_memory_cache (bool)

  • memory_cache_maxsize (int)

class ItemMetadata(**data)

Bases: BaseModel

Parameters:

data (Any)

cached_at: datetime
data_path: str
data_type: str
entity_id: str | None
extra_info: dict | None
mimetype: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

version_hash: str | None
version_info: dict | None
clear_all()

Clear all cached data for this entity type.

Return type:

None

get(entity_id, data_key, version_info=None)

Retrieve cached data for an entity.

Parameters:
  • entity_id (str) – Unique identifier for the entity

  • data_key (str) – Key identifying the type of data

  • version_info (dict[str, Any] | None) – Optional version information from server to validate cache

Return type:

TypeVar(T) | None

Returns:

Cached data if valid, None if cache miss or invalid

get_cache_info(entity_id)

Get information about cached data for an entity.

Parameters:

entity_id (str) – Unique identifier for the entity

Return type:

dict[str, Any]

Returns:

Dictionary containing cache information

get_expected_path(entity_id, data_key)

Get the expected cache path for an entity (even if not yet cached).

This is useful for downloading directly to the cache location.

Parameters:
  • entity_id (str) – Unique identifier for the entity

  • data_key (str) – Key identifying the type of data

Return type:

Path

Returns:

Path where data will be cached

get_memory(entity_id, data_key, version_info=None)
Parameters:
  • entity_id (str)

  • data_key (str)

  • version_info (dict[str, Any] | None)

Return type:

TypeVar(T) | None

get_path(entity_id, data_key, version_info=None)

Get the path to cached data for an entity if valid.

Parameters:
  • entity_id (str) – Unique identifier for the entity

  • data_key (str) – Key identifying the type of data

  • version_info (dict[str, Any] | None) – Optional version information from server to validate cache

Return type:

Path | None

Returns:

Path to cached data if valid, None if cache miss or invalid

invalidate(entity_id, data_key=None)

Invalidate cached data for an entity.

Parameters:
  • entity_id (str) – Unique identifier for the entity

  • data_key (str | None) – Optional key for specific data. If None, invalidates all data for entity.

Return type:

None

invalidate_memory(entity_id, data_key=None)
Parameters:
  • entity_id (str)

  • data_key (str | None)

Return type:

None

iter_entities_extra_info()

Yield (entity_id, extra_info) for all cached entities that have extra_info.

Yields:

Tuples of (entity_id, extra_info dict) for entities with stored extra_info.

register_file_location(entity_id, data_key, file_path, version_info=None, mimetype='application/octet-stream', data=None)

Register an external file location in cache metadata without copying data.

This allows tracking a file stored at an arbitrary location (e.g., user’s save_path) while keeping version metadata in the cache directory.

Parameters:
  • entity_id (str) – Unique identifier for the entity

  • data_key (str) – Key identifying the type of data

  • file_path (str | Path) – Path to the external file to register

  • version_info (dict[str, Any] | None) – Optional version information from server

  • mimetype (str) – MIME type of the file data

  • data (TypeVar(T) | None) – Optional data object to populate the in-memory cache immediately.

Return type:

None

save_extra_info(entity_id, extra_info)

Store arbitrary extra metadata alongside the cache entry for an entity.

This is a no-op when no cache entry exists yet for the entity.

Parameters:
  • entity_id (str) – Unique identifier for the entity

  • extra_info (dict) – Arbitrary key-value pairs to store (e.g. upload_channel, tags)

Return type:

None

set(entity_id, data_key, data, version_info=None)

Store data in cache for an entity.

Parameters:
  • entity_id (str) – Unique identifier for the entity

  • data_key (str) – Key identifying the type of data

  • data (TypeVar(T)) – Data to cache

  • version_info (dict[str, Any] | None) – Optional version information from server

Return type:

None

set_memory(entity_id, data_key, data, version_info=None)
Parameters:
  • entity_id (str)

  • data_key (str)

  • data (TypeVar(T))

  • version_info (dict[str, Any] | None)

Return type:

None
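The directory layout and version validation above can be sketched with a simplified stand-in. MiniCache, the file names, and the hashing scheme below are assumptions for illustration; the real CacheManager additionally handles memory caching, external file registration, and extra_info metadata:

```python
# Simplified sketch: cache_root/<entity_type>/<entity_id>/ holds a data
# file plus metadata.json with a version hash used to validate hits.
import hashlib
import json
import tempfile
from pathlib import Path


class MiniCache:
    def __init__(self, entity_type, cache_root):
        self.root = Path(cache_root) / entity_type

    def _dir(self, entity_id):
        return self.root / entity_id

    @staticmethod
    def _hash(version_info):
        payload = json.dumps(version_info, sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

    def set(self, entity_id, data, version_info=None):
        d = self._dir(entity_id)
        d.mkdir(parents=True, exist_ok=True)
        (d / "data.bin").write_bytes(data)
        meta = {"version_hash": self._hash(version_info)}
        (d / "metadata.json").write_text(json.dumps(meta))

    def get(self, entity_id, version_info=None):
        d = self._dir(entity_id)
        try:
            meta = json.loads((d / "metadata.json").read_text())
        except FileNotFoundError:
            return None  # cache miss
        if meta["version_hash"] != self._hash(version_info):
            return None  # stale: server version changed
        return (d / "data.bin").read_bytes()


cache = MiniCache("resources", Path(tempfile.mkdtemp()))
cache.set("res-1", b"pixels", version_info={"updated_at": "2024-01-01"})
cache.get("res-1", version_info={"updated_at": "2024-01-01"})  # b"pixels"
cache.get("res-1", version_info={"updated_at": "2024-06-01"})  # None (stale)
```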

class datamint.entities.Channel(*, channel_name, resource_data, deleted=False, created_at=None, updated_at=None, **data)

Bases: BaseEntity

Represents a channel containing multiple resources.

A channel is a collection of resources grouped together, typically for batch processing or organization purposes.

channel_name

Name identifier for the channel.

resource_data

List of resources contained in this channel.

deleted

Whether the channel has been marked as deleted.

created_at

Timestamp when the channel was created.

updated_at

Timestamp when the channel was last updated.

channel_name: str
created_at: str | None
deleted: bool
get_resource_ids()

Get list of all resource IDs in this channel.

Return type:

list[str]

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

resource_data: list[ChannelResourceData]
updated_at: str | None
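get_resource_ids() plausibly collects resource_id from each entry in resource_data; a stand-in sketch over plain dicts, using field names from ChannelResourceData:

```python
# Each channel entry carries a resource_id; collecting them is a simple
# comprehension over resource_data.
resource_data = [
    {"resource_id": "r1", "resource_file_name": "a.dcm"},
    {"resource_id": "r2", "resource_file_name": "b.dcm"},
]
resource_ids = [item["resource_id"] for item in resource_data]
```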
class datamint.entities.ChannelResourceData(**data)

Bases: BaseModel

Represents resource data within a channel.

created_by

Email of the user who created the resource.

customer_id

UUID of the customer.

resource_id

UUID of the resource.

resource_file_name

Original filename of the resource.

resource_mimetype

MIME type of the resource.

Parameters:

data (Any)

created_by: str
customer_id: str
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

resource_file_name: str
resource_id: str
resource_mimetype: str
class datamint.entities.DICOMResource(*, id, resource_uri, storage, location, upload_channel, filename, mimetype, size, customer_id, status, created_at, created_by, published, deleted, upload_mechanism=None, metadata={}, modality=None, source_filepath=None, published_on=None, published_by=None, tags=None, deleted_at=None, deleted_by=None, instance_uid=None, series_uid=None, study_uid=None, patient_id=None, user_info='MISSING_FIELD', **data)

Bases: VolumeResource

Represents a DICOM resource or assembled DICOM series.

filename_suffixes: ClassVar[tuple[str, ...]] = ('.dcm',)
mimetypes: ClassVar[tuple[str, ...]] = ('application/dicom',)
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

resource_kind: ClassVar[str] = 'dicom'
resource_priority: ClassVar[int] = 50
storage_aliases: ClassVar[tuple[str, ...]] = ('DicomResource', 'DicomResourceHandler')
property uids: dict[str, str]

Return the available DICOM UIDs for this resource.
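The uids property plausibly reports only the UID fields that are populated; a stand-in sketch using the field names from the class signature (instance_uid, series_uid, study_uid):

```python
# Keep only the DICOM UID fields that are actually set on the resource.
fields = {
    "instance_uid": "1.2.840.10008.1.1",
    "series_uid": None,  # not set on this resource
    "study_uid": "1.2.840.10008.2",
}
uids = {name: value for name, value in fields.items() if value is not None}
```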

class datamint.entities.DatasetInfo(*, id, name, created_at, created_by, description, customer_id, updated_at, total_resource, resource_ids, **data)

Bases: BaseEntity

Pydantic Model representing a DataMint dataset.

This class provides access to dataset information and related entities like resources and projects.

created_at: str
created_by: str
customer_id: str
description: str
id: str
invalidate_cache()

Invalidate all cached relationship data.

This forces fresh data fetches on the next access.

Return type:

None

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

name: str
resource_ids: list[str]
total_resource: int
updated_at: str | None
class datamint.entities.ImageResource(*, id, resource_uri, storage, location, upload_channel, filename, mimetype, size, customer_id, status, created_at, created_by, published, deleted, upload_mechanism=None, metadata={}, modality=None, source_filepath=None, published_on=None, published_by=None, tags=None, deleted_at=None, deleted_by=None, instance_uid=None, series_uid=None, study_uid=None, patient_id=None, user_info='MISSING_FIELD', **data)

Bases: Resource

Represents a single-frame 2D image resource.

get_depth()
Return type:

int

get_dimensions()

Get image dimensions as (width, height).

Return type:

tuple[int | None, int | None]

property height: int | None
mimetype_prefixes: ClassVar[tuple[str, ...]] = ('image/',)
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

resource_kind: ClassVar[str] = 'image'
resource_priority: ClassVar[int] = 10
storage_aliases: ClassVar[tuple[str, ...]] = ('ImageResource', 'ImageResourceHandler')
property width: int | None
class datamint.entities.InferenceJob(*, id, status, model_name, resource_id=None, frame_idx=None, created_at=None, started_at=None, completed_at=None, progress_percentage=0, current_step=None, error_message=None, save_results=True, result_data=None, annotation_ids=None, recent_logs=None, **data)

Bases: BaseEntity

Entity representing an inference job.

annotation_ids: list | None
completed_at: str | None
created_at: str | None
current_step: str | None
error_message: str | None
frame_idx: int | None
id: str
property is_finished: bool

Whether the job has reached a terminal state.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_name: str
model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

property predictions: list[list[Annotation]] | None

Returns a list of annotations resulting from this inference job, if available.

Each element of the outer list corresponds to one input resource; the inner list contains the annotations produced for that resource.

Returns:

list[list[Annotation]] (one inner list per input resource) or None when no predictions are stored in result_data.


progress_percentage: int
recent_logs: list[str] | None
resource_id: str | None
result_data: dict[str, Any] | None
save_results: bool
started_at: str | None
status: str
wait(*, on_status=None, poll_interval=2.0, timeout=None)

Block until this job reaches a terminal state.

Uses the SSE stream when available, falling back to polling. In-place updates to this object are made on every status change.

Parameters:
  • on_status (Callable[[InferenceJob], None] | None) – Optional callback invoked with an updated InferenceJob on every status change.

  • poll_interval (float) – Seconds between polls in polling-fallback mode.

  • timeout (float | None) – Maximum seconds to wait. Raises TimeoutError on expiry.

Return type:

InferenceJob
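wait() describes an SSE-first strategy with a polling fallback. The fallback loop can be sketched as follows; wait_for, the status strings, and TERMINAL are illustrative stand-ins, not datamint internals:

```python
# Poll a status source until a terminal state or timeout, invoking an
# optional callback on every status change.
import time

TERMINAL = {"completed", "failed", "cancelled"}


def wait_for(fetch_status, on_status=None, poll_interval=0.01, timeout=5.0):
    deadline = time.monotonic() + timeout
    last = None
    while True:
        status = fetch_status()
        if status != last:
            last = status
            if on_status:
                on_status(status)  # fire only on changes
        if status in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish in time")
        time.sleep(poll_interval)


statuses = iter(["queued", "running", "completed"])
seen = []
result = wait_for(lambda: next(statuses), on_status=seen.append)
```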

class datamint.entities.LocalResource(local_filepath=None, raw_data=None, convert_to_bytes=False, **kwargs)

Bases: Resource

Represents a local resource that hasn’t been uploaded to DataMint API yet.

Parameters:
  • local_filepath (str | Path | None)

  • raw_data (bytes | None)

  • convert_to_bytes (bool)

__repr__()

Detailed string representation of the local resource.

Return type:

str

Returns:

Detailed string representation for debugging

__str__()

String representation of the local resource.

Return type:

str

Returns:

Human-readable string describing the local resource

fetch_file_data(*args, auto_convert=True, save_path=None, use_cache=False, **kwargs)
Overloads:
  • self, args, auto_convert (Literal[True]), save_path (str | None), use_cache (CacheMode), kwargs → ImagingData

  • self, args, auto_convert (Literal[False]), save_path (str | None), use_cache (CacheMode), kwargs → bytes

Get the file data for this local resource.

Parameters:
  • auto_convert (bool) – If True, automatically converts to appropriate format (pydicom.Dataset, PIL Image, etc.)

  • save_path (str | None) – Optional path to save the file locally

  • use_cache (bool | Literal['loadonly']) – Ignored for local resources; included for API parity.

Returns:

File data (format depends on auto_convert and file type)

Return type:

bytes | ImagingData

property filepath_cached: Path | None

Get the file path of the local resource data.

Returns:

Path to the local file, or None if only raw data is available.

local_filepath: str | None
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

raw_data: bytes | None
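A LocalResource wraps either a file path or raw bytes before upload. A stand-in sketch of that contract; MiniLocalResource is illustrative, and the require-at-least-one check is an assumption, not documented behavior:

```python
# Hold either in-memory bytes or a path; fetch_file_data() prefers the
# in-memory copy and otherwise reads from disk.
from pathlib import Path


class MiniLocalResource:
    def __init__(self, local_filepath=None, raw_data=None):
        if local_filepath is None and raw_data is None:
            raise ValueError("provide local_filepath or raw_data")
        self.local_filepath = local_filepath
        self.raw_data = raw_data

    def fetch_file_data(self):
        if self.raw_data is not None:
            return self.raw_data
        return Path(self.local_filepath).read_bytes()


res = MiniLocalResource(raw_data=b"\x00\x01")
```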
class datamint.entities.NiftiResource(*, id, resource_uri, storage, location, upload_channel, filename, mimetype, size, customer_id, status, created_at, created_by, published, deleted, upload_mechanism=None, metadata={}, modality=None, source_filepath=None, published_on=None, published_by=None, tags=None, deleted_at=None, deleted_by=None, instance_uid=None, series_uid=None, study_uid=None, patient_id=None, user_info='MISSING_FIELD', **data)

Bases: VolumeResource

Represents a NIfTI volume resource.

filename_suffixes: ClassVar[tuple[str, ...]] = ('.nii', '.nii.gz')
property is_compressed: bool

Whether the underlying NIfTI file is gzip-compressed.

mimetypes: ClassVar[tuple[str, ...]] = ('application/nifti',)
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

resource_kind: ClassVar[str] = 'nifti'
resource_priority: ClassVar[int] = 50
storage_aliases: ClassVar[tuple[str, ...]] = ('NiftiResource', 'NiftiResourceHandler')
class datamint.entities.Project(*, id, name, created_at, created_by, dataset_id, worklist_id, archived, resource_count, annotated_resource_count, description, viewable_ai_segs, editable_ai_segs, registered_model='MISSING_FIELD', ai_model_id='MISSING_FIELD', closed_resources_count='MISSING_FIELD', resources_to_annotate_count='MISSING_FIELD', most_recent_experiment='MISSING_FIELD', annotators='MISSING_FIELD', archived_on='MISSING_FIELD', archived_by='MISSING_FIELD', is_active_learning='MISSING_FIELD', two_up_display='MISSING_FIELD', require_review='MISSING_FIELD', **data)

Bases: BaseEntity

Pydantic Model representing a DataMint project.

This class models a project entity from the DataMint API, containing information about the project, its dataset, worklist, AI model, and annotation statistics.

id

Unique identifier for the project

name

Human-readable name of the project

description

Optional description of the project

created_at

ISO timestamp when the project was created

created_by

Email of the user who created the project

dataset_id

ID of the associated dataset

worklist_id

ID of the associated worklist

ai_model_id

Optional ID of the associated AI model

viewable_ai_segs

Optional configuration for viewable AI segments

editable_ai_segs

Optional configuration for editable AI segments

archived

Whether the project is archived

resource_count

Total number of resources in the project

annotated_resource_count

Number of resources that have been annotated

most_recent_experiment

Optional information about the most recent experiment

closed_resources_count

Number of resources marked as closed/completed

resources_to_annotate_count

Number of resources still needing annotation

annotators

List of annotators assigned to this project

ai_model_id: str | None
annotated_resource_count: int
annotators: list[dict]
archived: bool
archived_by: str | None
archived_on: str | None
as_torch_dataset(root_dir=None, auto_update=True, return_as_semantic_segmentation=False)
Parameters:
  • root_dir (str | None)

  • auto_update (bool)

  • return_as_semantic_segmentation (bool)

cache_resources(progress_bar=True)

Cache all project resources in parallel for faster subsequent access.

This method downloads and caches all resource file data concurrently, skipping resources that are already cached. This dramatically improves performance when working with large projects.

Parameters:

progress_bar (bool) – Whether to show a progress bar. Default is True.

Return type:

None

Example

>>> project = api.projects.get_by_name("My Project")
>>> project.cache_resources(progress_bar=False)
>>> # Now fetch_file_data() will be instantaneous for cached resources
>>> for resource in project.fetch_resources():
...     data = resource.fetch_file_data(use_cache=True)

closed_resources_count: int
created_at: str
created_by: str
dataset_id: str
description: str | None
download_resources_datas(progress_bar=True)

Downloads all project resources in parallel for faster subsequent access.

This method downloads and caches all resource file data concurrently, skipping resources that are already cached. This dramatically improves performance when working with large projects.

Parameters:

progress_bar (bool) – Whether to show a progress bar. Default is True.

Return type:

None

Example

>>> project = api.projects.get_by_name("My Project")
>>> project.download_resources_datas(progress_bar=False)
>>> # Now fetch_file_data() will be instantaneous for cached resources
>>> for resource in project.fetch_resources():
...     data = resource.fetch_file_data(use_cache=True)

editable_ai_segs: list | None
fetch_resources()

Fetch resources associated with this project from the API. Important: this always fetches fresh data from the server.

Return type:

Sequence[Resource]

Returns:

List of Resource instances associated with the project.

Example

>>> project = api.projects.get_by_name("My Project")
>>> resources = project.fetch_resources()
>>> [resource.filename for resource in resources]

get_annotations_specs()

Get the annotations specs for this project.

Return type:

Sequence[AnnotationSpec]

Returns:

Sequence of AnnotationSpec instances for the project.

Example

>>> project = api.projects.get_by_name("My Project")
>>> specs = project.get_annotations_specs()
>>> [spec.identifier for spec in specs]

id: str
is_active_learning: bool
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

most_recent_experiment: str | None
name: str
registered_model: Any | None
require_review: bool
resource_count: int
resources_to_annotate_count: int
set_work_status(resource, status)

Set the status of a resource.

Parameters:
  • resource (Resource) – The resource unique id or a resource object.

  • status (Literal['opened', 'annotated', 'closed']) – The new status to set.

Return type:

None

Example

>>> project = api.projects.get_by_name("My Project")
>>> resource = project.fetch_resources()[0]
>>> project.set_work_status(resource, "annotated")

show()

Open the project in the default web browser.

Return type:

None

two_up_display: bool
property url: str

Get the URL to access this project in the DataMint web application.

viewable_ai_segs: list | None
worklist_id: str
class datamint.entities.Resource(*, id, resource_uri, storage, location, upload_channel, filename, mimetype, size, customer_id, status, created_at, created_by, published, deleted, upload_mechanism=None, metadata={}, modality=None, source_filepath=None, published_on=None, published_by=None, tags=None, deleted_at=None, deleted_by=None, instance_uid=None, series_uid=None, study_uid=None, patient_id=None, user_info='MISSING_FIELD', **data)

Bases: BaseEntity

Represents a DataMint resource with all its properties and metadata.

This class models a resource entity from the DataMint API, containing information about uploaded files, their metadata, and associated projects.

id

Unique identifier for the resource

resource_uri

URI path to access the resource file

storage

Storage type (e.g., ‘DicomResource’)

location

Storage location path

upload_channel

Channel used for upload (e.g., ‘tmp’)

filename

Original filename of the resource

modality

Medical imaging modality

mimetype

MIME type of the file

size

File size in bytes

upload_mechanism

Mechanism used for upload (e.g., ‘api’)

customer_id

Customer/organization identifier

status

Current status of the resource

created_at

ISO timestamp when resource was created

created_by

Email of the user who created the resource

published

Whether the resource is published

published_on

ISO timestamp when resource was published

published_by

Email of the user who published the resource

publish_transforms

Optional publication transforms

deleted

Whether the resource is deleted

deleted_at

Optional ISO timestamp when resource was deleted

deleted_by

Optional email of the user who deleted the resource

metadata

Resource metadata with DICOM information

source_filepath

Original source file path

tags

List of tags associated with the resource

instance_uid

DICOM SOP Instance UID (top-level)

series_uid

DICOM Series Instance UID (top-level)

study_uid

DICOM Study Instance UID (top-level)

patient_id

Patient identifier (top-level)

segmentations

Optional segmentation data

measurements

Optional measurement data

categories

Optional category data

labels

List of labels associated with the resource

user_info

Information about the user who created the resource

projects

List of projects this resource belongs to

__repr__()

Detailed string representation of the resource.

Return type:

str

Returns:

Detailed string representation for debugging

__str__()

String representation of the resource.

Return type:

str

Returns:

Human-readable string describing the resource

created_at: str
created_by: str
customer_id: str
deleted: bool
deleted_at: str | None
deleted_by: str | None
fetch_annotations(annotation_type=None)

Get annotations associated with this resource.

Example

>>> resource = api.resources.get_list(project_name="My Project")[0]
>>> annotations = resource.fetch_annotations(annotation_type="segmentation")
>>> [annotation.name for annotation in annotations]
Parameters:

annotation_type (AnnotationType | str | None)

Return type:

Sequence[Annotation]

fetch_file_data(auto_convert=True, save_path=None, use_cache=False)
Overloads:
  • self, auto_convert (Literal[True]), save_path (str | None), use_cache (CacheMode) → ImagingData

  • self, auto_convert (Literal[False]), save_path (str | None), use_cache (CacheMode) → bytes

Get the file data for this resource.

When caching is enabled, this method stores the file data locally. On subsequent calls, it checks the server for changes and reuses the cached data if unchanged.

Parameters:
  • auto_convert (bool) – If True, automatically converts to an appropriate format (pydicom.Dataset, PIL Image, etc.)

  • save_path (str | None) – Optional path to save the file locally. If use_cache=True, the file is saved to save_path and cache metadata points to that location (no duplication - only one file on disk).

  • use_cache (bool | Literal['loadonly']) – Cache behavior for this call. Use False to bypass the cache entirely, True to read from and save to the cache, or "loadonly" to read from the cache without saving cache misses.

Returns:

File data (format depends on auto_convert and file type)

Return type:

bytes | ImagingData

Example

>>> resource = api.resources.get_list(project_name="My Project")[0]
>>> data = resource.fetch_file_data(use_cache=True)
>>> data = resource.fetch_file_data(use_cache="loadonly")
>>> resource.fetch_file_data(save_path="local_copy")
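The three use_cache modes can be summarized with a small illustrative sketch. This mirrors the documented semantics only; it is not the library's internal implementation, and the helper name is made up:

```python
def resolve_cache_behavior(use_cache):
    """Map a use_cache argument to the documented read/write behavior."""
    if use_cache is False:
        # Bypass the cache entirely: always download fresh data.
        return {"read_cache": False, "write_cache": False}
    if use_cache is True:
        # Read from the cache and save cache misses for later calls.
        return {"read_cache": True, "write_cache": True}
    if use_cache == "loadonly":
        # Read from the cache, but never write new entries.
        return {"read_cache": True, "write_cache": False}
    raise ValueError(f"unsupported cache mode: {use_cache!r}")
```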
filename: str
filename_suffixes: ClassVar[tuple[str, ...]] = ()
property filepath_cached: Path | None

Get the file path of the cached resource data, if available.

Returns:

Path to the cached file data, or None if not cached.

static from_local_file(file_path)

Create a LocalResource instance from a local file path.

Parameters:

file_path (str | Path) – Path to the local file

Return type:

LocalResource
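For example, to wrap a file already on disk (the path here is hypothetical; the concrete subclass is inferred from the file itself):

```python
from datamint.entities import Resource

# Wrap an existing local file without uploading it.
local = Resource.from_local_file("scan.dcm")
```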

get_depth()
Return type:

int

get_frame(index)

Get a decoded video frame as a normalized array.

Parameters:

index (int)

Return type:

ndarray

get_frame_resource(index)

Get a proxy object for a specific video frame.

Parameters:

index (int)

Return type:

SlicedVideoResource

get_slice(axis, index)

Get a specific slice of the volume as a numpy array.

Parameters:
  • axis (Literal['axial', 'sagittal', 'coronal']) – The anatomical plane to slice along (e.g., ‘axial’, ‘coronal’, ‘sagittal’)

  • index (int) – The index of the slice along the specified axis

Return type:

ndarray

Returns:

A numpy array representing the specified slice

get_slice_resource(axis, index)

Get a proxy object for a specific volume slice.

Parameters:
  • axis (Literal['axial', 'sagittal', 'coronal'])

  • index (int)

Return type:

SlicedVolumeResource
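A minimal sketch of slice access on a volumetric resource, assuming an api client and a project named as in the examples above:

```python
resource = api.resources.get_list(project_name="My Project")[0]

if resource.is_volume():
    # Raw pixel data for one axial slice, as a numpy array.
    slice_array = resource.get_slice("axial", 10)
    # Lightweight proxy object that behaves like a 2D resource.
    slice_resource = resource.get_slice_resource("axial", 10)
    print(slice_array.shape, slice_resource)
```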

id: str
instance_uid: str | None
invalidate_cache()

Invalidate cached data for this resource.

Return type:

None

is_cached()

Check if the resource’s file data is already cached locally and valid.

Return type:

bool

Returns:

True if valid cached data exists, False otherwise.
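Together with fetch_file_data, these helpers support a simple cache-management workflow (a sketch, assuming an api client as in the examples above):

```python
resource = api.resources.get_list(project_name="My Project")[0]

data = resource.fetch_file_data(use_cache=True)  # populates the cache
if resource.is_cached():
    print("cached at:", resource.filepath_cached)

resource.invalidate_cache()  # force a fresh download on the next call
data = resource.fetch_file_data(use_cache=True)
```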

is_dicom()

Check if the resource is a DICOM file.

Return type:

bool

Returns:

True if the resource is a DICOM file, False otherwise

is_image()

Check if the resource is a single-frame image.

Return type:

bool

is_multiframe()

Check if the resource contains multiple frames or slices.

Return type:

bool

is_nifti()

Check if the resource is a NIfTI file.

Return type:

bool

Returns:

True if the resource is a NIfTI file, False otherwise

is_video()

Check if the resource is a video file.

Return type:

bool

Returns:

True if the resource is a video file, False otherwise

is_volume()

Check if the resource is a volumetric resource.

Return type:

bool
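The is_* predicates make it easy to branch on content type. A hedged sketch (the helper name and label scheme are illustrative, not part of the library):

```python
def describe(resource):
    """Return a coarse content-type label for a resource (illustrative helper)."""
    if resource.is_dicom():
        return "dicom"
    if resource.is_volume():
        return "volume"
    if resource.is_video():
        return "video"
    if resource.is_image():
        return "image"
    return "other"
```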

iter_frames()

Expand a video into one proxy resource per frame.

Return type:

list[SlicedVideoResource]

iter_slices(axis)

Expand a volume into one proxy resource per slice.

Parameters:

axis (Literal['axial', 'sagittal', 'coronal'])

Return type:

list[SlicedVolumeResource]
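These expansion helpers are convenient for per-frame or per-slice processing. A sketch assuming a fetched resource and a hypothetical process() handler:

```python
resource = api.resources.get_list(project_name="My Project")[0]

if resource.is_video():
    for frame in resource.iter_frames():      # list[SlicedVideoResource]
        process(frame)                        # hypothetical per-frame handler
elif resource.is_volume():
    for sl in resource.iter_slices("axial"):  # list[SlicedVolumeResource]
        process(sl)
```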

property kind: str
location: str
classmethod matches_payload(*, storage=None, mimetype=None, filename=None)

Check whether the given storage type, mimetype, or filename matches this resource class.

Parameters:
  • storage (str | None)

  • mimetype (str | None)

  • filename (str | None)

Return type:

bool

metadata: dict[str, Any]
mimetype: str
mimetype_prefixes: ClassVar[tuple[str, ...]] = ()
mimetypes: ClassVar[tuple[str, ...]] = ()
modality: str | None
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

patient_id: str | None
published: bool
published_by: str | None
published_on: str | None
resource_kind: ClassVar[str] = 'resource'
resource_priority: ClassVar[int] = 0
resource_uri: str
series_uid: str | None
show()

Open the resource in the default web browser.

Return type:

None

size: int
property size_mb: float

Get file size in megabytes.

Returns:

File size in MB rounded to 2 decimal places
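The conversion presumably follows the usual bytes-to-megabytes arithmetic; a sketch of the computation (an assumption about the rounding, not the library's verified source):

```python
size_bytes = 5 * 1024 * 1024 + 512 * 1024  # 5.5 MiB worth of bytes
size_mb = round(size_bytes / (1024 * 1024), 2)
print(size_mb)  # 5.5
```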

source_filepath: str | None
status: str
storage: str
storage_aliases: ClassVar[tuple[str, ...]] = ()
study_uid: str | None
tags: list[str] | None
upload_channel: str
upload_mechanism: str | None
property url: str

Get the URL to access this resource in the DataMint web application.

user_info: dict[str, str | None] | str
class datamint.entities.User(*, email, firstname, lastname, roles, customer_id, created_at, **data)

Bases: BaseEntity

User entity model.

email

User email address (unique identifier in most cases).

firstname

First name.

lastname

Last name.

roles

List of role strings assigned to the user.

customer_id

UUID of the owning customer/tenant.

created_at

ISO 8601 timestamp of creation.

created_at: str
customer_id: str
email: str
firstname: str | None
lastname: str | None
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

roles: list[str]
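Because entities are Pydantic models, a User can be built directly from an API payload; a sketch with made-up field values:

```python
from datamint.entities import User

user = User(
    email="jane@example.com",
    firstname="Jane",
    lastname="Doe",
    roles=["annotator"],
    customer_id="00000000-0000-0000-0000-000000000000",  # tenant UUID
    created_at="2024-01-01T00:00:00Z",
)
print(user.email, user.roles)
```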
class datamint.entities.VideoResource(*, id, resource_uri, storage, location, upload_channel, filename, mimetype, size, customer_id, status, created_at, created_by, published, deleted, upload_mechanism=None, metadata={}, modality=None, source_filepath=None, published_on=None, published_by=None, tags=None, deleted_at=None, deleted_by=None, instance_uid=None, series_uid=None, study_uid=None, patient_id=None, user_info='MISSING_FIELD', **data)

Bases: Resource

Represents a video resource with per-frame access helpers.

property frame_count: int
get_depth()
Return type:

int

get_dimensions()

Get video frame dimensions as (width, height).

Return type:

tuple[int | None, int | None]

property height: int | None
iter_frames()

Expand a video into one proxy resource per frame.

Return type:

list[SlicedVideoResource]

mimetype_prefixes: ClassVar[tuple[str, ...]] = ('video/',)
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

resource_kind: ClassVar[str] = 'video'
resource_priority: ClassVar[int] = 20
storage_aliases: ClassVar[tuple[str, ...]] = ('VideoResource', 'VideoResourceHandler')
property width: int | None
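A sketch of inspecting a video's geometry, assuming the first listed resource is a video:

```python
video = api.resources.get_list(project_name="My Project")[0]

if video.is_video():
    width, height = video.get_dimensions()
    print(width, height, video.frame_count)
    first_frame = video.get_frame(0)  # normalized numpy array
```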
class datamint.entities.VolumeResource(*, id, resource_uri, storage, location, upload_channel, filename, mimetype, size, customer_id, status, created_at, created_by, published, deleted, upload_mechanism=None, metadata={}, modality=None, source_filepath=None, published_on=None, published_by=None, tags=None, deleted_at=None, deleted_by=None, instance_uid=None, series_uid=None, study_uid=None, patient_id=None, user_info='MISSING_FIELD', **data)

Bases: Resource

Represents a volumetric resource, such as a 3D CT or MRI scan.

property frame_count: int
get_depth()
Return type:

int

get_slice(axis, index)

Get a specific slice of the volume as a numpy array.

Parameters:
  • axis (Literal['axial', 'sagittal', 'coronal']) – The anatomical plane to slice along (e.g., ‘axial’, ‘coronal’, ‘sagittal’)

  • index (int) – The index of the slice along the specified axis

Return type:

ndarray

Returns:

A numpy array representing the specified slice

get_slice_resource(axis, index)

Get a proxy object for a specific volume slice.

Parameters:
  • axis (Literal['axial', 'sagittal', 'coronal'])

  • index (int)

Return type:

SlicedVolumeResource

iter_slices(axis)

Expand a volume into one proxy resource per slice.

Parameters:

axis (Literal['axial', 'sagittal', 'coronal'])

Return type:

list[SlicedVolumeResource]

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'ser_json_bytes': 'base64', 'val_json_bytes': 'base64'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

resource_kind: ClassVar[str] = 'volume'
resource_priority: ClassVar[int] = 30
storage_aliases: ClassVar[tuple[str, ...]] = ('VolumeResource', 'VolumeResourceHandler')