Entities

The datamint.entities module provides the core data structures that represent objects within the DataMint ecosystem. These entities are Pydantic models, which validate field types and handle serialization and deserialization when interacting with the DataMint API.

DataMint entities package.

class datamint.entities.Annotation(*, id, identifier, scope, frame_index, annotation_type, text_value, numeric_value, units, geometry, created_at, created_by, annotation_worklist_id, status, approved_at, approved_by, resource_id, associated_file, deleted, deleted_at, deleted_by, created_by_model, set_name, resource_filename, resource_modality, annotation_worklist_name, user_info, values='MISSING_FIELD', file=None, **data)

Bases: BaseEntity

Pydantic Model representing a DataMint annotation.

id

Unique identifier for the annotation.

identifier

User-friendly identifier or label for the annotation.

scope

Scope of the annotation (e.g., “frame”, “image”).

frame_index

Index of the frame if scope is frame-based.

annotation_type

Type of annotation (e.g., “segmentation”, “bbox”, “label”).

text_value

Optional text value associated with the annotation.

numeric_value

Optional numeric value associated with the annotation.

units

Optional units for numeric_value.

geometry

Optional geometry payload (e.g., polygons, masks) as a list.

created_at

ISO timestamp for when the annotation was created.

created_by

Email or identifier of the creating user.

annotation_worklist_id

Optional worklist ID associated with the annotation.

status

Lifecycle status of the annotation (e.g., “new”, “approved”).

approved_at

Optional ISO timestamp for approval time.

approved_by

Optional identifier of the approver.

resource_id

ID of the resource this annotation belongs to.

associated_file

Path or identifier of any associated file artifact.

deleted

Whether the annotation is marked as deleted.

deleted_at

Optional ISO timestamp for deletion time.

deleted_by

Optional identifier of the user who deleted the annotation.

created_by_model

Optional identifier of the model that created this annotation.

old_geometry

Optional previous geometry payload for change tracking.

set_name

Optional set name this annotation belongs to.

resource_filename

Optional filename of the resource.

resource_modality

Optional modality of the resource (e.g., CT, MR).

annotation_worklist_name

Optional worklist name associated with the annotation.

user_info

Optional user information with keys like firstname and lastname.

values

Optional extra values payload for flexible schemas.

property added_by: str

Get the creator email (alias for created_by).

annotation_type: AnnotationType
annotation_worklist_id: str | None
annotation_worklist_name: str | None
approved_at: str | None
approved_by: str | None
associated_file: str | None
created_at: str
created_by: str
created_by_model: str | None
deleted: bool
deleted_at: str | None
deleted_by: str | None
fetch_file_data(save_path=None, auto_convert=True, use_cache=False)
Parameters:
  • save_path (PathLike | str | None)

  • auto_convert (bool)

  • use_cache (bool)

Return type:

bytes | pydicom.dataset.Dataset | Image.Image | cv2.VideoCapture | nib_FileBasedImage

file: str | None
frame_index: int | None
classmethod from_dict(data)

Create an Annotation instance from a dictionary.

Parameters:

data (dict[str, Any]) – Dictionary containing annotation data from API

Return type:

Annotation

Returns:

Annotation instance

geometry: list | dict | None
get_created_datetime()

Get the creation datetime as a datetime object.

Return type:

datetime | None

Returns:

datetime object or None if created_at is not set
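
As a rough illustration of what get_created_datetime() does conceptually, the sketch below parses an ISO timestamp string into a datetime object and returns None when no timestamp is set. The helper name parse_iso_timestamp and the normalization of a trailing "Z" are assumptions made for this example; the library's own parsing logic may differ.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_iso_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Hypothetical helper: parse an ISO 8601 string, or return None if unset."""
    if not value:
        return None
    # Older Python versions reject a trailing "Z", so normalize it to an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


created = parse_iso_timestamp("2024-05-01T12:30:00Z")
print(created.isoformat())
```

Returning None instead of raising keeps the helper safe to call on entities whose timestamp is empty.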

id: str
identifier: str
property index: int | None

Get the frame index (alias for frame_index).

invalidate_cache()

Invalidate all cached data for this annotation.

Return type:

None

is_category()

Check if this is a category annotation.

Return type:

bool

is_frame_scoped()

Check if this annotation is frame-scoped.

Return type:

bool

is_image_scoped()

Check if this annotation is image-scoped.

Return type:

bool

is_label()

Check if this is a label annotation.

Return type:

bool

is_segmentation()

Check if this is a segmentation annotation.

Return type:

bool

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

property name: str

Get the annotation name (alias for identifier).

numeric_value: float | int | None
property resource: Resource

Lazily load and cache the associated Resource entity.

resource_filename: str | None
resource_id: str
resource_modality: str | None
scope: str
set_name: str | None
status: str
text_value: str | None
property type: str

Alias for annotation_type.

units: str | None
user_info: dict | None
property value: str | None

Get the annotation value (for category annotations).

values: list | None
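
The alias properties (name, index) and the predicate methods of Annotation described above can be pictured with a small stand-in class. This is only a sketch: AnnotationSketch is a hypothetical dataclass, not the datamint class, and the string comparisons ("frame", "image", "segmentation") are assumptions based on the example values listed for scope and annotation_type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnnotationSketch:
    """Hypothetical stand-in for Annotation; not the datamint class."""
    identifier: str
    scope: str
    annotation_type: str
    frame_index: Optional[int] = None

    @property
    def name(self) -> str:
        # Alias for identifier, mirroring the documented property.
        return self.identifier

    @property
    def index(self) -> Optional[int]:
        # Alias for frame_index, mirroring the documented property.
        return self.frame_index

    def is_frame_scoped(self) -> bool:
        # Assumption: frame-scoped annotations carry scope == "frame".
        return self.scope == "frame"

    def is_image_scoped(self) -> bool:
        # Assumption: image-scoped annotations carry scope == "image".
        return self.scope == "image"

    def is_segmentation(self) -> bool:
        # Assumption: the type is compared against the string "segmentation".
        return self.annotation_type == "segmentation"


annotations = [
    AnnotationSketch("tumor", "frame", "segmentation", frame_index=3),
    AnnotationSketch("quality", "image", "label"),
]

# Keep only the frame-level segmentations and report their frame indices.
frame_segs = [(a.name, a.index) for a in annotations
              if a.is_frame_scoped() and a.is_segmentation()]
print(frame_segs)
```

Predicates of this kind let calling code filter annotations without comparing raw strings at every call site.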
class datamint.entities.BaseEntity(**data)

Bases: BaseModel

Base class for all entities in the DataMint system.

This class provides common functionality for all entities, such as serialization and deserialization from dictionaries, as well as handling unknown fields gracefully.

The API client is automatically injected by the Api class when entities are created through API endpoints.

Parameters:

data (Any)

asdict()

Convert the entity to a dictionary, including unknown fields.

Return type:

dict[str, Any]

asjson()

Convert the entity to a JSON string, including unknown fields.

Return type:

str

static is_attr_missing(value)

Check if a value is the MISSING_FIELD sentinel.

Parameters:

value (Any)

Return type:

bool
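
A sentinel such as MISSING_FIELD lets code distinguish a field the API did not return from a field that was returned as None. The sketch below assumes the sentinel is the literal string 'MISSING_FIELD' that appears as a default in the class signatures; the actual sentinel object used by the library may be different, and the standalone function here is only an illustration of the check.

```python
from typing import Any

# Assumption: the sentinel is the literal string shown as a default in the
# signatures (e.g. values='MISSING_FIELD'); the real sentinel may differ.
MISSING_FIELD = "MISSING_FIELD"


def is_attr_missing(value: Any) -> bool:
    """Return True when the value is the sentinel rather than real data."""
    return isinstance(value, str) and value == MISSING_FIELD


# "values" was never provided; "text_value" was provided but is empty.
payload = {"values": MISSING_FIELD, "text_value": None}
missing = {key: is_attr_missing(val) for key, val in payload.items()}
print(missing)
```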

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

class datamint.entities.CacheManager(entity_type, cache_root=None)

Bases: Generic[T]

Manages local caching of entity data with versioning support.

This class handles storing and retrieving cached data with automatic validation against server versions to ensure data consistency.

The cache uses the following directory structure:

  • cache_root/

    • resources/

      • {resource_id}/

        • image_data.pkl

        • metadata.json

    • annotations/

      • {annotation_id}/

        • segmentation_data.pkl

        • metadata.json

cache_root

Root directory for cache storage

entity_type

Type of entity being cached (e.g., ‘resources’, ‘annotations’)

Parameters:
  • entity_type (str)

  • cache_root (Path | str | None)

class ItemMetadata(**data)

Bases: BaseModel

Parameters:

data (Any)

cached_at: datetime
data_path: str
data_type: str
entity_id: str | None
mimetype: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

version_hash: str | None
version_info: dict | None
clear_all()

Clear all cached data for this entity type.

Return type:

None

get(entity_id, data_key, version_info=None)

Retrieve cached data for an entity.

Parameters:
  • entity_id (str) – Unique identifier for the entity

  • data_key (str) – Key identifying the type of data

  • version_info (dict[str, Any] | None) – Optional version information from server to validate cache

Return type:

TypeVar(T) | None

Returns:

Cached data if valid, None if cache miss or invalid

get_cache_info(entity_id)

Get information about cached data for an entity.

Parameters:

entity_id (str) – Unique identifier for the entity

Return type:

dict[str, Any]

Returns:

Dictionary containing cache information

invalidate(entity_id, data_key=None)

Invalidate cached data for an entity.

Parameters:
  • entity_id (str) – Unique identifier for the entity

  • data_key (str | None) – Optional key for specific data. If None, invalidates all data for entity.

Return type:

None

set(entity_id, data_key, data, version_info=None)

Store data in cache for an entity.

Parameters:
  • entity_id (str) – Unique identifier for the entity

  • data_key (str) – Key identifying the type of data

  • data (TypeVar(T)) – Data to cache

  • version_info (dict[str, Any] | None) – Optional version information from server

Return type:

None
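
The get/set/version-validation flow described for CacheManager can be sketched with the standard library alone. TinyCache below is a hypothetical, heavily simplified class and not the datamint implementation: the per-key pickle file names, the layout of metadata.json, the SHA-256 version hash, and the "etag" key are all assumptions chosen for the example.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path
from typing import Any, Optional


class TinyCache:
    """Hypothetical, simplified cache; not the datamint CacheManager."""

    def __init__(self, entity_type: str, cache_root: Path) -> None:
        # One subdirectory per entity type, e.g. cache_root/resources/.
        self.root = Path(cache_root) / entity_type

    @staticmethod
    def _hash(version_info: Optional[dict]) -> Optional[str]:
        if version_info is None:
            return None
        encoded = json.dumps(version_info, sort_keys=True).encode()
        return hashlib.sha256(encoded).hexdigest()

    def set(self, entity_id: str, data_key: str, data: Any,
            version_info: Optional[dict] = None) -> None:
        entity_dir = self.root / entity_id
        entity_dir.mkdir(parents=True, exist_ok=True)
        (entity_dir / f"{data_key}.pkl").write_bytes(pickle.dumps(data))
        # Record the version hash next to the data so get() can validate it.
        meta_path = entity_dir / "metadata.json"
        meta = json.loads(meta_path.read_text()) if meta_path.exists() else {}
        meta[data_key] = {"version_hash": self._hash(version_info)}
        meta_path.write_text(json.dumps(meta))

    def get(self, entity_id: str, data_key: str,
            version_info: Optional[dict] = None) -> Any:
        entity_dir = self.root / entity_id
        data_path = entity_dir / f"{data_key}.pkl"
        meta_path = entity_dir / "metadata.json"
        if not (data_path.exists() and meta_path.exists()):
            return None  # cache miss
        entry = json.loads(meta_path.read_text()).get(data_key)
        if entry is None:
            return None  # no metadata recorded for this key
        # Validate only when the caller supplies server version info.
        if version_info is not None and entry["version_hash"] != self._hash(version_info):
            return None  # cached copy is stale
        return pickle.loads(data_path.read_bytes())


with tempfile.TemporaryDirectory() as tmp:
    cache = TinyCache("resources", Path(tmp))
    cache.set("res-1", "image_data", b"pixels", version_info={"etag": "v1"})
    hit = cache.get("res-1", "image_data", version_info={"etag": "v1"})
    stale = cache.get("res-1", "image_data", version_info={"etag": "v2"})
    miss = cache.get("res-2", "image_data")

print(hit, stale, miss)
```

Storing a hash of the server's version information next to the data is one simple way to detect stale entries without comparing the payload itself.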

class datamint.entities.Channel(**data)

Bases: BaseEntity

Represents a channel containing multiple resources.

A channel is a collection of resources grouped together, typically for batch processing or organization purposes.

channel_name

Name identifier for the channel.

resource_data

List of resources contained in this channel.

deleted

Whether the channel has been marked as deleted.

created_at

Timestamp when the channel was created.

updated_at

Timestamp when the channel was last updated.

Parameters:

data (Any)

channel_name: str
created_at: str | None
deleted: bool
get_resource_ids()

Get list of all resource IDs in this channel.

Return type:

list[str]

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

resource_data: list[ChannelResourceData]
updated_at: str | None
class datamint.entities.ChannelResourceData(**data)

Bases: BaseModel

Represents resource data within a channel.

created_by

Email of the user who created the resource.

customer_id

UUID of the customer.

resource_id

UUID of the resource.

resource_file_name

Original filename of the resource.

resource_mimetype

MIME type of the resource.

Parameters:

data (Any)

created_by: str
customer_id: str
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

resource_file_name: str
resource_id: str
resource_mimetype: str
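
The relationship between Channel and ChannelResourceData, and what get_resource_ids() returns, can be pictured with two stand-in dataclasses. Both classes below are hypothetical simplifications carrying only a subset of the documented fields, and the list comprehension is an assumption about how the IDs are collected.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChannelResourceDataSketch:
    """Hypothetical stand-in with a subset of the documented fields."""
    resource_id: str
    resource_file_name: str


@dataclass
class ChannelSketch:
    """Hypothetical stand-in for Channel; not the datamint class."""
    channel_name: str
    resource_data: List[ChannelResourceDataSketch] = field(default_factory=list)

    def get_resource_ids(self) -> List[str]:
        # Collect the ID of every resource entry, preserving order.
        return [item.resource_id for item in self.resource_data]


channel = ChannelSketch(
    "ct-uploads",
    [
        ChannelResourceDataSketch("id-1", "a.dcm"),
        ChannelResourceDataSketch("id-2", "b.dcm"),
    ],
)
resource_ids = channel.get_resource_ids()
print(resource_ids)
```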
class datamint.entities.DatasetInfo(*, id, name, created_at, created_by, description, customer_id, updated_at, total_resource, resource_ids, **data)

Bases: BaseEntity

Pydantic Model representing a DataMint dataset.

This class provides access to dataset information and related entities like resources and projects.

created_at: str
created_by: str
customer_id: str
description: str
get_projects(api=None, refresh=False)

Get all projects associated with this dataset.

Results are cached after the first call unless refresh=True.

Parameters:
  • api (Api | None)

  • refresh (bool) – If True, bypass cache and fetch fresh data

Return type:

Sequence[Project]

Returns:

List of Project instances

Raises:

RuntimeError – If no API client is available

Example

>>> dataset = api.datasetsinfo.get_by_id("dataset-id")
>>> projects = dataset.get_projects()

get_resources(refresh=False, limit=None)

Get all resources in this dataset.

Results are cached after the first call unless refresh=True.

Parameters:
  • api – Optional API client. Uses the one from set_api() if not provided.

  • refresh (bool) – If True, bypass cache and fetch fresh data

  • limit (int | None)

Return type:

Sequence[Resource]

Returns:

List of Resource instances in this dataset

Raises:

RuntimeError – If no API client is available

Example

>>> dataset = api._datasetsinfo.get_by_id("dataset-id")
>>> dataset.set_api(api)
>>> resources = dataset.get_resources()

id: str
invalidate_cache()

Invalidate all cached relationship data.

This forces fresh data fetches on the next access.

Return type:

None

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

name: str
resource_ids: list[str]
total_resource: int
updated_at: str | None
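
The caching behaviour documented for get_resources(), get_projects(), and invalidate_cache() (cache after the first call, bypass with refresh=True, clear on invalidation) follows a common pattern that can be sketched without the library. DatasetSketch and fake_fetch below are hypothetical; the real class fetches through the API client rather than through an injected callable.

```python
from typing import Callable, List, Optional


class DatasetSketch:
    """Hypothetical stand-in showing the caching pattern only."""

    def __init__(self, fetch_resources: Callable[[], List[str]]) -> None:
        self._fetch_resources = fetch_resources
        self._resources_cache: Optional[List[str]] = None

    def get_resources(self, refresh: bool = False) -> List[str]:
        # Fetch on the first call, or whenever the caller asks for fresh data.
        if refresh or self._resources_cache is None:
            self._resources_cache = self._fetch_resources()
        return self._resources_cache

    def invalidate_cache(self) -> None:
        # Drop the cached list so the next access fetches again.
        self._resources_cache = None


calls: List[int] = []


def fake_fetch() -> List[str]:
    """Stand-in for a server round trip; records each invocation."""
    calls.append(1)
    return ["res-1", "res-2"]


dataset = DatasetSketch(fake_fetch)
dataset.get_resources()              # first call performs a fetch
dataset.get_resources()              # served from the cache
dataset.get_resources(refresh=True)  # bypasses the cache
dataset.invalidate_cache()
dataset.get_resources()              # fetches again after invalidation
fetch_count = len(calls)
print(fetch_count)
```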
class datamint.entities.Project(**data)

Bases: BaseEntity

Pydantic Model representing a DataMint project.

This class models a project entity from the DataMint API, containing information about the project, its dataset, worklist, AI model, and annotation statistics.

id

Unique identifier for the project

name

Human-readable name of the project

description

Optional description of the project

created_at

ISO timestamp when the project was created

created_by

Email of the user who created the project

dataset_id

ID of the associated dataset

worklist_id

ID of the associated worklist

ai_model_id

Optional ID of the associated AI model

viewable_ai_segs

Optional configuration for viewable AI segments

editable_ai_segs

Optional configuration for editable AI segments

archived

Whether the project is archived

resource_count

Total number of resources in the project

annotated_resource_count

Number of resources that have been annotated

most_recent_experiment

Optional information about the most recent experiment

closed_resources_count

Number of resources marked as closed/completed

resources_to_annotate_count

Number of resources still needing annotation

annotators

List of annotators assigned to this project

Parameters:

data (Any)

ai_model_id: str | None
annotated_resource_count: int
annotators: list[dict]
archived: bool
archived_by: str | None
archived_on: str | None
as_torch_dataset(root_dir=None, auto_update=True, return_as_semantic_segmentation=False)
Parameters:
  • root_dir (str | None)

  • auto_update (bool)

  • return_as_semantic_segmentation (bool)

closed_resources_count: int
created_at: str
created_by: str
dataset_id: str
description: str | None
editable_ai_segs: list | None
fetch_resources()

Fetch resources associated with this project from the API. Important: this method always fetches fresh data from the server.

Return type:

Sequence[Resource]

Returns:

List of Resource instances associated with the project.

id: str
is_active_learning: bool
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

most_recent_experiment: str | None
name: str
registered_model: Any | None
require_review: bool
resource_count: int
resources_to_annotate_count: int
set_work_status(resource, status)

Set the status of a resource.

Parameters:
  • resource (Resource) – The resource's unique ID or a Resource object.

  • status (Literal['opened', 'annotated', 'closed']) – The new status to set.

Return type:

None

show()

Open the project in the default web browser.

Return type:

None

two_up_display: bool
property url: str

Get the URL to access this project in the DataMint web application.

viewable_ai_segs: list | None
worklist_id: str
class datamint.entities.Resource(*, id, resource_uri, storage, location, upload_channel, filename, modality, mimetype, size, upload_mechanism, customer_id, status, created_at, created_by, published, deleted, source_filepath, metadata, projects='MISSING_FIELD', published_on, published_by, tags=None, publish_transforms=None, deleted_at=None, deleted_by=None, instance_uid=None, series_uid=None, study_uid=None, patient_id=None, segmentations=None, measurements=None, categories=None, user_info=None, **data)

Bases: BaseEntity

Represents a DataMint resource with all its properties and metadata.

This class models a resource entity from the DataMint API, containing information about uploaded files, their metadata, and associated projects.

id

Unique identifier for the resource

resource_uri

URI path to access the resource file

storage

Storage type (e.g., ‘DicomResource’)

location

Storage location path

upload_channel

Channel used for upload (e.g., ‘tmp’)

filename

Original filename of the resource

modality

Medical imaging modality

mimetype

MIME type of the file

size

File size in bytes

upload_mechanism

Mechanism used for upload (e.g., ‘api’)

customer_id

Customer/organization identifier

status

Current status of the resource

created_at

ISO timestamp when resource was created

created_by

Email of the user who created the resource

published

Whether the resource is published

published_on

ISO timestamp when resource was published

published_by

Email of the user who published the resource

publish_transforms

Optional publication transforms

deleted

Whether the resource is deleted

deleted_at

Optional ISO timestamp when resource was deleted

deleted_by

Optional email of the user who deleted the resource

metadata

Resource metadata with DICOM information

source_filepath

Original source file path

tags

List of tags associated with the resource

instance_uid

DICOM SOP Instance UID (top-level)

series_uid

DICOM Series Instance UID (top-level)

study_uid

DICOM Study Instance UID (top-level)

patient_id

Patient identifier (top-level)

segmentations

Optional segmentation data

measurements

Optional measurement data

categories

Optional category data

labels

List of labels associated with the resource

user_info

Information about the user who created the resource

projects

List of projects this resource belongs to

__repr__()

Detailed string representation of the resource.

Return type:

str

Returns:

Detailed string representation for debugging

__str__()

String representation of the resource.

Return type:

str

Returns:

Human-readable string describing the resource

categories: Any | None
created_at: str
created_by: str
customer_id: str
deleted: bool
deleted_at: str | None
deleted_by: str | None
fetch_annotations(annotation_type=None)

Get annotations associated with this resource.

Parameters:

annotation_type (AnnotationType | str | None)

Return type:

Sequence[Annotation]

fetch_file_data(auto_convert=True, save_path=None, use_cache=False)

Get the file data for this resource.

This method automatically caches the file data locally. On subsequent calls, it checks the server for changes and uses cached data if unchanged.

Parameters:
  • use_cache (bool) – If True, uses cached data when available and valid

  • auto_convert (bool) – If True, automatically converts to appropriate format (pydicom.Dataset, PIL Image, etc.)

  • save_path (str | None) – Optional path to save the file locally

Return type:

bytes | pydicom.dataset.Dataset | Image.Image | cv2.VideoCapture | nib_FileBasedImage

Returns:

File data (format depends on auto_convert and file type)

filename: str
id: str
instance_uid: str | None
invalidate_cache()

Invalidate cached data for this resource.

Return type:

None

is_dicom()

Check if the resource is a DICOM file.

Return type:

bool

Returns:

True if the resource is a DICOM file, False otherwise

location: str
measurements: Any | None
metadata: dict
mimetype: str
modality: str
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

patient_id: str | None
projects: list[dict]
publish_transforms: Any | None
published: bool
published_by: str | None
published_on: str | None
resource_uri: str
segmentations: Any | None
series_uid: str | None
show()

Open the resource in the default web browser.

Return type:

None

size: int
property size_mb: float

Get file size in megabytes.

Returns:

File size in MB rounded to 2 decimal places

source_filepath: str | None
status: str
storage: str
study_uid: str | None
tags: list[str] | None
upload_channel: str
upload_mechanism: str
property url: str

Get the URL to access this resource in the DataMint web application.

user_info: dict | None
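
Two of the Resource helpers described above, size_mb and is_dicom(), are simple derived values and can be sketched with a stand-in dataclass. ResourceSketch is hypothetical, and two details are assumptions rather than documented behaviour: that one megabyte is taken as 1024 * 1024 bytes, and that DICOM files are recognized by the MIME type "application/dicom".

```python
from dataclasses import dataclass


@dataclass
class ResourceSketch:
    """Hypothetical stand-in for Resource; not the datamint class."""
    filename: str
    mimetype: str
    size: int  # file size in bytes

    @property
    def size_mb(self) -> float:
        # Assumption: 1 MB = 1024 * 1024 bytes; the library may use 10**6.
        return round(self.size / (1024 * 1024), 2)

    def is_dicom(self) -> bool:
        # Assumption: DICOM files are identified by this MIME type.
        return self.mimetype == "application/dicom"


resource = ResourceSketch("scan.dcm", "application/dicom", 5_242_880)
print(resource.size_mb, resource.is_dicom())
```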
class datamint.entities.User(**data)

Bases: BaseEntity

User entity model.

email

User email address (unique identifier in most cases).

firstname

First name.

lastname

Last name.

roles

List of role strings assigned to the user.

customer_id

UUID of the owning customer/tenant.

created_at

ISO 8601 timestamp of creation.

Parameters:

data (Any)

created_at: str
customer_id: str
email: str
firstname: str | None
lastname: str | None
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(_BaseEntity__context)

Handle unknown fields by logging a warning once per class/field combination in debug mode.

Parameters:

_BaseEntity__context (Any)

Return type:

None

roles: list[str]