Skip to content

Loaders API

pictologics.loader

Image Loading Module

This module handles the loading of medical images from various formats (NIfTI, DICOM) into a standardized Image class. It abstracts away file format differences to provide a consistent interface for the rest of the library.

Key Features:

  • Unified Image Class: Stores 3D data, spacing, origin, direction, and modality.
  • Format Support:
    • NIfTI (.nii, .nii.gz) via nibabel.
    • DICOM Series (directory of DICOM files) via pydicom.
    • Single DICOM files.
  • Automatic Detection: load_image automatically detects format and dimensionality.
  • Robust DICOM Sorting: Sorts slices based on spatial position and orientation.

Axis Conventions:

All image arrays are stored in (X, Y, Z) order to match ITK/SimpleITK conventions:

  • X (axis 0): Left-Right direction (columns in DICOM terminology)
  • Y (axis 1): Anterior-Posterior direction (rows in DICOM terminology)
  • Z (axis 2): Superior-Inferior direction (slices)

This differs from raw DICOM and matplotlib conventions:

  • DICOM pixel_array: Returns (Rows, Columns) = (Y, X) for 2D slices
  • Matplotlib imshow: Expects (height, width) = (Y, X)

The loaders handle the necessary axis transformations automatically. When using visualization utilities like visualize_mask_overlay(), slices are internally transposed for correct display.

Image dataclass

A standardized container for 3D medical image data and metadata.

This class serves as the common interface for all image processing operations in the library, abstracting away the differences between file formats like DICOM and NIfTI.

Attributes:

Name Type Description
array NDArray[floating[Any]]

The 3D image data with shape (x, y, z).

spacing tuple[float, float, float]

Voxel spacing in millimeters (mm) along the (x, y, z) axes.

origin tuple[float, float, float]

World coordinates of the image origin (center of the first voxel) in millimeters (mm).

direction Optional[NDArray[floating[Any]]]

3x3 direction cosine matrix defining the orientation of the image axes in world space. Defaults to identity matrix.

modality str

The imaging modality (e.g., 'CT', 'MR', 'PT'). Defaults to 'Unknown'.

source_mask Optional[NDArray[bool_]]

Optional boolean mask indicating which voxels contain valid source data (True) vs sentinel/invalid values (False). When set, preprocessing operations like resampling and filtering will exclude invalid voxels from interpolation/convolution to prevent sentinel value contamination. If None, all voxels are assumed to contain valid data (traditional behavior).

Source code in pictologics/loader.py
@dataclass
class Image:
    """
    A standardized container for 3D medical image data and metadata.

    This class serves as the common interface for all image processing operations
    in the library, abstracting away the differences between file formats like
    DICOM and NIfTI.

    Attributes:
        array (npt.NDArray[np.floating[Any]]): The 3D image data with shape (x, y, z).
        spacing (tuple[float, float, float]): Voxel spacing in millimeters (mm)
            along the (x, y, z) axes.
        origin (tuple[float, float, float]): World coordinates of the image origin
            (center of the first voxel) in millimeters (mm).
        direction (Optional[npt.NDArray[np.floating[Any]]]): 3x3 direction cosine matrix defining the
            orientation of the image axes in world space. Defaults to identity matrix.
        modality (str): The imaging modality (e.g., 'CT', 'MR', 'PT'). Defaults to 'Unknown'.
        source_mask (Optional[npt.NDArray[np.bool_]]): Optional boolean mask indicating
            which voxels contain valid source data (True) vs sentinel/invalid values (False).
            When set, preprocessing operations like resampling and filtering will exclude
            invalid voxels from interpolation/convolution to prevent sentinel value contamination.
            If None, all voxels are assumed to contain valid data (traditional behavior).
    """

    array: npt.NDArray[np.floating[Any]]
    spacing: tuple[float, float, float]
    origin: tuple[float, float, float]
    direction: Optional[npt.NDArray[np.floating[Any]]] = None
    modality: str = "Unknown"
    source_mask: Optional[npt.NDArray[np.bool_]] = None

    @property
    def has_source_mask(self) -> bool:
        """Whether this image has a source validity mask (indicating sentinel values were excluded)."""
        return self.source_mask is not None

    def with_source_mask(
        self,
        mask: "npt.NDArray[np.bool_] | npt.NDArray[np.integer[Any]] | Image",
    ) -> "Image":
        """
        Return a copy of this image with a source validity mask applied.

        The source mask indicates which voxels contain valid data (True) vs
        sentinel/invalid values (False). When set, spatial operations like
        resampling and filtering will exclude invalid voxels.

        Args:
            mask: Boolean array, integer array (>0 = valid), or Image object.
                  Must have the same shape as the image array.

        Returns:
            New Image with source_mask set.

        Raises:
            ValueError: If mask shape doesn't match image shape.

        Example:
            ```python
            from pictologics.loader import load_image

            image = load_image("image_with_sentinel.nii.gz")
            roi_mask = load_image("roi_mask.nii.gz")

            # Use ROI mask as source validity mask
            image_with_source = image.with_source_mask(roi_mask)

            # Now resampling will exclude sentinel voxels
            from pictologics.preprocessing import resample_image
            resampled = resample_image(image_with_source, new_spacing=(1, 1, 1))
            ```
        """
        if isinstance(mask, Image):
            # Explicit cast to bool to satisfy type checker
            mask_arr = (mask.array > 0).astype(bool)
        else:
            mask_arr = (mask > 0).astype(bool)

        if mask_arr.shape != self.array.shape:
            raise ValueError(
                f"Source mask shape {mask_arr.shape} must match image shape {self.array.shape}"
            )

        return Image(
            array=self.array.copy(),
            spacing=self.spacing,
            origin=self.origin,
            direction=(
                self.direction if self.direction is None else self.direction.copy()
            ),
            modality=self.modality,
            source_mask=mask_arr.astype(bool),
        )

has_source_mask property

Whether this image has a source validity mask (indicating sentinel values were excluded).

with_source_mask(mask)

Return a copy of this image with a source validity mask applied.

The source mask indicates which voxels contain valid data (True) vs sentinel/invalid values (False). When set, spatial operations like resampling and filtering will exclude invalid voxels.

Parameters:

Name Type Description Default
mask 'npt.NDArray[np.bool_] | npt.NDArray[np.integer[Any]] | Image'

Boolean array, integer array (>0 = valid), or Image object. Must have the same shape as the image array.

required

Returns:

Type Description
'Image'

New Image with source_mask set.

Raises:

Type Description
ValueError

If mask shape doesn't match image shape.

Example
from pictologics.loader import load_image

image = load_image("image_with_sentinel.nii.gz")
roi_mask = load_image("roi_mask.nii.gz")

# Use ROI mask as source validity mask
image_with_source = image.with_source_mask(roi_mask)

# Now resampling will exclude sentinel voxels
from pictologics.preprocessing import resample_image
resampled = resample_image(image_with_source, new_spacing=(1, 1, 1))
Source code in pictologics/loader.py
def with_source_mask(
    self,
    mask: "npt.NDArray[np.bool_] | npt.NDArray[np.integer[Any]] | Image",
) -> "Image":
    """
    Return a copy of this image with a source validity mask applied.

    The source mask indicates which voxels contain valid data (True) vs
    sentinel/invalid values (False). When set, spatial operations like
    resampling and filtering will exclude invalid voxels.

    Args:
        mask: Boolean array, integer array (>0 = valid), or Image object.
              Must have the same shape as the image array.

    Returns:
        New Image with source_mask set.

    Raises:
        ValueError: If mask shape doesn't match image shape.

    Example:
        ```python
        from pictologics.loader import load_image

        image = load_image("image_with_sentinel.nii.gz")
        roi_mask = load_image("roi_mask.nii.gz")

        # Use ROI mask as source validity mask
        image_with_source = image.with_source_mask(roi_mask)

        # Now resampling will exclude sentinel voxels
        from pictologics.preprocessing import resample_image
        resampled = resample_image(image_with_source, new_spacing=(1, 1, 1))
        ```
    """
    if isinstance(mask, Image):
        # Explicit cast to bool to satisfy type checker
        mask_arr = (mask.array > 0).astype(bool)
    else:
        mask_arr = (mask > 0).astype(bool)

    if mask_arr.shape != self.array.shape:
        raise ValueError(
            f"Source mask shape {mask_arr.shape} must match image shape {self.array.shape}"
        )

    return Image(
        array=self.array.copy(),
        spacing=self.spacing,
        origin=self.origin,
        direction=(
            self.direction if self.direction is None else self.direction.copy()
        ),
        modality=self.modality,
        source_mask=mask_arr.astype(bool),
    )

load_image(path, dataset_index=0, recursive=False, reference_image=None, transpose_axes=None, fill_value=0.0, apply_rescale=True)

Load a medical image from a file path or directory.

This is the main entry point for loading data. It automatically detects whether the input is a NIfTI file, DICOM directory/file (single DICOM or series), or a DICOM Segmentation (SEG) object and standardizes it into an Image object.

The resulting image array is always 3D with dimensions (x, y, z).

Note

For DICOM SEG files, this function uses :func:pictologics.loaders.load_seg internally. For more control over segment extraction (e.g., selecting specific segments or extracting them separately), use load_seg() directly.

Parameters:

Name Type Description Default
path str

The absolute or relative path to the image file (e.g., .nii.gz, .dcm or file with no extension) or the directory containing DICOM files.

required
dataset_index int

For multi-volume datasets, specifies which volume to extract (0-indexed). This works for:

  • 4D NIfTI files: Selects which time point/volume to load.
  • Multi-phase DICOM series: Selects which phase to load (e.g., cardiac phases, temporal positions, echo numbers). Use :func:pictologics.utilities.get_dicom_phases to discover available phases.

Defaults to 0 (the first volume/phase).

0
recursive bool

If True and path is a directory, recursively searches subdirectories and loads the DICOM series from the folder containing the most DICOM files. Defaults to False.

False
reference_image Optional[Image]

If provided and the loaded image has different dimensions than the reference, it will be repositioned into the reference coordinate space using spatial metadata (origin, spacing). This is useful for loading cropped segmentation masks that need to match a full-sized image.

None
transpose_axes tuple[int, int, int] | None

Optional axis transposition to apply before repositioning. Use this if the mask's axis order differs from the reference. E.g., (0, 2, 1) swaps Y and Z axes. Only used when reference_image is provided.

None
fill_value float

Fill value for regions outside the loaded image when repositioning (default: 0.0). Only used when reference_image is provided.

0.0
apply_rescale bool

If True (default), apply RescaleSlope and RescaleIntercept transformation for DICOM files to convert stored pixel values to real-world values (e.g., Hounsfield Units for CT). NIfTI files always apply their scaling factors via nibabel's get_fdata(). Set to False if you need raw stored values.

True

Returns:

Name Type Description
Image Image

An Image object containing the 3D numpy array and metadata (spacing, origin, etc.).

Raises:

Type Description
ValueError

If the path does not exist, the file format is not supported, or the file is corrupt/unreadable.

Example

Loading a NIfTI file:

from pictologics.loader import load_image

# Load a standard brain scan
img = load_image("data/brain.nii.gz")
print(f"Image shape: {img.array.shape}")
# Output: Image shape: (256, 256, 128)

Loading a DICOM series:

# Load a CT scan from a folder of DICOM files
img_ct = load_image("data/patients/001/CT_scan/")
print(f"Voxel spacing: {img_ct.spacing}")
# Output: Voxel spacing: (0.97, 0.97, 2.5)

Loading a single DICOM file:

# Load a single DICOM file (even without .dcm extension)
img_slice = load_image("data/slice_001")
print(f"Modality: {img_slice.modality}")

Recursive DICOM loading:

# Finds the deep subfolder with actual DICOM files
img = load_image("data/patients/001/", recursive=True)

Loading a specific volume from a 4D file:

# Load the 5th time point from a 4D fMRI file
fmri_vol = load_image("data/fmri.nii.gz", dataset_index=4)

Loading a cropped mask and repositioning to match main image:

main_img = load_image("ct_scan/")
mask = load_image("cropped_mask.dcm", reference_image=main_img)
# mask now has same shape as main_img

Loading a DICOM SEG file (auto-detected):

# DICOM SEG files are automatically detected and loaded
seg = load_image("segmentation.dcm")
print(f"Modality: {seg.modality}")  # Output: Modality: SEG
# Segments are combined into a label image by default

Loading a specific phase from a multi-phase DICOM series:

from pictologics.utilities import get_dicom_phases

# Discover available phases
phases = get_dicom_phases("cardiac_ct/")
print(f"Found {len(phases)} phases")
for p in phases:
    print(f"  {p.index}: {p.label} ({p.num_slices} slices)")

# Load the 5th phase (40%)
img = load_image("cardiac_ct/", dataset_index=4)

Source code in pictologics/loader.py
def load_image(
    path: str,
    dataset_index: int = 0,
    recursive: bool = False,
    reference_image: Optional[Image] = None,
    transpose_axes: tuple[int, int, int] | None = None,
    fill_value: float = 0.0,
    apply_rescale: bool = True,
) -> Image:
    """
    Load a medical image from a file path or directory.

    This is the main entry point for loading data. It automatically detects whether
    the input is a NIfTI file, DICOM directory/file (single DICOM or series), or
    a DICOM Segmentation (SEG) object and standardizes it into an `Image` object.

    The resulting image array is always 3D with dimensions (x, y, z).

    Note:
        For DICOM SEG files, this function uses :func:`pictologics.loaders.load_seg`
        internally. For more control over segment extraction (e.g., selecting specific
        segments or extracting them separately), use ``load_seg()`` directly.

    Args:
        path (str): The absolute or relative path to the image file (e.g., .nii.gz,
            .dcm or file with no extension) or the directory containing DICOM files.
        dataset_index (int, optional): For multi-volume datasets, specifies which
            volume to extract (0-indexed). This works for:

            - **4D NIfTI files**: Selects which time point/volume to load.
            - **Multi-phase DICOM series**: Selects which phase to load (e.g., cardiac
              phases, temporal positions, echo numbers). Use
              :func:`pictologics.utilities.get_dicom_phases` to discover available phases.

            Defaults to 0 (the first volume/phase).
        recursive (bool, optional): If True and `path` is a directory, recursively searches
            subdirectories and loads the DICOM series from the folder containing the most
            DICOM files. Defaults to False.
        reference_image (Optional[Image]): If provided and the loaded image has different
            dimensions than the reference, it will be repositioned into the reference
            coordinate space using spatial metadata (origin, spacing). This is useful for
            loading cropped segmentation masks that need to match a full-sized image.
        transpose_axes (tuple[int, int, int] | None): Optional axis transposition to apply
            before repositioning. Use this if the mask's axis order differs from the reference.
            E.g., (0, 2, 1) swaps Y and Z axes. Only used when reference_image is provided.
        fill_value (float): Fill value for regions outside the loaded image when
            repositioning (default: 0.0). Only used when reference_image is provided.
        apply_rescale (bool): If True (default), apply RescaleSlope and RescaleIntercept
            transformation for DICOM files to convert stored pixel values to real-world
            values (e.g., Hounsfield Units for CT). NIfTI files always apply their scaling
            factors via nibabel's get_fdata(). Set to False if you need raw stored values.

    Returns:
        Image: An `Image` object containing the 3D numpy array and metadata (spacing, origin, etc.).

    Raises:
        ValueError: If the path does not exist, the file format is not supported,
            or the file is corrupt/unreadable.

    Example:
        **Loading a NIfTI file:**
        ```python
        from pictologics.loader import load_image

        # Load a standard brain scan
        img = load_image("data/brain.nii.gz")
        print(f"Image shape: {img.array.shape}")
        # Output: Image shape: (256, 256, 128)
        ```

        **Loading a DICOM series:**
        ```python
        # Load a CT scan from a folder of DICOM files
        img_ct = load_image("data/patients/001/CT_scan/")
        print(f"Voxel spacing: {img_ct.spacing}")
        # Output: Voxel spacing: (0.97, 0.97, 2.5)
        ```

        **Loading a single DICOM file:**
        ```python
        # Load a single DICOM file (even without .dcm extension)
        img_slice = load_image("data/slice_001")
        print(f"Modality: {img_slice.modality}")
        ```

        **Recursive DICOM loading:**
        ```python
        # Finds the deep subfolder with actual DICOM files
        img = load_image("data/patients/001/", recursive=True)
        ```

        **Loading a specific volume from a 4D file:**
        ```python
        # Load the 5th time point from a 4D fMRI file
        fmri_vol = load_image("data/fmri.nii.gz", dataset_index=4)
        ```

        **Loading a cropped mask and repositioning to match main image:**
        ```python
        main_img = load_image("ct_scan/")
        mask = load_image("cropped_mask.dcm", reference_image=main_img)
        # mask now has same shape as main_img
        ```

        **Loading a DICOM SEG file (auto-detected):**
        ```python
        # DICOM SEG files are automatically detected and loaded
        seg = load_image("segmentation.dcm")
        print(f"Modality: {seg.modality}")  # Output: Modality: SEG
        # Segments are combined into a label image by default
        ```

        **Loading a specific phase from a multi-phase DICOM series:**
        ```python
        from pictologics.utilities import get_dicom_phases

        # Discover available phases
        phases = get_dicom_phases("cardiac_ct/")
        print(f"Found {len(phases)} phases")
        for p in phases:
            print(f"  {p.index}: {p.label} ({p.num_slices} slices)")

        # Load the 5th phase (40%)
        img = load_image("cardiac_ct/", dataset_index=4)
        ```
    """
    path_obj = Path(path)
    if not path_obj.exists():
        raise ValueError(f"The specified path does not exist: {path}")

    try:
        if path_obj.is_dir():
            target_path = path_obj
            if recursive:
                target_path = _find_best_dicom_series_dir(path_obj)
            loaded_image = _load_dicom_series(target_path, dataset_index, apply_rescale)
        elif path.lower().endswith((".nii", ".nii.gz")):
            loaded_image = _load_nifti(path, dataset_index)
        else:
            # Attempt to load as a single DICOM file if extension is not NIfTI
            # Check if it's a DICOM SEG file first
            if _is_dicom_seg(path):
                from pictologics.loaders.seg_loader import load_seg

                seg_result = load_seg(path, reference_image=reference_image)
                # load_seg can return dict when combine_segments=False, but here we use default
                if isinstance(seg_result, dict):
                    # Should not happen with default args, but handle gracefully
                    return next(iter(seg_result.values()))
                # Return early since reference alignment is handled by load_seg
                return seg_result

            try:
                loaded_image = _load_dicom_file(path, apply_rescale)
            except Exception:
                raise ValueError(
                    f"Unsupported file format or unable to read file: {path}"
                ) from None
    except Exception as e:
        # Re-raise ValueErrors directly, wrap others
        if isinstance(e, ValueError):
            raise e
        raise ValueError(f"Failed to load image from '{path}': {e}") from e

    # Apply repositioning if reference_image is provided and shapes differ
    if reference_image is not None:
        if loaded_image.array.shape != reference_image.array.shape:
            loaded_image = _position_in_reference(
                loaded_image, reference_image, fill_value, transpose_axes
            )

    return loaded_image

load_and_merge_images(image_paths, reference_image=None, conflict_resolution='max', dataset_index=0, recursive=False, binarize=None, reposition_to_reference=False, transpose_axes=None, fill_value=0.0, relabel_masks=False, apply_rescale=True)

Load multiple images (e.g., masks or partial scans) and merge them into a single image.

This function loads images from the provided paths, validates that they all share the same geometry (dimensions, spacing, origin, direction), and merges them according to the specified conflict resolution strategy.

Use Cases: - Merging multiple segmentation masks into a single ROI. - Merging split image volumes (though typically less common than mask merging). - Merging cropped/bounding-box segmentation masks (with reposition_to_reference=True).

Format & Path Support: Since this function uses load_image internally for each path, it supports: - NIfTI files (.nii, .nii.gz). - DICOM series (directories containing DICOM files). - Single DICOM files (with or without .dcm extension). - Nested directories (if paths point to folders containing DICOMs).

Parameters:

Name Type Description Default
image_paths list[str]

List of absolute or relative paths to the images. These can be file paths or directory paths.

required
reference_image Optional[Image]

An optional reference image (e.g., the scan corresponding to the masks). If provided, the merged image is validated against this image's geometry. Required when reposition_to_reference=True.

None
conflict_resolution str

Strategy to resolve voxel values when multiple images have non-zero values at the same location. Options: - 'max': Use the maximum value (default). - 'min': Use the minimum value. - 'first': Keep the value from the first image encountered (earlier in list). - 'last': Overwrite with the value from the last image encountered (later in list).

'max'
dataset_index int

For multi-volume datasets, specifies which volume to extract for all images (0-indexed). This works for:

  • 4D NIfTI files: Selects which time point/volume to load.
  • Multi-phase DICOM series: Selects which phase to load (e.g., cardiac phases, temporal positions, echo numbers). Use :func:pictologics.utilities.get_dicom_phases to discover available phases.

Defaults to 0 (the first volume/phase).

0
recursive bool

If True, recursively searches subdirectories for each path in image_paths. Defaults to False.

False
binarize bool | int | list[int] | tuple[int, int] | None

Rules for binarizing the merged image. - None (default): No binarization. - True: Sets all voxels > 0 to 1, others to 0. - int (e.g., 2): Sets voxels == value to 1, others to 0. - list[int] (e.g., [1, 2]): Sets voxels in list to 1, others to 0. - tuple[int, int] (e.g., (1, 10)): Sets voxels in inclusive range to 1, others to 0.

None
reposition_to_reference bool

If True and reference_image is provided, each loaded image will be repositioned into the reference coordinate space before merging. This is required when loading cropped segmentation masks that have different dimensions than the reference. Geometry validation is performed AFTER repositioning. Defaults to False.

False
transpose_axes tuple[int, int, int] | None

Axis transposition to apply when repositioning. E.g., (0, 2, 1) swaps Y and Z axes. Only used when reposition_to_reference=True.

None
fill_value float

Fill value for regions outside cropped masks when repositioning (default: 0.0). Only used when reposition_to_reference=True.

0.0
relabel_masks bool

If True, assigns unique label values (1, 2, 3, ...) to each mask file based on its order in image_paths. This converts binary [0,1] masks into multi-label masks where each file gets a distinct label, useful for visualization with different colors. Label assignment respects the order of image_paths. Defaults to False.

False
apply_rescale bool

If True (default), apply RescaleSlope and RescaleIntercept transformation for DICOM files to convert stored pixel values to real-world values (e.g., Hounsfield Units for CT). Set to False if you need raw stored values.

True
Note

The binarize parameter is intended for mask filtering (e.g., selecting specific ROI labels). To filter image intensity values (e.g., HU ranges), use the preprocessing steps in the radiomics pipeline configuration instead.

Returns:

Name Type Description
Image Image

A new Image object containing the merged data.

Raises:

Type Description
ValueError

If image_paths is empty, if an invalid conflict_resolution is provided, if reposition_to_reference=True but reference_image is not provided, or if the images (or reference) have mismatched geometries.

Example

Merging cropped segmentation masks:

main_img = load_image("ct_scan/", recursive=True)
seg_paths = [str(f) for f in Path("masks/").glob("*.dcm")]

merged = load_and_merge_images(
    seg_paths,
    reference_image=main_img,
    reposition_to_reference=True,
    conflict_resolution="max",
)

Source code in pictologics/loader.py
def load_and_merge_images(
    image_paths: list[str],
    reference_image: Optional[Image] = None,
    conflict_resolution: str = "max",
    dataset_index: int = 0,
    recursive: bool = False,
    binarize: bool | int | list[int] | tuple[int, int] | None = None,
    reposition_to_reference: bool = False,
    transpose_axes: tuple[int, int, int] | None = None,
    fill_value: float = 0.0,
    relabel_masks: bool = False,
    apply_rescale: bool = True,
) -> Image:
    """
    Load multiple images (e.g., masks or partial scans) and merge them into a single image.

    This function loads images from the provided paths, validates that they all share
    the same geometry (dimensions, spacing, origin, direction), and merges them
    according to the specified conflict resolution strategy.

    **Use Cases:**
    - Merging multiple segmentation masks into a single ROI.
    - Merging split image volumes (though typically less common than mask merging).
    - Merging cropped/bounding-box segmentation masks (with `reposition_to_reference=True`).

    **Format & Path Support:**
    Since this function uses `load_image` internally for each path, it supports:
    - **NIfTI files** (.nii, .nii.gz).
    - **DICOM series** (directories containing DICOM files).
    - **Single DICOM files** (with or without .dcm extension).
    - **Nested directories** (if paths point to folders containing DICOMs).

    Args:
        image_paths (list[str]): List of absolute or relative paths to the images.
            These can be file paths or directory paths.
        reference_image (Optional[Image]): An optional reference image (e.g., the scan
            corresponding to the masks). If provided, the merged image is validated
            against this image's geometry. Required when `reposition_to_reference=True`.
        conflict_resolution (str): Strategy to resolve voxel values when multiple images
            have non-zero values at the same location. Options:
            - 'max': Use the maximum value (default).
            - 'min': Use the minimum value.
            - 'first': Keep the value from the first image encountered (earlier in list).
            - 'last': Overwrite with the value from the last image encountered (later in list).
        dataset_index (int, optional): For multi-volume datasets, specifies which
            volume to extract for all images (0-indexed). This works for:

            - **4D NIfTI files**: Selects which time point/volume to load.
            - **Multi-phase DICOM series**: Selects which phase to load (e.g., cardiac
              phases, temporal positions, echo numbers). Use
              :func:`pictologics.utilities.get_dicom_phases` to discover available phases.

            Defaults to 0 (the first volume/phase).
        recursive (bool, optional): If True, recursively searches subdirectories
            for each path in `image_paths`. Defaults to False.
        binarize (bool | int | list[int] | tuple[int, int] | None, optional):
            Rules for binarizing the merged image.
            - `None` (default): No binarization.
            - `True`: Sets all voxels > 0 to 1, others to 0.
            - `int` (e.g., 2): Sets voxels == value to 1, others to 0.
            - `list[int]` (e.g., [1, 2]): Sets voxels in list to 1, others to 0.
            - `tuple[int, int]` (e.g., (1, 10)): Sets voxels in inclusive range to 1, others to 0.
        reposition_to_reference (bool): If True and reference_image is provided,
            each loaded image will be repositioned into the reference coordinate
            space before merging. This is required when loading cropped segmentation
            masks that have different dimensions than the reference. Geometry validation
            is performed AFTER repositioning. Defaults to False.
        transpose_axes (tuple[int, int, int] | None): Axis transposition to apply
            when repositioning. E.g., (0, 2, 1) swaps Y and Z axes.
            Only used when `reposition_to_reference=True`.
        fill_value (float): Fill value for regions outside cropped masks when
            repositioning (default: 0.0). Only used when `reposition_to_reference=True`.
        relabel_masks (bool): If True, assigns unique label values (1, 2, 3, ...)
            to each mask file based on its order in `image_paths`. This converts
            binary [0,1] masks into multi-label masks where each file gets a
            distinct label, useful for visualization with different colors.
            Label assignment respects the order of `image_paths`. Defaults to False.
        apply_rescale (bool): If True (default), apply RescaleSlope and RescaleIntercept
            transformation for DICOM files to convert stored pixel values to real-world
            values (e.g., Hounsfield Units for CT). Set to False if you need raw stored values.

    Note:
        The `binarize` parameter is intended for **mask filtering** (e.g., selecting specific ROI labels).
        To filter image intensity values (e.g., HU ranges), use the preprocessing steps in the
        radiomics pipeline configuration instead.

    Returns:
        Image: A new `Image` object containing the merged data.

    Raises:
        ValueError: If `image_paths` is empty, if an invalid `conflict_resolution` is provided,
            if `reposition_to_reference=True` but `reference_image` is not provided,
            or if the images (or reference) have mismatched geometries.

    Example:
        **Merging cropped segmentation masks:**
        ```python
        main_img = load_image("ct_scan/", recursive=True)
        seg_paths = [str(f) for f in Path("masks/").glob("*.dcm")]

        merged = load_and_merge_images(
            seg_paths,
            reference_image=main_img,
            reposition_to_reference=True,
            conflict_resolution="max",
        )
        ```
    """
    if not image_paths:
        raise ValueError("image_paths cannot be empty.")

    valid_strategies = {"max", "min", "first", "last"}
    if conflict_resolution not in valid_strategies:
        raise ValueError(
            f"Invalid conflict_resolution '{conflict_resolution}'. "
            f"Must be one of {valid_strategies}."
        )

    if reposition_to_reference and reference_image is None:
        raise ValueError(
            "reference_image must be provided when reposition_to_reference=True."
        )

    # Geometry validation helper
    def _validate_geometry(target: Image, ref: Image, name: str, ref_name: str) -> None:
        if target.array.shape != ref.array.shape:
            raise ValueError(
                f"Dimension mismatch between {name} {target.array.shape} "
                f"and {ref_name} {ref.array.shape}."
            )
        if not np.allclose(target.spacing, ref.spacing, atol=1e-5):
            raise ValueError(
                f"Spacing mismatch between {name} {target.spacing} "
                f"and {ref_name} {ref.spacing}."
            )
        if not np.allclose(target.origin, ref.origin, atol=1e-5):
            raise ValueError(
                f"Origin mismatch between {name} {target.origin} "
                f"and {ref_name} {ref.origin}."
            )
        if target.direction is not None and ref.direction is not None:
            if not np.allclose(target.direction, ref.direction, atol=1e-5):
                raise ValueError(f"Direction mismatch between {name} and {ref_name}.")

    if reposition_to_reference:
        # Mode: Reposition each image to reference space, then merge
        assert reference_image is not None  # Already validated above

        # Initialize merged array with reference geometry
        merged_array = np.full(
            reference_image.array.shape, fill_value, dtype=np.float64
        )

        for i, path in enumerate(image_paths):
            try:
                current_image = load_image(
                    path,
                    dataset_index=dataset_index,
                    recursive=recursive,
                    apply_rescale=apply_rescale,
                )
            except Exception as e:
                raise ValueError(f"Failed to load image '{path}': {e}") from e

            # Reposition to reference space
            repositioned = _position_in_reference(
                current_image, reference_image, fill_value, transpose_axes
            )

            # Validate geometry after repositioning
            _validate_geometry(
                repositioned,
                reference_image,
                f"repositioned image '{path}'",
                "reference image",
            )

            current_array = repositioned.array

            # Apply relabeling: replace all non-zero values with mask index + 1
            if relabel_masks:
                label_value = i + 1  # 1-indexed labels
                current_array = np.where(
                    current_array != fill_value, label_value, fill_value
                )

            # Merge with conflict resolution
            if i == 0:
                # First image: just copy non-fill values
                non_fill_mask = current_array != fill_value
                merged_array[non_fill_mask] = current_array[non_fill_mask]
            else:
                # Subsequent images: apply conflict resolution
                # Overlap: non-fill in both
                overlap_mask = (merged_array != fill_value) & (
                    current_array != fill_value
                )
                # New data: fill in merged, non-fill in current
                new_data_mask = (merged_array == fill_value) & (
                    current_array != fill_value
                )

                # Apply new data
                merged_array[new_data_mask] = current_array[new_data_mask]

                # Resolve conflicts
                if np.any(overlap_mask):
                    if conflict_resolution == "max":
                        merged_array[overlap_mask] = np.maximum(
                            merged_array[overlap_mask], current_array[overlap_mask]
                        )
                    elif conflict_resolution == "min":
                        merged_array[overlap_mask] = np.minimum(
                            merged_array[overlap_mask], current_array[overlap_mask]
                        )
                    elif conflict_resolution == "last":
                        merged_array[overlap_mask] = current_array[overlap_mask]
                    elif conflict_resolution == "first":
                        pass  # Keep existing values

        # Use reference geometry for output
        consensus_spacing = reference_image.spacing
        consensus_origin = reference_image.origin
        consensus_direction = reference_image.direction

    else:
        # Mode: Standard merging with strict geometry validation
        # Load the first image to serve as the consensus geometry
        try:
            consensus_image = load_image(
                image_paths[0],
                dataset_index=dataset_index,
                recursive=recursive,
                apply_rescale=apply_rescale,
            )
        except Exception as e:
            raise ValueError(
                f"Failed to load first image '{image_paths[0]}': {e}"
            ) from e

        merged_array = consensus_image.array.astype(np.float64)

        # Apply relabeling for the first image
        if relabel_masks:
            merged_array = np.where(merged_array != 0, 1, 0).astype(merged_array.dtype)

        # Iterate through remaining images
        for idx, path in enumerate(image_paths[1:], start=2):
            try:
                current_image = load_image(
                    path,
                    dataset_index=dataset_index,
                    recursive=recursive,
                    apply_rescale=apply_rescale,
                )
            except Exception as e:
                raise ValueError(f"Failed to load image '{path}': {e}") from e

            _validate_geometry(
                current_image, consensus_image, f"image '{path}'", "consensus image"
            )

            current_array = current_image.array

            # Apply relabeling: replace all non-zero values with mask index
            if relabel_masks:
                label_value = idx  # idx starts at 2 for second file
                current_array = np.where(current_array != 0, label_value, 0).astype(
                    current_array.dtype
                )

            # Identify regions
            overlap_mask = (merged_array != 0) & (current_array != 0)
            new_data_mask = (merged_array == 0) & (current_array != 0)

            # Apply new data
            merged_array[new_data_mask] = current_array[new_data_mask]

            # Resolve conflicts
            if np.any(overlap_mask):
                if conflict_resolution == "max":
                    merged_array[overlap_mask] = np.maximum(
                        merged_array[overlap_mask], current_array[overlap_mask]
                    )
                elif conflict_resolution == "min":
                    merged_array[overlap_mask] = np.minimum(
                        merged_array[overlap_mask], current_array[overlap_mask]
                    )
                elif conflict_resolution == "last":
                    merged_array[overlap_mask] = current_array[overlap_mask]
                elif conflict_resolution == "first":
                    pass  # Already have the 'first' value

        consensus_spacing = consensus_image.spacing
        consensus_origin = consensus_image.origin
        consensus_direction = consensus_image.direction

        # Validate against reference image if provided (for non-reposition mode)
        if reference_image is not None:
            final_merged_image = Image(
                array=merged_array,
                spacing=consensus_spacing,
                origin=consensus_origin,
                direction=consensus_direction,
                modality="Image",
            )
            _validate_geometry(
                final_merged_image, reference_image, "merged image", "reference image"
            )

    # Apply binarization if requested
    if binarize is not None:
        mask_out: npt.NDArray[np.floating[Any]] = np.zeros_like(
            merged_array, dtype=np.uint8
        )
        if isinstance(binarize, bool) and binarize is True:
            mask_out[merged_array > 0] = 1
        elif isinstance(binarize, int) and not isinstance(binarize, bool):
            mask_out[merged_array == binarize] = 1
        elif isinstance(binarize, list):
            mask_out[np.isin(merged_array, binarize)] = 1
        elif isinstance(binarize, tuple) and len(binarize) == 2:
            mask_out[(merged_array >= binarize[0]) & (merged_array <= binarize[1])] = 1
        else:
            if binarize is not False:
                raise ValueError(f"Unsupported binarize value: {binarize}")
            mask_out = merged_array

        if binarize is not False:
            merged_array = mask_out.astype(np.float64)

    return Image(
        array=merged_array,
        spacing=consensus_spacing,
        origin=consensus_origin,
        direction=consensus_direction,
        modality="MergedImage",
    )

create_full_mask(reference_image, dtype=np.uint8)

Create a whole-image ROI mask matching a reference image.

This utility is primarily used when a user does not provide a segmentation mask. The returned mask has the same geometry (shape, spacing, origin, direction) as the reference image and contains a value of 1 for every voxel.

Parameters:

Name Type Description Default
reference_image Image

Image whose geometry should be copied.

required
dtype DTypeLike

Numpy dtype to use for the mask array. Defaults to np.uint8.

uint8

Returns:

Type Description
Image

An Image mask with array == 1 everywhere.

Raises:

Type Description
ValueError

If the reference image does not have a valid 3D array.

Source code in pictologics/loader.py
def create_full_mask(reference_image: Image, dtype: DTypeLike = np.uint8) -> Image:
    """Create a whole-image ROI mask matching a reference image.

    This utility is primarily used when a user does not provide a segmentation mask.
    The returned mask has the same geometry (shape, spacing, origin, direction) as
    the reference image and contains a value of 1 for every voxel.

    Args:
        reference_image: Image whose geometry should be copied.
        dtype: Numpy dtype to use for the mask array. Defaults to `np.uint8`.

    Returns:
        An `Image` mask with `array == 1` everywhere.

    Raises:
        ValueError: If the reference image does not have a valid 3D array.
    """
    if reference_image.array.ndim != 3:
        raise ValueError(
            f"reference_image.array must be 3D, got shape {reference_image.array.shape}"
        )

    mask_array = np.ones(reference_image.array.shape, dtype=dtype)
    return Image(
        array=mask_array,
        spacing=reference_image.spacing,
        origin=reference_image.origin,
        direction=reference_image.direction,
        modality="mask",
    )

pictologics.loaders.seg_loader

DICOM Segmentation (SEG) Loader

This module provides functionality for loading DICOM Segmentation objects as pictologics Image instances. SEG files are specialized DICOM objects that store segmentation masks with multi-segment support.

Uses highdicom for robust SEG parsing and extraction.

load_seg(path, segment_numbers=None, combine_segments=True, reference_image=None)

Load a DICOM SEG file as a mask Image.

This function loads a DICOM Segmentation object and converts it to the standard pictologics Image format. The resulting Image has the same structure as images returned by load_image():

  • array: npt.NDArray[np.floating[Any]] with shape (X, Y, Z)
  • spacing: tuple[float, float, float] in mm
  • origin: tuple[float, float, float] in mm
  • direction: Optional[npt.NDArray[np.floating[Any]]] - 3x3 direction cosines
  • modality: str - set to "SEG"

Parameters:

Name Type Description Default
path str | Path

Path to the DICOM SEG file.

required
segment_numbers list[int] | None

Specific segment numbers to extract. If None, all segments are extracted. Segment numbers are 1-indexed as per DICOM convention.

None
combine_segments bool

Controls how segments are returned:

  • True (default): Returns a single Image where each segment is encoded as its segment number (1, 2, 3...) in the voxel values. Background voxels are 0. This is useful when you want a single label map for visualization or when segments are mutually exclusive (e.g., organ segmentation where each voxel belongs to one structure).

  • False: Returns a dict mapping segment numbers to individual binary Image masks. Each mask contains only 0s and 1s. This is useful when:

  • Segments may overlap (e.g., nested structures like tumor within organ)

  • You need to process each segment independently (e.g., extract radiomics from each segment separately)
  • You want to select specific segments for different analyses
True
reference_image 'Image | None'

Optional reference Image for geometry alignment. When provided, the output mask will be resampled/repositioned to match the reference geometry.

None

Returns:

Type Description
'Image | dict[int, Image]'

If combine_segments is True: A single Image with segment labels.

'Image | dict[int, Image]'

If combine_segments is False: A dict of {segment_number: Image}.

Raises:

Type Description
ValueError

If the file is not a valid DICOM SEG object.

FileNotFoundError

If the file does not exist.

Example

Load a SEG file with all segments combined (label map):

from pictologics.loaders import load_seg
import numpy as np

mask = load_seg("segmentation.dcm")
print(mask.array.shape)  # (X, Y, Z)
print(np.unique(mask.array))  # [0, 1, 2, ...]

Load specific segments as separate binary masks:

masks = load_seg("segmentation.dcm", segment_numbers=[1, 2], combine_segments=False)
for seg_num, mask in masks.items():
    print(f"Segment {seg_num}: {mask.array.sum()} voxels")

Align mask to a reference CT image:

from pictologics import load_image

ct = load_image("ct_scan/")
mask = load_seg("segmentation.dcm", reference_image=ct)
assert mask.array.shape == ct.array.shape
Source code in pictologics/loaders/seg_loader.py
def load_seg(
    path: str | Path,
    segment_numbers: list[int] | None = None,
    combine_segments: bool = True,
    reference_image: "Image | None" = None,
) -> "Image | dict[int, Image]":
    """Load a DICOM SEG file as a mask Image.

    This function loads a DICOM Segmentation object and converts it to
    the standard pictologics Image format. The resulting Image has the
    same structure as images returned by load_image():

    - array: npt.NDArray[np.floating[Any]] with shape (X, Y, Z)
    - spacing: tuple[float, float, float] in mm
    - origin: tuple[float, float, float] in mm
    - direction: Optional[npt.NDArray[np.floating[Any]]] - 3x3 direction cosines
    - modality: str - set to "SEG"

    Args:
        path: Path to the DICOM SEG file.
        segment_numbers: Specific segment numbers to extract. If None, all
            segments are extracted. Segment numbers are 1-indexed as per
            DICOM convention.
        combine_segments: Controls how segments are returned:

            - **True (default)**: Returns a single Image where each segment
              is encoded as its segment number (1, 2, 3...) in the voxel values.
              Background voxels are 0. This is useful when you want a single
              label map for visualization or when segments are mutually exclusive
              (e.g., organ segmentation where each voxel belongs to one structure).

            - **False**: Returns a dict mapping segment numbers to individual
              binary Image masks. Each mask contains only 0s and 1s. This is
              useful when:

              - Segments may overlap (e.g., nested structures like tumor
                within organ)
              - You need to process each segment independently (e.g., extract
                radiomics from each segment separately)
              - You want to select specific segments for different analyses

        reference_image: Optional reference Image for geometry alignment.
            When provided, the output mask will be resampled/repositioned
            to match the reference geometry.

    Returns:
        If combine_segments is True: A single Image with segment labels.
        If combine_segments is False: A dict of {segment_number: Image}.

    Raises:
        ValueError: If the file is not a valid DICOM SEG object.
        FileNotFoundError: If the file does not exist.

    Example:
        Load a SEG file with all segments combined (label map):

        ```python
        from pictologics.loaders import load_seg
        import numpy as np

        mask = load_seg("segmentation.dcm")
        print(mask.array.shape)  # (X, Y, Z)
        print(np.unique(mask.array))  # [0, 1, 2, ...]
        ```

        Load specific segments as separate binary masks:

        ```python
        masks = load_seg("segmentation.dcm", segment_numbers=[1, 2], combine_segments=False)
        for seg_num, mask in masks.items():
            print(f"Segment {seg_num}: {mask.array.sum()} voxels")
        ```

        Align mask to a reference CT image:

        ```python
        from pictologics import load_image

        ct = load_image("ct_scan/")
        mask = load_seg("segmentation.dcm", reference_image=ct)
        assert mask.array.shape == ct.array.shape
        ```
    """
    import highdicom as hd

    from pictologics.loader import Image

    path_obj = Path(path)
    if not path_obj.exists():
        raise FileNotFoundError(f"SEG file not found: {path}")

    # Load the DICOM SEG using highdicom
    try:
        seg = hd.seg.segread(str(path_obj))
    except Exception as e:
        raise ValueError(f"Failed to load DICOM SEG file: {e}") from e

    # Verify it's a SEG object
    if not hasattr(seg, "SegmentSequence"):
        raise ValueError(f"File is not a valid DICOM SEG object: {path}")

    # Get available segment numbers
    available_segments = [s.SegmentNumber for s in seg.SegmentSequence]

    # Determine which segments to extract
    if segment_numbers is None:
        target_segments = available_segments
    else:
        # Validate requested segments exist
        for seg_num in segment_numbers:
            if seg_num not in available_segments:
                raise ValueError(
                    f"Segment {seg_num} not found. "
                    f"Available segments: {available_segments}"
                )
        target_segments = segment_numbers

    # Extract geometry information from the SEG
    spacing, origin, direction = _extract_seg_geometry(seg)

    # Extract pixel array - shape is typically (frames, rows, cols)
    pixel_array = seg.pixel_array

    # Get the number of segments and frames
    n_frames = pixel_array.shape[0] if pixel_array.ndim == 3 else 1

    if combine_segments:
        # Create combined label image
        combined_array = _extract_combined_segments(
            seg, pixel_array, target_segments, n_frames
        )

        # Reorder axes from (Z, Y, X) or (frames, rows, cols) to (X, Y, Z)
        combined_array = np.transpose(combined_array, (2, 1, 0))

        result = Image(
            array=combined_array,
            spacing=spacing,
            origin=origin,
            direction=direction,
            modality="SEG",
        )

        # Align to reference if provided
        if reference_image is not None:
            result = _align_to_reference(result, reference_image)

        return result
    else:
        # Return dict of individual segment masks
        result_dict: dict[int, Image] = {}

        for seg_num in target_segments:
            mask_array = _extract_single_segment(seg, pixel_array, seg_num, n_frames)

            # Reorder axes from (Z, Y, X) to (X, Y, Z)
            mask_array = np.transpose(mask_array, (2, 1, 0))

            mask_image = Image(
                array=mask_array.astype(np.uint8),
                spacing=spacing,
                origin=origin,
                direction=direction,
                modality="SEG",
            )

            # Align to reference if provided
            if reference_image is not None:
                mask_image = _align_to_reference(mask_image, reference_image)

            result_dict[seg_num] = mask_image

        return result_dict

get_segment_info(path)

Get information about segments in a DICOM SEG file.

Parameters:

Name Type Description Default
path str | Path

Path to the DICOM SEG file.

required

Returns:

Type Description
list[dict[str, str | int]]

List of dicts with segment information:

list[dict[str, str | int]]
  • segment_number: int
list[dict[str, str | int]]
  • segment_label: str
list[dict[str, str | int]]
  • segment_description: str (if available)
list[dict[str, str | int]]
  • algorithm_type: str (if available)

Raises:

Type Description
ValueError

If the file is not a valid DICOM SEG object.

Source code in pictologics/loaders/seg_loader.py
def get_segment_info(path: str | Path) -> list[dict[str, str | int]]:
    """Get information about segments in a DICOM SEG file.

    Args:
        path: Path to the DICOM SEG file.

    Returns:
        List of dicts with segment information:
        - segment_number: int
        - segment_label: str
        - segment_description: str (if available)
        - algorithm_type: str (if available)

    Raises:
        ValueError: If the file is not a valid DICOM SEG object.
    """
    import highdicom as hd

    path_obj = Path(path)
    if not path_obj.exists():
        raise FileNotFoundError(f"SEG file not found: {path}")

    try:
        seg = hd.seg.segread(str(path_obj))
    except Exception as e:
        raise ValueError(f"Failed to load DICOM SEG file: {e}") from e

    if not hasattr(seg, "SegmentSequence"):
        raise ValueError(f"File is not a valid DICOM SEG object: {path}")

    segments = []
    for segment in seg.SegmentSequence:
        info: dict[str, str | int] = {
            "segment_number": segment.SegmentNumber,
            "segment_label": getattr(segment, "SegmentLabel", ""),
        }

        if hasattr(segment, "SegmentDescription"):
            info["segment_description"] = segment.SegmentDescription

        if hasattr(segment, "SegmentAlgorithmType"):
            info["algorithm_type"] = segment.SegmentAlgorithmType

        segments.append(info)

    return segments