
Transform

The charucal.transform module provides the classes responsible for transforming live camera frames into a top-down view.


Overview

CameraTransformer is the main class you interact with at runtime. It loads a saved calibration, precomputes remap tables once during initialization, and then applies them on every call to transform().

import cv2
from charucal import CalibrationResult, CameraTransformer, RenderDistance

calibration = CalibrationResult.load("calibration.json")

transformer = CameraTransformer(
    calibration,
    render_distance=RenderDistance(front=10.0, lateral=5.0),
    output_shape=(1000, 1000),
    interpolation=cv2.INTER_LINEAR,  # or cv2.INTER_NEAREST for maximum speed
)

# frames: one BGR image per calibrated camera, in the same order as the calibration
topdown = transformer.transform(frames)  # list[np.ndarray] -> np.ndarray (H x W x 3 BGR)

Controlling the render area

RenderDistance defines the physical extent of the scene shown in the output image.

  • front — meters from the camera position to the top edge of the output
  • lateral — meters from the center line to the left and right edges of the output

The pixels-per-meter scale is chosen automatically to fill output_shape while preserving the aspect ratio implied by front and lateral.
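As a rough illustration of how such a scale could be derived, the sketch below assumes the rendered scene is 2 x lateral meters wide and front meters deep, and picks the largest uniform scale that fits both dimensions. The name fit_scale and the min-based fitting rule are assumptions made for illustration; the library's actual computation is not shown on this page and may differ.

```python
def fit_scale(front: float, lateral: float, output_shape: tuple[int, int]) -> float:
    """Hypothetical helper: largest uniform pixels-per-meter that fits the scene."""
    out_w, out_h = output_shape  # (width, height), matching the documented order
    scene_w = 2.0 * lateral      # assumed: the scene spans `lateral` meters on each side
    scene_h = front              # assumed: the scene spans `front` meters forward
    return min(out_w / scene_w, out_h / scene_h)


scale = fit_scale(front=10.0, lateral=5.0, output_shape=(1000, 1000))
print(scale)  # 100.0 pixels per meter under these assumptions
```

For a real transformer, read the chosen scale from the pixels_per_meter property rather than recomputing it.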


Measuring distances in the output

CameraTransformer provides helpers to convert between output pixel coordinates and meters:

# Forward distance to any output pixel
dist_m = transformer.get_distance_forward(y=300)

# Euclidean distance to any output pixel
dist_m = transformer.get_distance_to_point(x=500, y=200)

# Map a point from a source camera into the output image
out_x, out_y = transformer.transform_point(x=640, y=360, camera_idx=0)

# Distance to a point in a source camera image
dist_m = transformer.get_distance_to_camera_point(x=640, y=360, camera_idx=0)
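The first two helpers reduce to simple arithmetic on the output size and the scale, as the source listings in the Reference section show. The stand-alone sketch below reproduces that arithmetic without the library. The 1000 x 1000 output and the value of 100 pixels per meter are assumed for illustration and need not match a real transformer.

```python
import math

# Assumed values for illustration only; a real transformer derives these itself.
OUT_W, OUT_H = 1000, 1000
PIXELS_PER_METER = 100.0


def distance_forward(y: int) -> float:
    """Meters from the bottom edge of the output to scanline y."""
    return (OUT_H - y) / PIXELS_PER_METER


def distance_to_point(x: int, y: int) -> float:
    """Meters from the bottom-center of the output to pixel (x, y)."""
    return math.hypot(x - OUT_W / 2.0, y - float(OUT_H)) / PIXELS_PER_METER


print(distance_forward(300))        # 7.0
print(distance_to_point(500, 200))  # 8.0
```

Both distances are measured from the bottom of the output image, so rows closer to the top of the image are farther away.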

Parallelism

By default (n_jobs=-1), all CPU threads are used to remap cameras in parallel. Set n_jobs=1 to disable threading. This can reduce latency when the number of cameras is small and the output resolution is large, since the overhead of threading can outweigh the benefits in that case.
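The switch between threaded and serial execution can be pictured with the standard library alone. In the sketch below, build_executor and warp_camera are hypothetical stand-ins: the library's private _build_executor is not shown on this page, so its exact handling of n_jobs is an assumption.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


def build_executor(n_jobs: int) -> Optional[ThreadPoolExecutor]:
    """Hypothetical stand-in: returning None means 'run serially'."""
    if n_jobs == 1:
        return None
    if n_jobs == -1:
        n_jobs = os.cpu_count() or 1
    if n_jobs < 1:
        raise ValueError(f"n_jobs must be -1 or a positive integer; got {n_jobs}.")
    return ThreadPoolExecutor(max_workers=n_jobs)


def warp_camera(camera_idx: int) -> int:
    """Placeholder for per-camera work such as a remap."""
    return camera_idx * camera_idx


def run(n_jobs: int, num_cameras: int) -> list[int]:
    executor = build_executor(n_jobs)
    if executor is None:
        return list(map(warp_camera, range(num_cameras)))
    with executor:
        # Executor.map yields results in input order, like the built-in map.
        return list(executor.map(warp_camera, range(num_cameras)))


print(run(n_jobs=1, num_cameras=4))   # [0, 1, 4, 9]
print(run(n_jobs=-1, num_cameras=4))  # [0, 1, 4, 9]
```

Because both paths return results in camera order, the choice of n_jobs affects only timing. Measuring both settings on the target hardware is the most reliable way to pick one.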


Reference

RenderDistance dataclass

RenderDistance(front: float, lateral: float)

The physical extent of the scene captured in the top-down image.

Methods:

  • __post_init__: Initialize the render distance dataclass.

Attributes:

  • front (float): The distance in meters from the nearest visible road point to the far edge.
  • lateral (float): The distance in meters from the center of the output image to the left and right edges.

front instance-attribute

front: float

The distance in meters from the nearest visible road point to the far edge.

lateral instance-attribute

lateral: float

The distance in meters from the center of the output image to the left and right edges.

__post_init__

__post_init__() -> None

Validate the render distance after initialization. Raises ValueError if front or lateral is not positive.

Source code in src/charucal/transform.py
def __post_init__(self) -> None:
    """Initialize the render distance dataclass."""
    if self.front <= 0.0:
        raise ValueError(f"'front' must be positive; got {self.front}.")

    if self.lateral <= 0.0:
        raise ValueError(f"'lateral' must be positive; got {self.lateral}.")
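Because the check runs in __post_init__, an invalid extent fails at construction time and not later during rendering. The sketch below uses a stand-in dataclass that copies the validation from the source above, so it runs without charucal installed; it is not the library class itself.

```python
from dataclasses import dataclass


@dataclass
class RenderDistance:
    """Stand-in that mirrors the validation shown in the source above."""

    front: float
    lateral: float

    def __post_init__(self) -> None:
        if self.front <= 0.0:
            raise ValueError(f"'front' must be positive; got {self.front}.")
        if self.lateral <= 0.0:
            raise ValueError(f"'lateral' must be positive; got {self.lateral}.")


try:
    RenderDistance(front=0.0, lateral=5.0)
except ValueError as exc:
    message = str(exc)
    print(message)  # 'front' must be positive; got 0.0.
```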

CameraTransformer

CameraTransformer(
    calibration: CalibrationResult,
    *,
    render_distance: RenderDistance,
    output_shape: tuple[int, int],
    input_shapes: Sequence[tuple[int, int]] | None = None,
    interpolation: int = cv2.INTER_LINEAR,
    n_jobs: int = -1,
)

Transforms multi-camera images into a single top-down view.

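Initialize the transformer and precompute all remap operations. Raises ValueError if the number of input shapes does not match the number of cameras in the calibration.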

Parameters:

  • calibration (CalibrationResult, required): The result of a prior multi-camera calibration.
  • render_distance (RenderDistance, required): The physical scene extents to render (in meters).
  • output_shape (tuple[int, int], required): The (width, height) of the output canvas in pixels.
  • input_shapes (Sequence[tuple[int, int]] | None, default None): The (width, height) of the input images. If None, the source shapes from the calibration are used.
  • interpolation (int, default INTER_LINEAR): The OpenCV interpolation flag passed to cv2.remap. Use cv2.INTER_LINEAR for better quality at the cost of speed, or cv2.INTER_NEAREST for maximum throughput at reduced quality. Defaults to cv2.INTER_LINEAR.
  • n_jobs (int, default -1): The number of worker threads. -1 uses all available CPUs, 1 disables parallelism.

Methods:

  • get_distance: Convert a pixel distance in the output image to meters.
  • get_pixels: Convert a physical distance in meters to output pixels.
  • get_distance_forward: Return the forward distance in meters to a horizontal scanline.
  • get_distance_to_point: Return the Euclidean distance in meters from the camera to an output pixel.
  • get_distance_forward_to_camera_point: Return the forward (depth) distance in meters to a point in a camera image.
  • get_distance_to_camera_point: Return the Euclidean distance in meters from the camera to a point in a camera image.
  • transform: Transform a list of camera images into a single top-down view.
  • transform_point: Transform a point from a camera image into the top-down output image.

Attributes:

  • output_shape (tuple[int, int]): The (width, height) of the output image in pixels.
  • pixels_per_meter (float): The number of output pixels corresponding to one meter in the scene.

Source code in src/charucal/transform.py
def __init__(
    self,
    calibration: CalibrationResult,
    *,
    render_distance: RenderDistance,
    output_shape: tuple[int, int],
    input_shapes: Sequence[tuple[int, int]] | None = None,
    interpolation: int = cv2.INTER_LINEAR,
    n_jobs: int = -1,
) -> None:
    """Initialize the transformer and precompute all remap operations.

    :param calibration: The result of a prior multi-camera calibration.
    :param render_distance: The physical scene extents to render (in meters).
    :param output_shape: The ``(width, height)`` of the output canvas in pixels.
    :param input_shapes: The ``(width, height)`` of the input images. If ``None``, the source shapes from
        the calibration are used.
    :param interpolation: The OpenCV interpolation flag passed to ``cv2.remap``. Use ``cv2.INTER_LINEAR``
        for better quality at the cost of speed, or ``cv2.INTER_NEAREST`` for maximum throughput at
        reduced quality. Defaults to ``cv2.INTER_LINEAR``.
    :param n_jobs: The number of worker threads. ``-1`` uses all available CPUs, ``1`` disables parallelism.
    """
    self._calibration = calibration
    self._executor = _build_executor(n_jobs)
    self._interpolation = interpolation

    self._input_shapes = list(input_shapes) if input_shapes is not None else list(calibration.source_shapes)
    if len(self._input_shapes) != calibration.num_cameras:
        raise ValueError(f"Expected {calibration.num_cameras} input shapes, got {len(self._input_shapes)}.")

    ref_to_output, self._pixels_per_meter, canvas_shape = self._build_ref_homography(
        calibration, render_distance, output_shape
    )
    self._warp_layers, self._camera_to_output, self._output_shape, self._canvas = self._build_warp_layers(
        calibration, ref_to_output, canvas_shape
    )

output_shape property

output_shape: tuple[int, int]

The (width, height) of the output image in pixels.

pixels_per_meter property

pixels_per_meter: float

The number of output pixels corresponding to one meter in the scene.

get_distance

get_distance(pixels: float) -> float

Convert a pixel distance in the output image to meters.

Parameters:

  • pixels (float, required): The distance in output pixels.

Returns:

  • float: The equivalent distance in meters.

Source code in src/charucal/transform.py
def get_distance(self, pixels: float) -> float:
    """Convert a pixel distance in the output image to meters.

    :param pixels: The distance in output pixels.
    :return: The equivalent distance in meters.
    """
    return pixels / self._pixels_per_meter

get_pixels

get_pixels(meters: float) -> float

Convert a physical distance in meters to output pixels.

Parameters:

  • meters (float, required): The distance in meters.

Returns:

  • float: The equivalent distance in output pixels.

Source code in src/charucal/transform.py
def get_pixels(self, meters: float) -> float:
    """Convert a physical distance in meters to output pixels.

    :param meters: The distance in meters.
    :return: The equivalent distance in output pixels.
    """
    return meters * self._pixels_per_meter
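One use of the meters-to-pixels conversion is finding the output row that lies a given distance ahead, for example to draw range markers. The sketch below inverts the arithmetic of get_distance_forward shown in its source listing. The helper name row_for_forward_distance, the 1000-pixel output height, and the scale of 100 pixels per meter are assumptions for illustration.

```python
def row_for_forward_distance(meters: float, out_h: int, pixels_per_meter: float) -> int:
    """Hypothetical helper: output row that lies `meters` ahead of the bottom edge."""
    return int(round(out_h - meters * pixels_per_meter))


# Assumed: a 1000-pixel-high output at 100 pixels per meter.
rows = [
    row_for_forward_distance(m, out_h=1000, pixels_per_meter=100.0)
    for m in (2.5, 5.0, 7.0)
]
print(rows)  # [750, 500, 300]
```

With a real transformer, the same row would be transformer.output_shape[1] - transformer.get_pixels(meters), rounded to an integer.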

get_distance_forward

get_distance_forward(y: int) -> float

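Return the forward distance in meters to a horizontal scanline, measured from the bottom edge of the output image. A y equal to the output height corresponds to 0 m.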

Parameters:

  • y (int, required): The y-coordinate of the scanline in the output image.

Returns:

  • float: The distance in meters.

Source code in src/charucal/transform.py
def get_distance_forward(self, y: int) -> float:
    """Return the forward distance in meters to a horizontal scanline.

    :param y: The y-coordinate of the scanline in the output image.
    :return: The distance in meters.
    """
    _, out_h = self._output_shape
    return self.get_distance(float(out_h - y))

get_distance_to_point

get_distance_to_point(x: int, y: int) -> float

Return the Euclidean distance in meters from the camera to an output pixel. The camera is taken to sit at the bottom-center of the output image.

Parameters:

  • x (int, required): The x-coordinate of the output pixel.
  • y (int, required): The y-coordinate of the output pixel.

Returns:

  • float: The distance in meters.

Source code in src/charucal/transform.py
def get_distance_to_point(self, x: int, y: int) -> float:
    """Return the Euclidean distance in meters from the camera to an output pixel.

    :param x: The x-coordinate of the output pixel.
    :param y: The y-coordinate of the output pixel.
    :return: The distance in meters.
    """
    out_w, out_h = self._output_shape
    return self.get_distance(math.hypot(x - out_w / 2.0, y - float(out_h)))

get_distance_forward_to_camera_point

get_distance_forward_to_camera_point(
    x: float, y: float, camera_idx: int
) -> float

Return the forward (depth) distance in meters to a point in a camera image.

Only the projected y-coordinate in the output image is used, so lateral offset is ignored.

Parameters:

  • x (float, required): The x-coordinate in the source camera image.
  • y (float, required): The y-coordinate in the source camera image.
  • camera_idx (int, required): The index of the source camera in the calibration.

Returns:

  • float: The forward distance in meters.

Source code in src/charucal/transform.py
def get_distance_forward_to_camera_point(self, x: float, y: float, camera_idx: int) -> float:
    """Return the forward (depth) distance in meters to a point in a camera image.

    Only the projected y-coordinate in the output image is used, so lateral offset is ignored.

    :param x: The x-coordinate in the source camera image.
    :param y: The y-coordinate in the source camera image.
    :param camera_idx: The index of the source camera in the calibration.
    :return: The forward distance in meters.
    """
    _, out_y = self.transform_point(x, y, camera_idx)
    return self.get_distance_forward(out_y)

get_distance_to_camera_point

get_distance_to_camera_point(
    x: float, y: float, camera_idx: int
) -> float

Return the Euclidean distance in meters from the camera to a point in a camera image.

Parameters:

  • x (float, required): The x-coordinate in the source camera image.
  • y (float, required): The y-coordinate in the source camera image.
  • camera_idx (int, required): The index of the source camera in the calibration.

Returns:

  • float: The Euclidean distance in meters.

Source code in src/charucal/transform.py
def get_distance_to_camera_point(self, x: float, y: float, camera_idx: int) -> float:
    """Return the Euclidean distance in meters from the camera to a point in a camera image.

    :param x: The x-coordinate in the source camera image.
    :param y: The y-coordinate in the source camera image.
    :param camera_idx: The index of the source camera in the calibration.
    :return: The Euclidean distance in meters.
    """
    out_x, out_y = self.transform_point(x, y, camera_idx)
    return self.get_distance_to_point(out_x, out_y)

transform

transform(
    images: list[NDArray[uint8]],
) -> npt.NDArray[np.uint8]

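Transform a list of camera images into a single top-down view. Raises ValueError if the number of images does not match the number of cameras in the calibration. The returned array is the transformer's internal canvas, which is cleared and rewritten on the next call, so copy it if you need to keep a frame.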

Parameters:

  • images (list[NDArray[uint8]], required): The camera images ordered to match the calibration.

Returns:

  • NDArray[uint8]: The composited top-down image.

Source code in src/charucal/transform.py
def transform(self, images: list[npt.NDArray[np.uint8]]) -> npt.NDArray[np.uint8]:
    """Transform a list of camera images into a single top-down view.

    :param images: The camera images ordered to match the calibration.
    :return: The composited top-down image.
    """
    num_cameras = self._calibration.num_cameras
    if len(images) != num_cameras:
        raise ValueError(f"Expected {num_cameras} images, got {len(images)}.")

    self._canvas.fill(0)

    def _warp_camera(camera_idx: int) -> tuple[int, bool]:
        layer = self._warp_layers[camera_idx]
        if layer is None:
            return camera_idx, False

        _apply_remap(images[camera_idx], layer, self._interpolation)
        return camera_idx, True

    warp_results = (
        self._executor.map(_warp_camera, range(num_cameras))
        if self._executor is not None
        else map(_warp_camera, range(num_cameras))
    )

    for camera_idx, warped in warp_results:
        if not warped:
            continue

        layer = self._warp_layers[camera_idx]
        cv2.copyTo(layer.dst, layer.mask, self._canvas[layer.row_slice, layer.col_slice])

    return self._canvas

transform_point

transform_point(
    x: float, y: float, camera_idx: int
) -> tuple[int, int]

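Transform a point from a camera image into the top-down output image. The result is rounded to the nearest integer pixel. Raises RuntimeError if the point projects onto the output horizon.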

Parameters:

  • x (float, required): The x-coordinate in the source camera image.
  • y (float, required): The y-coordinate in the source camera image.
  • camera_idx (int, required): The index of the source camera in the calibration.

Returns:

  • tuple[int, int]: The (x, y) pixel coordinates in the top-down output image.

Source code in src/charucal/transform.py
def transform_point(self, x: float, y: float, camera_idx: int) -> tuple[int, int]:
    """Transform a point from a camera image into the top-down output image.

    :param x: The x-coordinate in the source camera image.
    :param y: The y-coordinate in the source camera image.
    :param camera_idx: The index of the source camera in the calibration.
    :return: The ``(x, y)`` pixel coordinates in the top-down output image.
    """
    homography = self._camera_to_output[camera_idx]

    source_point = np.array([x, y, 1.0], dtype=np.float64)
    output_point = homography @ source_point
    if abs(output_point[2]) < _EPS:
        raise RuntimeError(f"Camera {camera_idx} point ({x}, {y}) projects onto the output horizon.")

    out_x = output_point[0] / output_point[2]
    out_y = output_point[1] / output_point[2]

    return int(round(out_x)), int(round(out_y))
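The projection above is a standard homography application: multiply the homogeneous point by a 3x3 matrix, then divide by the third component. The sketch below repeats those steps in plain Python so it runs without NumPy. The matrix H is made up for illustration, and the tolerance EPS is an assumed value, since the library's _EPS is not shown on this page.

```python
from typing import Sequence

EPS = 1e-9  # Assumed tolerance; the library's _EPS value is not shown here.


def apply_homography(h: Sequence[Sequence[float]], x: float, y: float) -> tuple[int, int]:
    """Project (x, y) through a 3x3 homography and round to integer pixels."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < EPS:
        # A zero third component means the point maps to infinity.
        raise RuntimeError(f"Point ({x}, {y}) projects onto the horizon.")
    return int(round(u / w)), int(round(v / w))


# Made-up matrix: scale by 2, shift by (10, 20), with a perspective term in y.
H = [
    [2.0, 0.0, 10.0],
    [0.0, 2.0, 20.0],
    [0.0, -0.25, 1.0],
]

print(apply_homography(H, 3.0, 2.0))  # (32, 48)
```

With this matrix, any point with y = 4.0 makes the third component zero and raises RuntimeError, which is the same failure mode transform_point reports for points on the horizon.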