
Dataset

continuiti.data.dataset

Data sets in continuiti. Every data set is a list of (x, u, y, v) tuples.

OperatorDatasetBase

Bases: Dataset, ABC

Abstract base class of a dataset for operator training.
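
The contract this base class imposes can be illustrated with a small stand-in that needs neither torch nor continuiti. This is only a sketch: `TupleDatasetBase` and `SquareDataset` are hypothetical names, and plain Python lists take the place of tensors. The real class additionally inherits from `Dataset`.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple

# One sample is an (x, u, y, v) tuple; lists stand in for tensors here.
Sample = Tuple[List[float], List[float], List[float], List[float]]


class TupleDatasetBase(ABC):
    """Hypothetical stand-in for OperatorDatasetBase (no torch dependency)."""

    @abstractmethod
    def __len__(self) -> int:
        """Return the number of samples."""

    @abstractmethod
    def __getitem__(self, idx: int) -> Sample:
        """Return the (x, u, y, v) tuple at index idx."""


class SquareDataset(TupleDatasetBase):
    """Toy dataset: input u(x) = a * x, output v(y) = (a * y) ** 2."""

    def __init__(self, amplitudes: List[float], points: List[float]):
        self.amplitudes = amplitudes
        self.points = points

    def __len__(self) -> int:
        return len(self.amplitudes)

    def __getitem__(self, idx: int) -> Sample:
        a = self.amplitudes[idx]
        x = list(self.points)          # sensor positions
        u = [a * p for p in x]         # input function at the sensors
        y = list(self.points)          # evaluation positions
        v = [(a * p) ** 2 for p in y]  # operator output at the evaluation positions
        return x, u, y, v


dataset = SquareDataset(amplitudes=[1.0, 2.0], points=[0.0, 0.5, 1.0])
x, u, y, v = dataset[1]
```

Because both methods are abstract, instantiating the base class directly raises a `TypeError`; only subclasses that implement `__len__` and `__getitem__` can be created.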

__len__() abstractmethod

Return the number of samples.

RETURNS DESCRIPTION
int

number of samples in the entire set.

Source code in src/continuiti/data/dataset.py
@abstractmethod
def __len__(self) -> int:
    """Return the number of samples.

    Returns:
        number of samples in the entire set.
    """

__getitem__(idx) abstractmethod

Retrieves the input-output pair at the specified index and applies transformations.

PARAMETER DESCRIPTION
idx

The index of the sample to retrieve.

RETURNS DESCRIPTION
Tuple[Tensor, Tensor, Tensor, Tensor]

A tuple containing the three input tensors and the output tensor for the given index.

Source code in src/continuiti/data/dataset.py
@abstractmethod
def __getitem__(
    self, idx
) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]:
    """Retrieves the input-output pair at the specified index and applies transformations.

    Parameters:
        idx: The index of the sample to retrieve.

    Returns:
        A tuple containing the three input tensors and the output tensor for the given index.
    """

OperatorDataset(x, u, y, v, x_transform=None, u_transform=None, y_transform=None, v_transform=None)

Bases: OperatorDatasetBase

A dataset for operator training.

In operator training, at least one function is mapped onto a second one. To achieve discretization invariance and domain independence, and to learn operators with physics-based losses, access to at least four discretized quantities is necessary: the positions at which the input function is sampled (x), the input function evaluated at these positions (u), the positions that discretize the output domain (y), and the output of the operator evaluated at these positions (v). Not all loss functions and/or operators need access to all of these attributes.

PARAMETER DESCRIPTION
x

Tensor of shape (num_observations, x_dim, num_sensors...) with sensor positions.

TYPE: Tensor

u

Tensor of shape (num_observations, u_dim, num_sensors...) with evaluations of the input functions at sensor positions.

TYPE: Tensor

y

Tensor of shape (num_observations, y_dim, num_evaluations...) with evaluation positions.

TYPE: Tensor

v

Tensor of shape (num_observations, v_dim, num_evaluations...) with ground truth operator mappings.

TYPE: Tensor

ATTRIBUTE DESCRIPTION
shapes

Shape of all tensors.

transform

Transformations for each tensor.

Source code in src/continuiti/data/dataset.py
def __init__(
    self,
    x: torch.Tensor,
    u: torch.Tensor,
    y: torch.Tensor,
    v: torch.Tensor,
    x_transform: Optional[Transform] = None,
    u_transform: Optional[Transform] = None,
    y_transform: Optional[Transform] = None,
    v_transform: Optional[Transform] = None,
):
    assert all([t.ndim >= 3 for t in [x, u, y, v]]), "Wrong number of dimensions."
    assert (
        x.size(0) == u.size(0) == y.size(0) == v.size(0)
    ), "Inconsistent number of observations."

    # get dimensions and sizes
    x_dim, x_size = x.size(1), x.size()[2:]
    u_dim, u_size = u.size(1), u.size()[2:]
    y_dim, y_size = y.size(1), y.size()[2:]
    v_dim, v_size = v.size(1), v.size()[2:]

    assert x_size == u_size, "Inconsistent number of sensors."
    assert y_size == v_size, "Inconsistent number of evaluations."

    super().__init__()

    self.x = x
    self.u = u
    self.y = y
    self.v = v

    # used to initialize architectures
    self.shapes = OperatorShapes(
        x=TensorShape(dim=x_dim, size=x_size),
        u=TensorShape(dim=u_dim, size=u_size),
        y=TensorShape(dim=y_dim, size=y_size),
        v=TensorShape(dim=v_dim, size=v_size),
    )

    self.transform = {
        dim: tf
        for dim, tf in [
            ("x", x_transform),
            ("u", u_transform),
            ("y", y_transform),
            ("v", v_transform),
        ]
        if tf is not None
    }
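
The shape requirements enforced by the assertions above can be tried out on plain shape tuples before building any tensors. The helper below is a hypothetical sketch, not part of continuiti; it mirrors the four assertions and reuses their messages.

```python
from typing import Tuple

Shape = Tuple[int, ...]


def check_operator_shapes(x: Shape, u: Shape, y: Shape, v: Shape) -> None:
    """Mirror the constructor's assertions on plain shape tuples (hypothetical helper)."""
    shapes = [x, u, y, v]
    # Each tensor needs (num_observations, dim, size...) and hence >= 3 dimensions.
    assert all(len(s) >= 3 for s in shapes), "Wrong number of dimensions."
    assert x[0] == u[0] == y[0] == v[0], "Inconsistent number of observations."
    # Everything after the first two entries is the sensor / evaluation layout.
    assert x[2:] == u[2:], "Inconsistent number of sensors."
    assert y[2:] == v[2:], "Inconsistent number of evaluations."


# 10 observations, 1D positions, 32 sensors, 64 evaluation points.
check_operator_shapes(x=(10, 1, 32), u=(10, 1, 32), y=(10, 1, 64), v=(10, 1, 64))

# A 2D sensor grid: x has two coordinate channels, u has one value channel.
check_operator_shapes(x=(4, 2, 8, 8), u=(4, 1, 8, 8), y=(4, 2, 16), v=(4, 1, 16))
```

Note that the channel dimensions (`x_dim`, `u_dim`, `y_dim`, `v_dim`) may differ freely; only the number of observations and the trailing sensor or evaluation sizes have to match.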

__len__()

Return the number of samples.

RETURNS DESCRIPTION
int

Number of samples in the entire set.

Source code in src/continuiti/data/dataset.py
def __len__(self) -> int:
    """Return the number of samples.

    Returns:
        Number of samples in the entire set.
    """
    return self.x.size(0)

__getitem__(idx)

Retrieves the input-output pair at the specified index and applies transformations.

PARAMETER DESCRIPTION
idx

The index of the sample to retrieve.

TYPE: int

RETURNS DESCRIPTION
Tuple[Tensor, Tensor, Tensor, Tensor]

A tuple containing the three input tensors and the output tensor for the given index.

Source code in src/continuiti/data/dataset.py
def __getitem__(
    self,
    idx: int,
) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]:
    """Retrieves the input-output pair at the specified index and applies transformations.

    Parameters:
        idx: The index of the sample to retrieve.

    Returns:
        A tuple containing the three input tensors and the output tensor for the given index.
    """
    return self._apply_transformations(
        self.x[idx], self.u[idx], self.y[idx], self.v[idx]
    )
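
The body of `_apply_transformations` is not shown on this page. The following sketch illustrates one plausible behavior, assuming each transform stored in `self.transform` is applied to the tensor with the matching key and all other tensors pass through unchanged. `build_transform_dict` and `apply_transformations` are hypothetical names, and lists stand in for tensors.

```python
from typing import Callable, Dict, List, Optional, Tuple

Values = List[float]
TransformFn = Callable[[Values], Values]


def build_transform_dict(
    x_transform: Optional[TransformFn] = None,
    u_transform: Optional[TransformFn] = None,
    y_transform: Optional[TransformFn] = None,
    v_transform: Optional[TransformFn] = None,
) -> Dict[str, TransformFn]:
    """Keep only the transforms that were provided, as the constructor does."""
    candidates = [
        ("x", x_transform),
        ("u", u_transform),
        ("y", y_transform),
        ("v", v_transform),
    ]
    return {dim: tf for dim, tf in candidates if tf is not None}


def apply_transformations(
    transform: Dict[str, TransformFn], x: Values, u: Values, y: Values, v: Values
) -> Tuple[Values, Values, Values, Values]:
    """Assumed behavior: apply each registered transform to its own tensor."""
    sample = {"x": x, "u": u, "y": y, "v": v}
    for dim, tf in transform.items():
        sample[dim] = tf(sample[dim])
    return sample["x"], sample["u"], sample["y"], sample["v"]


# Only u gets a transform (scaling by 2); x, y and v are returned unchanged.
transform = build_transform_dict(u_transform=lambda values: [2.0 * t for t in values])
x, u, y, v = apply_transformations(transform, [0.0, 1.0], [1.0, 3.0], [0.5], [4.0])
```

Storing only the non-`None` transforms keeps the lookup simple: a tensor without an entry in the dictionary is never touched.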

Last update: 2024-08-20
Created: 2024-08-20