
Training

Now that you are familiar with operators and functions, let's learn an operator! In the following, we will learn the basics of training a neural operator in continuiti.

Operator

Given two sets of functions \(U\) and \(V\), assume we want to learn an operator

\[\begin{align*} G: U &\to V, \\ u &\mapsto v, \end{align*}\]

that maps functions \(u \in U\) to functions \(v \in V\).

In this example, we choose to learn the operator that maps the set of functions

\[ U = \{ u_a(x) = \sin(a \pi x) \mid a \in [1, 2] \} \]

to the set of functions

\[ V = \{ v_a(y) = a \pi \cos(a \pi y) \mid a \in [1, 2] \}, \]

such that \(G(u_a) = v_a\). Since \(v_a = u_a'\), this \(G\) is just the differentiation operator \(u \mapsto \frac{du}{dx}\). Let's consider \(x, y \in [0, 1]\) and start with the visualization of some functions in \(U\) and \(V\).

import torch
from continuiti.discrete import RegularGridSampler
from continuiti.data.function import FunctionSet

# Parametrized sets of input and output functions, indexed by the parameter a
U = FunctionSet(lambda a: lambda x: torch.sin(a * torch.pi * x))
V = FunctionSet(lambda a: lambda y: a * torch.pi * torch.cos(a * torch.pi * y))

a = torch.Tensor([[1., 1.5, 2.]])

u_a = U(a)
v_a = V(a)

print(f"len(u) = {len(u_a)}  ", f"len(v) = {len(v_a)}")
len(u) = 3   len(v) = 3

(Figure: sample functions \(u_a \in U\) and \(v_a \in V\) for \(a = 1, 1.5, 2\).)

Note

In these examples, we hide the code for visualization, but you can find it in the source code of this notebook.

Discretization

Operator learning is about learning mappings between infinite-dimensional spaces. To work with infinite-dimensional objects numerically, we have to discretize the input and output functions. In continuiti, this is done by point-wise evaluation.

Discretized functions can be collected in an OperatorDataset for operator learning. The OperatorDataset is a container of discretized input-output functions: it contains tuples (x, u, y, v) of tensors, where every sample consists of

  • the sensor positions x,
  • the values u of the input function at the sensor positions,
  • the evaluation points y, and
  • the values v of the output function at the evaluation points (a minimal construction sketch follows this list).
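
For illustration, here is a minimal sketch of how such a dataset could be assembled by hand for a single observation. It assumes the OperatorDataset class from continuiti.data and the (num_observations, dim, num_points) tensor layout reported by dataset.shapes further below.

import torch
from continuiti.data import OperatorDataset

# Point-wise discretization of u(x) = sin(pi x) and v(y) = pi cos(pi y).
# The (num_observations, dim, num_points) layout is an assumption based on
# the shapes printed further below.
x = torch.linspace(0, 1, 32).reshape(1, 1, -1)  # sensor positions
u = torch.sin(torch.pi * x)                     # input function at the sensors
y = x.clone()                                   # evaluation points
v = torch.pi * torch.cos(torch.pi * y)          # output function at the evaluation points

manual_dataset = OperatorDataset(x, u, y, v)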

If we already have a FunctionSet, we can use the FunctionOperatorDataset to draw elements \(u \in U\) and \(v \in V\) and evaluate them at sampled positions.

from continuiti.data.function import FunctionOperatorDataset

a_sampler = RegularGridSampler([1.], [2.])  # samples the parameter a from [1, 2]
x_sampler = RegularGridSampler([0.], [1.])  # samples positions from [0, 1]

n_sensors = 32
n_observations = 128

dataset = FunctionOperatorDataset(
    U, x_sampler, n_sensors,
    V, x_sampler, n_sensors,
    a_sampler, n_observations,
)
print(dataset.shapes)
OperatorShapes(x=TensorShape(dim=1, size=torch.Size([32])), u=TensorShape(dim=1, size=torch.Size([32])), y=TensorShape(dim=1, size=torch.Size([32])), v=TensorShape(dim=1, size=torch.Size([32])))

Split the dataset into training, validation, and test sets.

from continuiti.data.utility import split

train_dataset, test_val_dataset = split(dataset, 0.75)
val_dataset, test_dataset = split(test_val_dataset, 0.5)
print(f"len(train_dataset) = {len(train_dataset)}  len(val_dataset) = {len(val_dataset)}  len(test_dataset) = {len(test_dataset)}")
len(train_dataset) = 96  len(val_dataset) = 16  len(test_dataset) = 16

Neural Operator

In order to learn the operator \(G\) with a neural network, we can train a neural operator.

A neural operator \(G_\theta\) takes an input function \(u\), evaluated at sensor positions \(x\), and maps it to a function \(v\) evaluated at (possibly different) evaluation points \(y\), such that

\[ v(y) = G(u)(y) \approx G_\theta\left(x, u(x), y\right). \]

In this example, we train a DeepONet, a common neural operator architecture motivated by the universal approximation theorem for operators.
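
To build intuition, here is a conceptual sketch of the DeepONet idea in plain PyTorch (illustrative only, not continuiti's implementation): a branch network encodes the sensor values \(u(x)\), a trunk network encodes an evaluation point \(y\), and their dot product approximates \(G_\theta(x, u(x), y)\). All layer sizes are arbitrary choices.

import torch

# Branch net: encodes the input function sampled at 32 sensor positions.
branch = torch.nn.Sequential(
    torch.nn.Linear(32, 64), torch.nn.Tanh(), torch.nn.Linear(64, 16),
)
# Trunk net: encodes a single (1D) evaluation point y.
trunk = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 16),
)

u_at_sensors = torch.rand(1, 32)  # u(x) at the sensors
y_point = torch.rand(1, 1)        # one evaluation point
v_pred = (branch(u_at_sensors) * trunk(y_point)).sum(dim=-1)  # dot product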

from continuiti.operators import DeepONet

# DeepONet with a trunk network of depth 8; tensor shapes come from the dataset
operator = DeepONet(shapes=dataset.shapes, trunk_depth=8)
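
As a quick sanity check, we can count the trainable parameters with the standard PyTorch idiom (this assumes the operator is a torch.nn.Module; the training log below reports the same number):

n_params = sum(p.numel() for p in operator.parameters())
print(f"Parameters: {n_params}")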

Training

continuiti provides the Trainer class which implements a default training loop for neural operators. It is instantiated with an Operator, an optimizer (Adam(lr=1e-3) by default), and a loss function (MSELoss by default).
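
If you want different hyperparameters, these defaults can presumably be overridden at construction. The following is a hedged sketch; the keyword name optimizer is an assumption based on the defaults described above, so check the Trainer signature of your continuiti version.

import torch
from continuiti.trainer import Trainer

# Hedged sketch: the `optimizer` keyword is an assumption, not confirmed API.
custom_trainer = Trainer(
    operator,
    optimizer=torch.optim.Adam(operator.parameters(), lr=1e-4),
)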

The fit method takes an OperatorDataset and trains the neural operator up to a given tolerance on the training data (but at most for a given number of epochs, 1000 by default).

from continuiti.trainer import Trainer
from continuiti.trainer.callbacks import LearningCurve

trainer = Trainer(operator)
trainer.fit(
    train_dataset,
    tol=1e-3,
    callbacks=[LearningCurve()],
    test_dataset=val_dataset,
)
Parameters: 11152  Device: mps
Epoch 610/1000  Step 3/3  [====================]  8ms/step  ETA 0:10min - loss/train = 8.5058e-04  loss/test = 9.7409e-04 - stopping criterion met

(Figure: learning curve of the training run.)

Evaluation

The trained operator can be evaluated at arbitrary positions, so let's plot the prediction of \(G_\theta\) at a fine resolution along with the target function.

x, u, y, v = val_dataset[0:1]  # take a single observation from the validation set

y_plot = torch.linspace(0, 1, 100).reshape(1, 1, -1)  # fine evaluation grid
v_pred = operator(x, u, y_plot)
(Figure: prediction of \(G_\theta\) along with the target function \(v\).)

Let us also evaluate the loss on the training, validation, and test sets.

from continuiti.data import dataset_loss

loss_train = dataset_loss(train_dataset, operator)
loss_val = dataset_loss(val_dataset, operator)
loss_test = dataset_loss(test_dataset, operator)

print(f"loss/train = {loss_train:.4e}")
print(f"loss/val   = {loss_val:.4e}")
print(f"loss/test  = {loss_test:.4e}")
loss/train = 8.0337e-04
loss/val   = 9.7409e-04
loss/test  = 6.4642e-04

As you can observe, the neural operator is able to learn the operator \(G\) and generalizes well to unseen data. That's it for the basics!


Last update: 2024-08-20
Created: 2024-08-20