Kernels
This package implements the different covariance kernels that one can use in volcapy.
Its main goal is to compute the covariance pushforward \(K F^t\), where \(K\) is the model covariance matrix and \(F\) is the forward operator.
IMPORTANT: Note that we always strip the variance parameter \(\sigma_0^2\) from the covariance matrix. Hence, when using the covariance pushforward computed here, one has to manually multiply by \(\sigma_0^2\) for expressions to make sense.
Each kernel should implement the three methods below:
def compute_cov_pushforward(lambda0, F, cells_coords, device,
                            n_chunks=200, n_flush=50):
    """ Compute the covariance pushforward.
    """

def compute_cov(lambda0, cells_coords, i, j):
    """ Compute the covariance between cells i and j of the model.
    """

def compute_full_cov(lambda0, cells_coords, device,
                     n_chunks=200, n_flush=50):
    """ Compute the full covariance matrix. Note that due to its
    :math:`n_m^2` size, this should only be attempted on small models.
    """
A detailed description of the arguments is available at the end of this section.
Handling Out-of-Memory Errors
Due to the size of the covariance matrix, care has to be taken when computing its product with the forward operator. Let \(n_m\) be the number of model cells. The covariance matrix then has \(n_m^2\) entries, so in single precision a model with 200000 cells already takes more than 160 GB of memory.
The strategy used here is to compute the matrix in chunks. We compute matrix products of the form \(K A\) by splitting the rows of \(K\) into n_chunks groups and computing the corresponding rows of the result one group at a time, so that only one chunk of the covariance matrix has to exist in memory at any given moment.
Each such chunk of the covariance matrix is built on the GPU, multiplied with the right-hand-side matrix, and the result is sent back to the CPU, where it is concatenated with the previously computed chunks; the freed GPU memory is then used to compute the next chunk.
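Schematically, if the rows of \(K\) are split into chunks \(K_{(1)}, \dots, K_{(c)}\) (with \(c\) the number of chunks, n_chunks), the pushforward is assembled block by block:

\[
K F^t = \begin{pmatrix} K_{(1)} \\ \vdots \\ K_{(c)} \end{pmatrix} F^t
      = \begin{pmatrix} K_{(1)} F^t \\ \vdots \\ K_{(c)} F^t \end{pmatrix},
\]

so that only one block \(K_{(i)}\), of roughly \(n_m / c\) rows, needs to be materialized at any time.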
We noticed that CUDA tends to keep data cached, so that after a certain number of chunks the GPU memory fills up. The cache thus has to be manually flushed every n_flush chunks.
Flushing takes a long time, so it should not be done too often: the value of n_flush should be set as high as memory allows, and the optimal value should be determined experimentally by the user. A minimal sketch of this chunk-and-flush loop is given below.
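The following is a minimal sketch of the strategy described above, not volcapy's actual implementation; the function name and the use of torch.cdist are illustrative, and the Matérn 3/2 formula is used only for concreteness.

import math
import torch

def chunked_cov_pushforward(lambda0, F, cells_coords, device,
                            n_chunks=200, n_flush=50):
    # Illustrative sketch only: compute K F^t chunk by chunk on `device`,
    # accumulating the result on the CPU.
    sqrt3 = math.sqrt(3.0)
    coords = cells_coords.to(device)
    F_t = F.to(device).t()                       # n_cells x n_data
    out_chunks = []
    for i, rows in enumerate(torch.chunk(coords, n_chunks, dim=0)):
        # Pairwise distances between this chunk of cells and all cells.
        d = torch.cdist(rows, coords)
        # (Stripped) Matern 3/2 covariance block: some rows of K.
        cov_block = (1 + sqrt3 * d / lambda0) * torch.exp(-sqrt3 * d / lambda0)
        # Multiply with F^t on the device, store the result on the CPU.
        out_chunks.append((cov_block @ F_t).cpu())
        # Periodically flush the CUDA cache to avoid OOM errors.
        if device.type == "cuda" and (i + 1) % n_flush == 0:
            torch.cuda.synchronize()
            torch.cuda.empty_cache()
    return torch.cat(out_chunks, dim=0)          # n_cells x n_data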
Matérn 3/2
The implementation of the Matérn 3/2 kernel is provided as example below.
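For reference, denoting by \(d = \lVert x_i - x_j \rVert\) the Euclidean distance between cells \(i\) and \(j\), the (stripped) Matérn 3/2 covariance in its standard parameterization with length-scale \(\lambda_0\) reads

\[
k(x_i, x_j) = \Big(1 + \frac{\sqrt{3}\, d}{\lambda_0}\Big) \exp\Big(-\frac{\sqrt{3}\, d}{\lambda_0}\Big).
\]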
volcapy.covariance.matern32.compute_cov(lambda0, cells_coords, i, j)
Compute the covariance between two points.
Note that, as always, \(\sigma_0^2\) has been stripped.
- Parameters
- lambda0: float
Length-scale parameter
- cells_coords: tensor
n_cells * n_dims: cells coordinates
- i: int
Index of first cell (index in the cells_coords array).
- j: int
Index of second cell.
- Returns
- Tensor
(Stripped) covariance between cell i and cell j.
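As a quick illustration, here is a hedged usage sketch (the synthetic coordinates, the length-scale value, and the import alias are illustrative assumptions):

import torch
import volcapy.covariance.matern32 as matern32

# Three cells in 3D: an (n_cells, n_dims) coordinate tensor.
cells_coords = torch.tensor([[0.0, 0.0, 0.0],
                             [50.0, 0.0, 0.0],
                             [0.0, 100.0, 0.0]])

# (Stripped) covariance between cells 0 and 1, length-scale lambda0 = 200.
cov_01 = matern32.compute_cov(200.0, cells_coords, 0, 1)

# Remember to multiply back by sigma0**2 (illustrative value) before use.
sigma0 = 100.0
full_cov_01 = sigma0**2 * cov_01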
volcapy.covariance.matern32.compute_cov_pushforward(lambda0, F, cells_coords, device, n_chunks=200, n_flush=50)
Compute the covariance pushforward.
The covariance pushforward is just \(K F^t\), where \(K\) is the model covariance matrix and \(F\) the forward operator.
Note that \(\sigma_0^2\) is not included; one has to manually multiply by it when using the covariance pushforward computed here.
- Parameters
- lambda0: float
Length-scale parameter
- F: tensor
Forward operator matrix
- cells_coords: tensor
n_cells * n_dims: cells coordinates
device: torch.device
Device to perform the computation on, CPU or GPU.
- n_chunks: int
Number of chunks to split the matrix into. Default is 200. Increase this value if you get OOM errors.
- n_flush: int
Synchronize threads and flush GPU cache every n_flush iterations. This is necessary to avoid OOM errors. Default is 50.
- Returns
- Tensor
n_cells * n_data covariance pushforward \(K F^t\).
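A hedged usage sketch (the synthetic forward operator, model size, and hyperparameter values are illustrative assumptions):

import torch
import volcapy.covariance.matern32 as matern32

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

n_cells, n_data = 1000, 50
cells_coords = torch.rand(n_cells, 3) * 1000.0   # synthetic cell coordinates
F = torch.rand(n_data, n_cells)                  # synthetic forward operator

# Stripped pushforward K F^t, computed chunk by chunk on the device.
cov_pushfwd = matern32.compute_cov_pushforward(
        200.0, F, cells_coords, device, n_chunks=20, n_flush=5)

# Multiply back by sigma0**2 before plugging it into inversion formulas.
sigma0 = 100.0
pushfwd = sigma0**2 * cov_pushfwd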
volcapy.covariance.matern32.compute_full_cov(lambda0, cells_coords, device, n_chunks=200, n_flush=50)
Compute the full covariance matrix.
Note that \(\sigma_0^2\) is not included; one has to manually multiply by it when using the covariance matrix computed here.
- Parameters
- lambda0: float
Length-scale parameter
- cells_coords: tensor
n_cells * n_dims: cells coordinates
device: torch.device
Device to perform the computation on, CPU or GPU.
- n_chunks: int
Number of chunks to split the matrix into. Default is 200. Increase this value if you get OOM errors.
- n_flush: int
Synchronize threads and flush GPU cache every n_flush iterations. This is necessary to avoid OOM errors. Default is 50.
- Returns
- Tensor
n_cells * n_cells covariance matrix.
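Again a hedged sketch, for a small synthetic model (sizes and values are illustrative):

import torch
import volcapy.covariance.matern32 as matern32

device = torch.device("cpu")               # small model, CPU is enough here
cells_coords = torch.rand(500, 3) * 1000.0

# Full (stripped) 500 x 500 covariance matrix; only advisable for small models.
cov = matern32.compute_full_cov(200.0, cells_coords, device, n_chunks=10)
full_cov = 100.0**2 * cov                  # multiply back by sigma0**2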