GPBruteForce

class axtreme.qoi.gp_bruteforce.GPBruteForce(env_iterable: Iterable[Tensor], input_transform: InputTransform | None = None, outcome_transform: OutcomeTransform | None = None, posterior_sampler: PosteriorSampler | None = None, *, erd_samples_per_period: int = 1, shared_surrogate_base_samples: bool = False, device: device | None = None, no_grad: bool = True, seed: int | None = None)

Bases: MeanVarPosteriorSampledEstimates, QoIEstimator

Estimate the QoI for an extreme response distribution, using a surrogate model.

Uses full periods of environment samples and passes each sample through the surrogate.

Overview of the algorithm:

  • Take n periods of environment data points.

  • Use the surrogate model to estimate the likely response distribution at each point (the posterior).

  • Take n_posterior_samples of the posterior, each representing a guess at what the true simulator is.

  • For each posterior sample:

    • Simulate the response seen at each data point.

    • Find the largest response in each period. Each of these is a sample of that posterior’s ERD.

    • Calculate the QoI based on these ERD samples.

  • Return the QoIs calculated by each posterior sample.
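
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: toy_posterior_samples is a hypothetical stand-in for functions drawn from the GP, environment points are scalars, the response at each point is assumed to follow a Gumbel distribution (the model is documented further down as outputting a Gumbel location and scale), and the median of the ERD samples is used as a stand-in QoI.

```python
import math
import random
import statistics


def sample_gumbel(loc, scale, rng):
    """Draw one Gumbel(loc, scale) sample by inverting the CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return loc - scale * math.log(-math.log(u))


def toy_posterior_samples(n_posterior_samples, rng):
    """Hypothetical stand-in for GP posterior samples.

    Each 'posterior sample' is a function mapping an environment point x
    to the (loc, scale) of the response distribution at x.
    """
    funcs = []
    for _ in range(n_posterior_samples):
        offset = rng.gauss(0.0, 0.1)  # each guess at the simulator differs slightly
        funcs.append(lambda x, o=offset: (x + o, 0.5))
    return funcs


def brute_force_qoi(env_periods, posterior_funcs, rng):
    """Return one QoI estimate per posterior sample."""
    qois = []
    for f in posterior_funcs:
        erd_samples = []
        for period in env_periods:
            # Simulate the response at each environment point in the period.
            responses = [sample_gumbel(*f(x), rng) for x in period]
            # The largest response in a period is one ERD sample.
            erd_samples.append(max(responses))
        # Assumed QoI: the median of the ERD samples.
        qois.append(statistics.median(erd_samples))
    return qois


rng = random.Random(0)
n_periods, period_len = 20, 50
env_periods = [[rng.uniform(0.0, 1.0) for _ in range(period_len)] for _ in range(n_periods)]
posterior_funcs = toy_posterior_samples(n_posterior_samples=5, rng=rng)
qois = brute_force_qoi(env_periods, posterior_funcs, rng)
print(len(qois))  # one QoI per posterior sample
```

The spread of the returned values reflects all three sources of uncertainty at once: the environment samples, the differences between posterior samples, and the Gumbel sampling noise.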

Uncertainty in the results comes from three sources:

  • The environment samples used.

  • Uncertainty in the GP and the posterior samples used.

  • Randomness from sampling the surrogate’s output distribution.

Optimisation Notes:

GPBruteForce is not smooth with respect to smooth changes in the model (e.g. those provided by QoILookAhead).

Todo

Provide reference to the gpbruteforce.md file in docs so it renders (sw 2024-11-29).

__init__(env_iterable: Iterable[Tensor], input_transform: InputTransform | None = None, outcome_transform: OutcomeTransform | None = None, posterior_sampler: PosteriorSampler | None = None, *, erd_samples_per_period: int = 1, shared_surrogate_base_samples: bool = False, device: device | None = None, no_grad: bool = True, seed: int | None = None) None

Initialise the QoI estimator.

Parameters:
  • env_iterable

    An iterable that produces the env data to be used. Typically this is a DataLoader.

    • The iterable contains batches of shape (n_periods, batch_size, d).

    • Combining all of the batches should produce the shape (n_periods, period_len, d).

    • This is an iterable because predictions often need to be made in batches for memory reasons.

    • If your data is small, you can process it all at once by passing [data], where data is a tensor.

  • input_transform – Transforms that should be applied to the env_samples before being passed to the model.

  • outcome_transform – Transforms that should be applied to the output of the model before they are used.

  • posterior_sampler

    The sampler to use to draw samples from the posterior of the GP.

    • n_posterior_samples is set in the PosteriorSampler.

    Note

    If env_iterable contains batches, a batch-compatible sampler, such as NormalIndependentSampler, should be chosen.

  • erd_samples_per_period – Number of ERD samples created from a single period of data. Increasing this can reduce the noise introduced by sampling responses from the surrogate’s response distribution (at a point ‘x’).

  • shared_surrogate_base_samples

    If True, all n_posterior_samples will use the same base samples when sampling the surrogate’s response output. As a result, the posterior samples are responsible for any difference in the ERD (i.e., surrogate sampling noise no longer contributes).

    • Set to False: Better shows overall uncertainty in QoI.

    • Set to True: Shows only uncertainty caused by GP uncertainty.

  • device – The device that the model should be run on.

  • no_grad – Whether to disable gradient tracking for this QoI calculation.

  • seed – The seed to use for the random number generator. If None, no seed is set.
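
As an illustration of the env_iterable shapes described above, the following sketch splits one dataset of shape (n_periods, period_len, d) into batches of shape (n_periods, batch_size, d). Plain nested lists stand in for tensors so the sketch runs without dependencies; shape and split_into_batches are hypothetical helpers, not part of axtreme. With real tensors, something like torch.split(data, batch_size, dim=1) or a DataLoader would play the same role.

```python
def shape(nested):
    """Return the shape of a regular nested list, e.g. (n_periods, period_len, d)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


def split_into_batches(data, batch_size):
    """Yield batches of shape (n_periods, <=batch_size, d) from (n_periods, period_len, d)."""
    period_len = len(data[0])
    for start in range(0, period_len, batch_size):
        # Slice the same window out of every period so periods stay aligned.
        yield [period[start:start + batch_size] for period in data]


n_periods, period_len, d = 3, 10, 2
data = [[[float(p), float(t)] for t in range(period_len)] for p in range(n_periods)]

batches = list(split_into_batches(data, batch_size=4))
print([shape(b) for b in batches])  # [(3, 4, 2), (3, 4, 2), (3, 2, 2)]

# Concatenating the batches along the second axis recovers the full dataset.
rebuilt = [sum((b[p] for b in batches), []) for p in range(n_periods)]
print(rebuilt == data)  # True

# Small data can be passed as a single batch: [data].
single = [data]
print(shape(single[0]))  # (3, 10, 2)
```

Note that the split is along the period_len axis, so every batch still contains all n_periods periods.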

Methods

__init__(env_iterable[, input_transform, ...])

Initialise the QoI estimator.

mean(x)

Function that computes the mean of the estimates produced by using self.posterior_sampler.

posterior_samples_erd_samples(model)

Returns the ERD samples created by each posterior sample.

sample_surrogate(params[, n_samples, ...])

Create the surrogate model for a given set of input parameters, and sample the response of the surrogate.

var(x)

Function that computes the variance of the estimates produced by using self.posterior_sampler.

Attributes

posterior_sampler

posterior_samples_erd_samples(model: Model) Tensor

Returns the ERD samples created by each posterior sample.

__call__ uses these ERD samples to create a QoI estimate per posterior.

Parameters:

model – The GP model to use for the QoI estimation. It should have output dimension 2, representing the location and scale of a Gumbel distribution.

Returns:

The ERD samples for each function (posterior sample) drawn from the GP. Shape: (n_posterior_samples, n_periods * erd_samples_per_period)

Return type:

Tensor

static sample_surrogate(params: Tensor, n_samples: int = 1, base_sample_broadcast_dims: list[int] | None = None) Tensor

Create the surrogate model for a given set of input parameters, and sample the response of the surrogate.

Typically a GP is used to parameterise the surrogate model at a specific x. The now-parameterised model can be run multiple times to get different realisations of the stochastic response.

Parameters:
  • params – (*b, p) tensor of parameters. The last dimension is expected to contain the parameters required to instantiate a single surrogate model. All other dimensions are optional batch dimensions.

  • n_samples – The number of samples to draw from the surrogate model at a single x point.

  • base_sample_broadcast_dims

    List of indexes in (*b). Base samples will be shared (broadcast) across these dimensions of *b. For example:

    • params.shape is (n_posterior_samples, n_periods, batch_size, n_params).

      • *b = (n_posterior_samples, n_periods, batch_size)

      • p = (n_params)

    • You would like to use the same base samples for every posterior sample, so that any difference in output can be attributed to the difference in the parameters, rather than to the randomness in the samples generated by the surrogate model.

    • By setting base_sample_broadcast_dims=[0] the base samples used would be of shape (1, n_periods, batch_size), which would achieve the desired effect.
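
A minimal sketch of the broadcasting idea in the example above, in plain Python. It is an illustration under assumptions rather than the library's code: base_sample_shape and gumbel_from_base are hypothetical helpers, the surrogate is assumed to be a Gumbel distribution with n_params = 2 (location, scale), and the base samples are assumed to be uniform draws pushed through the inverse CDF.

```python
import math
import random


def base_sample_shape(batch_shape, broadcast_dims):
    """Shape of the base samples: size 1 on every dimension they are shared across."""
    return tuple(1 if i in broadcast_dims else n for i, n in enumerate(batch_shape))


def gumbel_from_base(loc, scale, u):
    """Map a uniform base sample u in (0, 1) to a Gumbel(loc, scale) response."""
    return loc - scale * math.log(-math.log(u))


batch_shape = (4, 3, 5)  # (n_posterior_samples, n_periods, batch_size)
print(base_sample_shape(batch_shape, [0]))  # (1, 3, 5)

# Two posterior samples proposing different locations at the same x point.
params = [(1.0, 0.5), (1.2, 0.5)]  # (loc, scale) per posterior sample

rng = random.Random(0)
shared_u = rng.uniform(0.01, 0.99)  # one base sample, reused by both posterior samples
shared = [gumbel_from_base(loc, scale, shared_u) for loc, scale in params]

# With shared base samples the responses differ only because the params differ.
print(round(shared[1] - shared[0], 6))  # 0.2
```

Because both posterior samples reuse the same base sample, the gap between their responses is exactly the gap between their locations; with independent base samples the sampling noise would be mixed into that gap.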

Returns:

Tensor of shape (n_samples, *b) representing the response of the surrogate model.

Todo

The base_sample_broadcast_dims behaviour is challenging to describe now that it is in this function rather than in context. Alternatively, base samples could be applied directly, as in posterior.rsample_from_base_samples. We have avoided this so that the complexity is contained here for now.