axtreme.acquisition.qoi_look_ahead
QoILookAhead acquisition function that looks ahead at possible models and optimizes according to a quantity of interest.
Functions
- average_observational_noise – Return the average observational noise.
- closest_observational_noise – Find the closest point in a training dataset, and collect its observational noise.
- conditional_update – A wrapper around BatchedMultiOutputGPyTorchModel.condition_on_observations with a number of safety checks.
- construct_inputs_qoi_look_ahead – This is how default arguments for acquisition functions are handled in Ax/BoTorch.
- reject_if_batched_model – Helper function to reject batched models in code where they are not yet supported.
Classes
- QoILookAhead – A generic acquisition function that estimates the usefulness of a design point.
- axtreme.acquisition.qoi_look_ahead.average_observational_noise(new_points: Tensor, train_x: Tensor | None, train_yvar: Tensor) → Tensor
Return the average observational noise.
- Parameters:
new_points – (n, d) The points for which to produce observational noise.
train_x – (n', d). This is not used, but is kept for a consistent function signature.
train_yvar – (n', m) The observational variance associated with each point.
- Returns:
(n,m) Tensor with the variance for each of the new_points.
Details: This function is useful for non-batched SingleTaskGPs because they will always have arguments of these dimensions.
Warning
Certain patterns of heteroskedasticity cause this method to perform poorly, leading the acquisition function to recommend suboptimal points (compared to closest_observational_noise). The trade-off is that derivative-based optimisation techniques can be used. See Issue #213 for details.
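For intuition, a minimal sketch of the averaging behaviour (the helper name and body below are illustrative assumptions, not the module's source):

import torch

def average_observational_noise_sketch(
    new_points: torch.Tensor,      # (n, d)
    train_x: torch.Tensor | None,  # (n', d) - unused, kept for signature parity
    train_yvar: torch.Tensor,      # (n', m)
) -> torch.Tensor:
    # Average the variances over the n' training points, then broadcast the
    # single (m,) estimate to every new point.
    mean_yvar = train_yvar.mean(dim=0)                # (m,)
    return mean_yvar.expand(new_points.shape[0], -1)  # (n, m)

# Example: three training points, one output target.
train_yvar = torch.tensor([[0.1], [0.2], [0.3]])
new_points = torch.rand(5, 2)
print(average_observational_noise_sketch(new_points, None, train_yvar))  # every row is 0.2

Because the returned noise is constant in new_points, the acquisition surface stays differentiable, which is what enables the gradient-based optimisation mentioned above.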
- axtreme.acquisition.qoi_look_ahead.closest_observational_noise(new_points: Tensor, train_x: Tensor, train_yvar: Tensor) → Tensor
Find the closest point in a training dataset, and collect its observational noise.
- Parameters:
new_points – (n, d) The points for which to produce observational noise. Features should be normalised to the [0, 1] square.
train_x – (n', d) The points to compare similarity against. Features should be normalised to the [0, 1] square.
train_yvar – (n’,m) The observational variance associated with each point.
- Returns:
(n,m) Tensor with the variance for each of the new_points.
Details: This function is useful for non-batched SingleTaskGPs because they will always have arguments of these dimensions.
Warning
This function is not smooth, meaning optimizers that use gradients (first- or second-order derivatives), such as L-BFGS-B, will not work. The trade-off is that it is more robust to the effect of patterns in yvar than average_observational_noise. See Issue #213 for details.
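For intuition, a minimal sketch of the nearest-neighbour lookup (names and body are illustrative assumptions, not the module's source); the argmin step is what makes the output piecewise constant and hence non-smooth:

import torch

def closest_observational_noise_sketch(
    new_points: torch.Tensor,  # (n, d), normalised to the [0, 1] square
    train_x: torch.Tensor,     # (n', d), normalised to the [0, 1] square
    train_yvar: torch.Tensor,  # (n', m)
) -> torch.Tensor:
    dists = torch.cdist(new_points, train_x)  # (n, n') pairwise Euclidean distances
    nearest = dists.argmin(dim=1)             # index of the closest training point, (n,)
    return train_yvar[nearest]                # (n, m)

new_points = torch.tensor([[0.05, 0.05], [0.9, 0.9]])
train_x = torch.tensor([[0.0, 0.0], [1.0, 1.0]])
train_yvar = torch.tensor([[0.1], [0.4]])
print(closest_observational_noise_sketch(new_points, train_x, train_yvar))
# tensor([[0.1000],
#         [0.4000]]) - each new point inherits its nearest neighbour's variance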
- axtreme.acquisition.qoi_look_ahead.conditional_update(model: Model, X: Tensor, Y: Tensor, observation_noise: Tensor | None) → Model
A wrapper around BatchedMultiOutputGPyTorchModel.condition_on_observations with a number of safety checks.
This function adds an additional datapoint to the model, preserving the dimensions of the original model. It does not change any of the model's hyperparameters. This is like training a new SingleTaskGP with all the datapoints (hyperparameters are not fit when a SingleTaskGP is constructed).
- Parameters:
model (Model) – The model to update.
X –
As per condition_on_observations. Shape (*b, n’, d).
Note
condition_on_observations expects this to be in the “model” space. It will not be transformed by the input_transform on the model.
Y –
As per condition_on_observations. Shape (*b, n’, m).
Note
condition_on_observations expects this to be in the “output/problem” space (not model space). It will be transformed by the outcome_transform on the model.
observation_noise (torch.Tensor | None) –
Used as the noise argument in condition_on_observations. Shape should match Y: (*b, n', m).
Note
condition_on_observations expects this to be in the “model” space. It will not be transformed by the outcome_transform on the model.
- Returns:
GP with the same underlying structure, including the new points, and the same original number of dimensions.
- Developer Note:
There are different ways to create a fantasy model. The following were considered:
- BatchedMultiOutputGPyTorchModel.condition_on_observations: a well-documented interface producing a GP of the same format.
- model.get_fantasy_model: This is a GPyTorch implementation. The interface uses different notation, and input shapes need to be manually adjusted depending on the model.
- model.fantasize: This method would be very convenient for our wider purpose, but its posterior is of shape (num_fantasies, batch_shape, n, m). It is unclear if our QoI methods can handle/respect the num_fantasies dim.
Revisit this at a later date.
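For orientation, a small usage sketch of the underlying BoTorch call that conditional_update wraps (a plain SingleTaskGP without custom transforms is assumed; random data, for shape illustration only):

import torch
from botorch.models import SingleTaskGP

train_x = torch.rand(10, 2, dtype=torch.float64)
train_y = train_x.sum(dim=-1, keepdim=True)
model = SingleTaskGP(train_x, train_y)

new_x = torch.rand(1, 2, dtype=torch.float64)
new_y = new_x.sum(dim=-1, keepdim=True)

# GPyTorch requires a posterior call first so its fantasy caches exist.
model.posterior(new_x)

# Condition on the new observation: hyperparameters are left untouched;
# only the training data of the returned model grows.
fantasy = model.condition_on_observations(X=new_x, Y=new_y)
print(model.train_inputs[0].shape)    # torch.Size([10, 2])
print(fantasy.train_inputs[0].shape)  # torch.Size([11, 2])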
- axtreme.acquisition.qoi_look_ahead.construct_inputs_qoi_look_ahead(model: SingleTaskGP, qoi_estimator: QoIEstimator, sampler: PosteriorSampler, **_: dict[str, Any]) → dict[str, Any]
This is how default arguments for acquisition functions are handled in Ax/BoTorch.
- Context:
When Ax.BOTORCH gets instantiated, construction arguments for the acquisition function can be provided. These are passed through Ax as a set of kwargs.
- Parameters:
This function takes a subset of the acquisition function's __init__() args and can add defaults.
- Returns:
Args for the BoTorch acquisition function's __init__().
Note
This functionality allows Ax to pass generic arguments without needing to know which acquisition function they will be passed to. Interestingly, this functionality is provided by the BoTorch package, even though it seems like it should be the responsibility of Ax. This issue is discussed in detail here: GitHub discussion.
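A sketch of the pattern (hypothetical names; BoTorch registers such constructors against the acquisition class via its acqf_input_constructor decorator, then unpacks the returned dict into __init__()):

from typing import Any

def construct_inputs_sketch(model, qoi_estimator, sampler, **_: Any) -> dict[str, Any]:
    # Ax forwards a grab-bag of kwargs; keep only the subset the acquisition
    # function's __init__() accepts, and swallow the rest via **_.
    return {"model": model, "qoi_estimator": qoi_estimator, "sampler": sampler}

# BoTorch then effectively calls: QoILookAhead(**construct_inputs_sketch(model=..., ...))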
- axtreme.acquisition.qoi_look_ahead.reject_if_batched_model(model: SingleTaskGP) → None
Helper function to reject batched models in code where they are not yet supported.
- Parameters:
model – The model to check.
- Returns:
None. Raises NotImplementedError if the model is batched.
- Details:
BoTorch models can have batched training data and/or batched prediction.
GP batch prediction (non-batched model):
train_x = (n, d)  # This is a single GP
train_y = (n, m)
predicting_x = (b, n', d)
The result will be (b, n', m): there are b separate joint distributions (each over n' points with m targets).
Batched GP model:
train_x = (b_gp, n, d)
train_y = (b_gp, n, m)
b_gp separate GPs, where each GP gets its own hyperparameters etc., trained on its own (n, d) points.
prediction_x = (n', d)
The result will be (b_gp, n', m): each of the separate b_gp GPs makes its own estimate of the joint distribution.
More details: BoTorch batching
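These shape rules can be reproduced with a stock SingleTaskGP (a sketch with random data; the printed shapes are the point):

import torch
from botorch.models import SingleTaskGP

# Non-batched model: one GP, queried with a batch of b test sets.
train_x = torch.rand(8, 2, dtype=torch.float64)                   # (n, d)
train_y = torch.rand(8, 1, dtype=torch.float64)                   # (n, m)
model = SingleTaskGP(train_x, train_y)
post = model.posterior(torch.rand(4, 5, 2, dtype=torch.float64))  # (b, n', d)
print(post.mean.shape)  # torch.Size([4, 5, 1]) -> (b, n', m)

# Batched model: b_gp independent GPs, each with its own hyperparameters.
batch_x = torch.rand(3, 8, 2, dtype=torch.float64)                   # (b_gp, n, d)
batch_y = torch.rand(3, 8, 1, dtype=torch.float64)                   # (b_gp, n, m)
batch_model = SingleTaskGP(batch_x, batch_y)
post = batch_model.posterior(torch.rand(5, 2, dtype=torch.float64))  # (n', d)
print(post.mean.shape)  # torch.Size([3, 5, 1]) -> (b_gp, n', m)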