axtreme.data.importance_dataset

Datasets that return importance sampling information in the form (data, importance_weight).

Dev details:

Datasets can return a variety of things when __getitem__ is called, for example:

  • Tuples
  • Dicts

The DataLoader respects containers such as dict and tuple, and converts float/int/list/numpy content into tensors, as this content is assumed to be data.

  • By default this is done by the collate_fn argument of DataLoader.
  • For the defaults, see torch.utils.data._utils.collate.default_collate.
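The collation behaviour described above can be illustrated with a minimal sketch. ToyImportanceDataset and its values are hypothetical and are not part of axtreme; the example only assumes the standard torch.utils.data.Dataset and DataLoader with the default collate_fn.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyImportanceDataset(Dataset):
    """Hypothetical dataset returning (data, importance_weight) tuples."""

    def __init__(self) -> None:
        # Four 2-d samples and one weight per sample (illustrative values only).
        self.data = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
        self.weights = [0.5, 1.0, 1.5, 2.0]

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, idx: int) -> tuple[torch.Tensor, float]:
        # The weight is returned as a plain Python float on purpose:
        # the default collate function turns it into a tensor.
        return torch.tensor(self.data[idx]), self.weights[idx]


loader = DataLoader(ToyImportanceDataset(), batch_size=2, shuffle=False)

# The (data, weight) structure of each item is preserved at batch level.
data_batch, weight_batch = next(iter(loader))
```

Here data_batch stacks the per-item tensors along a new leading batch dimension, and weight_batch is a tensor built from the Python floats, so downstream code can treat both uniformly.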

While this is straightforward to implement, typing can be a challenge (torch.utils.data.Dataset provides some guidance, but it appears its authors found it challenging too).

Todo

  • Revisit the typing and the implications of covariant/contravariant type parameters, etc.
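As a rough illustration of why variance matters here, the sketch below uses a covariant type variable in a structural protocol. SupportsGetItem and first_weight are hypothetical names made up for this example; they are not part of axtreme or torch, and this is only one possible way to approach the typing.

```python
from typing import Protocol, TypeVar

# Covariant: a container of "narrower" items is usable where a container
# of "wider" items is expected, which suits read-only access via __getitem__.
T_co = TypeVar("T_co", covariant=True)


class SupportsGetItem(Protocol[T_co]):
    """Hypothetical read-only dataset protocol."""

    def __getitem__(self, idx: int) -> T_co: ...

    def __len__(self) -> int: ...


def first_weight(ds: SupportsGetItem[tuple[object, float]]) -> float:
    """Return the importance weight of the first (data, weight) item."""
    _, weight = ds[0]
    return weight


# tuple[str, float] is a subtype of tuple[object, float]; covariance is
# what lets a type checker accept this more specific container.
pairs: list[tuple[str, float]] = [("a", 0.5), ("b", 2.0)]
result = first_weight(pairs)
```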

Classes

ImportanceAddedWrapper(data_dataset, ...)

Thin wrapper that makes the method for creating the dataset more explicit, and ensures the order and type of the output.

ImportanceIndexWrapper(dataset, importance_idx)

Wraps an existing dataset, returning a column/index of the item as the importance weight.
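The idea behind an index-based wrapper can be sketched in plain Python. IndexWeightSketch is a hypothetical stand-in and not the axtreme implementation; in particular, it assumes that the entries other than the weight column are returned as the data, which the summary above does not confirm.

```python
from collections.abc import Sequence


class IndexWeightSketch:
    """Hypothetical stand-in: splits each row into (data, importance_weight)."""

    def __init__(self, dataset: Sequence[Sequence[float]], importance_idx: int) -> None:
        self.dataset = dataset
        self.importance_idx = importance_idx

    def __len__(self) -> int:
        return len(self.dataset)

    def __getitem__(self, idx: int) -> tuple[list[float], float]:
        # Copy the row so the wrapped dataset is never mutated.
        row = list(self.dataset[idx])
        # Assumption: the weight column is removed from the returned data.
        weight = row.pop(self.importance_idx)
        return row, weight


# Each row holds two data values followed by a weight (illustrative values).
rows = [[0.1, 0.2, 3.0], [0.4, 0.5, 0.25]]
wrapped = IndexWeightSketch(rows, importance_idx=2)
data, weight = wrapped[0]
```

Returning a fixed (data, importance_weight) order means consumers can unpack every item the same way, whatever the layout of the underlying dataset.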