axtreme.utils.population_estimators
Helpers for understanding the population values expected by estimators.
NOTE: These tools provide indicative/approximate results.
Functions
estimate_pdf_value_from_sample | Construct a distribution from a sample, and get the pdf value at point x.
plot_dist | Plot the distribution PDF over domain mean +- confidence_level * std.
sample_mean_se | Distribution of the population mean as estimated by this sample.
sample_median_se | Distribution of the population median as estimated by this sample.
- axtreme.utils.population_estimators.estimate_pdf_value_from_sample(sample: Tensor, x: float) → float
Construct a distribution from a sample, and get the pdf value at point x.
WARNING: This is an approximate method, and results improve with more samples. See testing results below.
- Parameters:
sample – 1d tensor of samples to construct a pdf from.
x – the point at which to evaluate the pdf.
- Returns:
Estimated pdf value.
Testing results: the ratio of the mean estimate to the true value (mean_est/true) and the coefficient (Coef) behave as follows. The full test can be run at tests/utils/test_population_estimators.py: visualise_performance_of_estimate_pdf_value_from_sample
Number of samples | mean_est/true | Coef
11 | .86 - 1.02 | .25 - .30
22 | .88 - 1.02 | .18 - .20
44 | .90 - 1.01 | .14 - .16
88 | .92 - 1.00 | .11 - .13
176 | .95 - 1.00 | .05 - .10
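As an illustration of estimating a pdf value from a sample, the following is a minimal sketch that uses a Gaussian kernel density estimate with Scott's rule bandwidth. The technique, the helper name kde_pdf_value, and the use of plain Python lists instead of a Tensor are all assumptions made for this example; the source does not state which estimator the library uses.

```python
import math
import random
import statistics


def kde_pdf_value(sample: list[float], x: float) -> float:
    """Gaussian kernel density estimate of the pdf at x (hypothetical helper)."""
    n = len(sample)
    # Scott's rule for 1d data: bandwidth = sample std * n^(-1/5).
    bandwidth = statistics.stdev(sample) * n ** (-1 / 5)
    total = 0.0
    for s in sample:
        z = (x - s) / bandwidth
        total += math.exp(-0.5 * z * z)
    # Average of Gaussian kernels centred on each sample point.
    return total / (n * bandwidth * math.sqrt(2 * math.pi))


# Compare the estimate with the true pdf of a standard normal at x = 0.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(176)]
estimate = kde_pdf_value(sample, 0.0)
true_value = statistics.NormalDist(0.0, 1.0).pdf(0.0)
ratio = estimate / true_value  # analogous to the mean_est/true column
```

A kernel estimate smooths the peak of the density, so at the mode the ratio tends to sit slightly below 1, and it varies less as the sample grows.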
- axtreme.utils.population_estimators.plot_dist(dist: Distribution, confidence_level: float = 3.0, ax: Axes | None = None, **kwargs: Any) → Axes
Plot the distribution PDF over domain mean +- confidence_level * std.
- Parameters:
dist – the distribution to plot the pdf of.
confidence_level – the number of standard deviations either side of the mean to plot. Controls the width of the plot.
ax – the axes to plot on. If None, new axes are created.
**kwargs – passed to the plotting method.
- Returns:
Axes with the plot.
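The plotting domain described above can be sketched without any plotting library. This example only computes the x-range mean +- confidence_level * std and the pdf values over it; the helper name pdf_grid, the n_points parameter, and the use of statistics.NormalDist in place of a torch Distribution are assumptions for illustration, not the library's implementation.

```python
from statistics import NormalDist


def pdf_grid(dist: NormalDist, confidence_level: float = 3.0, n_points: int = 101):
    """Return (xs, pdf values) over mean +- confidence_level * std (hypothetical helper)."""
    lower = dist.mean - confidence_level * dist.stdev
    upper = dist.mean + confidence_level * dist.stdev
    step = (upper - lower) / (n_points - 1)
    xs = [lower + i * step for i in range(n_points)]
    return xs, [dist.pdf(x) for x in xs]


# With mean 2.0 and std 0.5, the default domain runs from 0.5 to 3.5.
xs, ys = pdf_grid(NormalDist(mu=2.0, sigma=0.5))
```

The resulting xs and ys could then be passed to any line-plotting call.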
- axtreme.utils.population_estimators.sample_mean_se(samples: Tensor) → StudentT
Distribution of the population mean as estimated by this sample.
Note
The distribution of the sample itself doesn’t matter. The output distribution is not affected by this.
Use the following link
se = sigma / n**.5
sigma: should be the population standard deviation, but we approximate it with the sample standard deviation.
n: the number of samples.
Because of this approximation, we use the Student-t distribution.
- Parameters:
samples – 1d tensor of samples to estimate the population mean from
- Returns:
Distribution of the population mean based on the provided sample. The nominal 95% confidence bounds derived from this distribution typically achieve actual coverage between 93% and 97%, depending on sample variability.
Todo
- .cdf() raises NotImplementedError for the torch implementation of StudentT. This is annoying because it is the best way to check the confidence bounds. Using z = (y - mean)/stddev instead assumes a normal distribution rather than a Student-t distribution. This approximation is considered okay for n > 30.
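The standard error formula and the normal-approximation check mentioned above can be sketched in plain Python. The helper name mean_standard_error, the exponential test data, and the simulation settings are assumptions for illustration. The sketch returns the three Student-t parameters rather than a torch StudentT object.

```python
import math
import random
import statistics


def mean_standard_error(samples: list[float]) -> tuple[float, float, int]:
    """Return (location, scale, degrees of freedom) of a Student-t for the population mean."""
    n = len(samples)
    loc = statistics.fmean(samples)
    # The sample standard deviation (n - 1 denominator) stands in for the unknown sigma.
    se = statistics.stdev(samples) / math.sqrt(n)
    return loc, se, n - 1


# Empirical coverage of the nominal 95% bounds, using non-normal (exponential) data.
rng = random.Random(0)
true_mean, n_trials, hits = 5.0, 2000, 0
for _ in range(n_trials):
    draws = [rng.expovariate(1 / true_mean) for _ in range(50)]
    loc, se, _df = mean_standard_error(draws)
    # Normal approximation of the bounds (z = 1.96), considered okay for n > 30.
    if abs(true_mean - loc) <= 1.96 * se:
        hits += 1
coverage = hits / n_trials
```

Here coverage is the fraction of trials in which the bounds contain the true mean, which is one way to check the bounds while .cdf() is unavailable.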
- axtreme.utils.population_estimators.sample_median_se(samples: Tensor) → Normal
Distribution of the population median as estimated by this sample.
Details of this method can be found here.
Note
This function relies on the approximation estimate_pdf_value_from_sample. The approximation can be quite inaccurate (see the function for details), and as a result this function should be treated as an estimate.
Note
The results returned are much noisier than those of sample_mean_se.
- Parameters:
samples – 1d tensor of samples to estimate the population median from
- Returns:
Distribution of the population median as estimated by this sample. With 50 samples, the nominal 95% bounds from this function typically achieve actual coverage between 90% and 97%.
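One common way to build such a distribution is the asymptotic result that the sample median is approximately normal with standard error 1 / (2 * sqrt(n) * f(m)), where f(m) is the pdf at the median. The sketch below assumes that approach and pairs it with a Gaussian kernel density estimate for f(m). Whether the library uses exactly this method is not stated in the source, and the helper names are hypothetical.

```python
import math
import random
import statistics


def kde_pdf_value(sample: list[float], x: float) -> float:
    """Gaussian kernel density estimate of the pdf at x (Scott's rule bandwidth)."""
    n = len(sample)
    bandwidth = statistics.stdev(sample) * n ** (-1 / 5)
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
    return total / (n * bandwidth * math.sqrt(2 * math.pi))


def median_standard_error(samples: list[float]) -> statistics.NormalDist:
    """Normal approximation to the distribution of the population median."""
    n = len(samples)
    med = statistics.median(samples)
    # Asymptotic standard error of the sample median: 1 / (2 * sqrt(n) * f(median)).
    density_at_median = kde_pdf_value(samples, med)
    se = 1.0 / (2.0 * math.sqrt(n) * density_at_median)
    return statistics.NormalDist(med, se)


rng = random.Random(1)
samples = [rng.gauss(10.0, 2.0) for _ in range(50)]
median_dist = median_standard_error(samples)
# Nominal 95% bounds for the population median.
lower = median_dist.inv_cdf(0.025)
upper = median_dist.inv_cdf(0.975)
```

Because the standard error divides by an estimated density, any error in the pdf estimate carries directly into the width of the bounds, which is consistent with the noisier results noted above.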