Statistics
Samples
- class desilike.statistics.samples.Samples(latex={}, **kwargs)
Class for storing samples of parameters.
Initialize a sample of parameters.
- Parameters:
latex (dict or None, optional) – LaTeX expression for parameters. Default is None.
**kwargs – Samples of parameters. Each sample must have the same length.
- Raises:
ValueError – If not all samples have the same length.
- append(samples)
Append a sample, i.e., add additional rows.
- Parameters:
samples (desilike.statistics.Samples) – Samples to add. Must have the same keys as the current samples.
- Raises:
ValueError – If keys do not match.
- classmethod concatenate(samples)
Concatenate samples.
- Parameters:
samples (list of desilike.Samples) – Samples to concatenate.
- Returns:
combined – Concatenated samples.
- Return type:
desilike.Samples
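The key-matching rule for append and the row-wise stacking done by concatenate can be illustrated with a plain dict-of-lists stand-in. This is only a sketch: concatenate_columns is a hypothetical helper, not part of desilike, and the actual Samples class may store and validate its data differently.

```python
def concatenate_columns(samples):
    """Stack several dict-of-lists row-wise, requiring identical keys.

    Hypothetical stand-in for illustration; not the desilike implementation.
    """
    if not samples:
        raise ValueError("need at least one set of samples")
    keys = list(samples[0])
    for other in samples[1:]:
        if set(other) != set(keys):
            raise ValueError("keys do not match")
    # Keep the key order of the first entry and append the rows of the others.
    return {key: [v for s in samples for v in s[key]] for key in keys}


first = {"omega_m": [0.30, 0.31], "h": [0.67, 0.68]}
second = {"omega_m": [0.32], "h": [0.69]}
combined = concatenate_columns([first, second])
```

Here omega_m and h are made-up parameter names; each column of combined holds three rows.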
- covariance(params=None)
Compute the covariance of the sample.
- Parameters:
params (list or None, optional) – Keys to compute the covariance for. If None, all keys are used. Default is None.
- Returns:
cov – Covariance of the samples. The ordering is the same as params, or self.keys if params is None.
- Return type:
- get_flag(flag, param)
Get the value of the status flag for all samples.
- Parameters:
flag (str) – Status flag.
param (str or None, optional) – The parameter to which the flag applies.
- Returns:
value – Boolean array containing the status flag for each sample.
- Return type:
numpy.ndarray
- Raises:
ValueError – If the status is not known, the parameter does not exist for this sample, or the flag has not been set for this specific combination of status and parameter.
- getdist(params=None)
Convert the sample into a getdist.MCSamples instance.
- Parameters:
params (array-like or None, optional) – List of parameters to convert. If None, all parameters are included. Default is None.
- Returns:
Samples converted to getdist format.
- Return type:
getdist.MCSamples
- Raises:
ImportError – If getdist is not installed.
- interval(param, threshold, posterior=True)
Get the interval(s) where the likelihood/posterior is above a threshold.
- Parameters:
param (str) – Parameter for which to get the interval.
threshold (float) – Threshold such that the (log) likelihood/posterior is at least its maximum plus the threshold. Must be negative.
posterior (bool, optional) – If True, compute the intervals for the (log) posterior. If False, the intervals for the (log) likelihood are returned. Default is True.
- Returns:
bounds – List of pairs of lower and upper bounds. For a unimodal likelihood, this should typically be a single pair. If a lower and/or upper bound cannot be determined inside the sampled range, the value will be np.nan.
- Return type:
list
- Raises:
ValueError – If threshold is not negative.
RuntimeError – If the likelihood/posterior is identical to the maximum plus the threshold over some range instead of specific points.
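The threshold logic can be sketched on a tabulated one-dimensional log-profile. The helper below, threshold_intervals, is hypothetical and deliberately simplified: it locates crossings by linear interpolation between neighbouring points, whereas the documented method works from a profile interpolator, and it does not reproduce the RuntimeError case. It assumes x is sorted in increasing order.

```python
import math


def _crossing(x0, y0, x1, y1, level):
    """Linearly interpolate the x at which the segment reaches level."""
    return x0 + (level - y0) * (x1 - x0) / (y1 - y0)


def threshold_intervals(x, logp, threshold):
    """Intervals of x on which logp >= max(logp) + threshold.

    Hypothetical helper for illustration; not the desilike implementation.
    """
    if threshold >= 0:
        raise ValueError("threshold must be negative")
    level = max(logp) + threshold
    above = [value >= level for value in logp]
    last = len(x) - 1
    bounds = []
    lower = math.nan
    for i, inside in enumerate(above):
        if not inside:
            continue
        if i == 0:
            # The interval starts before the sampled range.
            lower = math.nan
        elif not above[i - 1]:
            lower = _crossing(x[i - 1], logp[i - 1], x[i], logp[i], level)
        if i == last:
            # The interval ends beyond the sampled range.
            bounds.append((lower, math.nan))
        elif not above[i + 1]:
            upper = _crossing(x[i], logp[i], x[i + 1], logp[i + 1], level)
            bounds.append((lower, upper))
    return bounds


# Gaussian log-profile on a coarse grid; threshold -0.5 marks the 1-sigma range.
grid = [i * 0.5 for i in range(-6, 7)]
profile = [-0.5 * v * v for v in grid]
one_sigma = threshold_intervals(grid, profile, -0.5)
```

A bound is reported as nan whenever the profile is still above the level at the first or last sampled point, mirroring the np.nan convention described above.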
- property keys
Return the keys of the sample as a list of strings.
- classmethod load(filepath)
Read samples from a file.
This function supports npz and hdf5 file endings.
- Parameters:
filepath (str or Path) – Where to read samples from.
- Raises:
ValueError – If file ending is not supported or file ending is hdf5 but h5py is not installed.
- mean(params=None, return_as_dict=False)
Compute the mean of the sample.
- Parameters:
params (list or None, optional) – Keys to compute the mean for. If None, all keys are used. Default is None.
return_as_dict (bool, optional) – If True, return a dictionary. Otherwise, return a numpy array. Default is False.
- Returns:
means – Means of the samples.
- Return type:
list or dict
- property params
Return the parameters of the sample as a list of strings.
- profile_interpolator(param, posterior=True)
Get a cubic profile interpolator.
- Parameters:
param (str or list) – Parameter(s) for which to compute the interpolator.
posterior (bool, optional) – If True, get a profile for the (log) posterior. If False, a profile for the (log) likelihood is returned. Default is True.
- Returns:
interp – Profile interpolator.
- Return type:
scipy.interpolate.CubicSpline or scipy.interpolate.RegularGridInterpolator
- Raises:
ValueError – If there are not enough points to compute an interpolation.
- save(filepath, keys=None)
Save samples to a file.
This function supports csv, npz, and hdf5 file endings. csv is typically used for sharing results outside of desilike.
- Parameters:
filepath (str or Path) – Where to save samples.
keys (list or None, optional) – Keys to write. If None, all keys are used. Default is None.
- Raises:
ValueError – If file ending is not supported, file ending is hdf5 but h5py is not installed, or parameters to be saved are multidimensional and the output is csv.
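The csv restriction follows from the flat, one-value-per-cell layout of that format. The sketch below shows the idea with the standard csv module; write_csv is a hypothetical helper operating on a dict of columns, and the file layout that desilike actually writes is not specified here.

```python
import csv
import io


def write_csv(columns, stream, keys=None):
    """Write a dict of equally long columns as CSV, one row per sample.

    Hypothetical helper for illustration; not the desilike implementation.
    """
    keys = list(columns) if keys is None else list(keys)
    for key in keys:
        # A cell can hold a single scalar, so nested values are rejected.
        if any(isinstance(value, (list, tuple)) for value in columns[key]):
            raise ValueError(f"parameter {key!r} is multidimensional")
    writer = csv.writer(stream)
    writer.writerow(keys)
    writer.writerows(zip(*(columns[key] for key in keys)))


buffer = io.StringIO()
write_csv({"a": [1, 2], "b": [0.5, 1.5]}, buffer, keys=["a"])
```

Passing keys restricts the output to the selected columns, in the spirit of the keys argument documented above.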
- set_flag(flag, param, value)
Set the value of the status flag for all samples.
- Parameters:
flag (str) – Status flag.
param (str or None, optional) – The parameter to which the flag applies.
value (numpy.ndarray) – Boolean array containing the status flag for each sample.
- Raises:
ValueError – If the status is not known or the parameter does not exist for this sample.
- tabulate(keys=None, use_latex=False, **kwargs)
Use the tabulate package to print the table.
- Parameters:
keys (array-like or None, optional) – List of keys to print. If None, all columns are printed. Default is None.
use_latex (bool, optional) – Whether to use the LaTeX names in the column headers. Default is False.
**kwargs – Additional keyword arguments passed to tabulate.tabulate().
- Returns:
table – Table as plain text.
- Return type:
str
- Raises:
ImportError – If tabulate is not installed.
- property weight
Return the (normalized) weight of each sample.
Diagnostics
Module implementing diagnostics for Markov chains.
- desilike.statistics.diagnostics.gelman_rubin(chains, n_splits=None, keys=None)
Estimate the Gelman-Rubin statistic.
- Parameters:
chains (desilike.statistics.samples.Samples, list of desilike.statistics.samples.Samples, or numpy.ndarray) –
Chains for which to compute the Gelman-Rubin statistic. If a numpy array, the expected shapes are as follows:
(n_steps,) if one-dimensional
(n_steps, n_dim) if two-dimensional
(n_chains, n_steps, n_dim) if three-dimensional
n_splits (int or None, optional) – Number of splits for each chain. If None, a single chain will be split into 2 parts. Splitting allows computation of Gelman-Rubin statistics even with one chain. Default is None.
keys (list of str, optional) – Keys for which to compute the Gelman-Rubin statistic. Only used if chains is a desilike.Samples or list thereof. If None, use all keys in the chain. Default is None.
- Returns:
gr – The estimated Gelman-Rubin statistics.
dict if chains is a desilike.Samples or list thereof
float if chains is a one-dimensional array
numpy.ndarray otherwise
- Return type:
dict, float, or numpy.ndarray
- Raises:
ValueError – If n_chains * n_splits is 1 or n_splits is larger than the number of samples.
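For orientation, the split-chain idea can be sketched for a single parameter in plain Python. This is the textbook square-root form of the statistic, not necessarily the exact estimator used by desilike. The helper names are hypothetical, the chains are assumed to have equal length, and the behaviour for several chains with n_splits=None (no splitting) is an assumption.

```python
import random
import statistics


def split_chains(chains, n_splits):
    """Split every chain into n_splits consecutive parts of equal length."""
    parts = []
    for chain in chains:
        size = len(chain) // n_splits
        if size < 2:
            raise ValueError("n_splits is too large for the number of samples")
        for i in range(n_splits):
            parts.append(chain[i * size:(i + 1) * size])
    return parts


def gelman_rubin_1d(chains, n_splits=None):
    """Potential scale reduction factor for one parameter (illustrative)."""
    if n_splits is None:
        # Assumption: split a lone chain in two, leave several chains whole.
        n_splits = 2 if len(chains) == 1 else 1
    if len(chains) * n_splits == 1:
        raise ValueError("need at least two (sub-)chains")
    parts = split_chains(chains, n_splits)
    n = len(parts[0])
    # W: mean of the within-chain variances.
    within = statistics.mean([statistics.variance(p) for p in parts])
    # B / n: variance of the chain means.
    between = statistics.variance([statistics.mean(p) for p in parts])
    pooled = (n - 1) / n * within + between
    return (pooled / within) ** 0.5


rng = random.Random(0)
chain_a = [rng.gauss(0.0, 1.0) for _ in range(2000)]
chain_b = [rng.gauss(0.0, 1.0) for _ in range(2000)]
r_hat = gelman_rubin_1d([chain_a, chain_b])  # close to 1 for converged chains
```

Chains that sample different regions produce a large between-chain term and therefore a value well above 1.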
- desilike.statistics.diagnostics.integrated_autocorrelation_time(chains, keys=None)
Estimate the integrated autocorrelation time for Markov chains.
Autocorrelation times are computed in the same way as in emcee. See https://emcee.readthedocs.io/en/stable/tutorials/autocorr/ for details. The results have been verified to agree with those from emcee, although the implementation is independent.
- Parameters:
chains (desilike.statistics.samples.Samples, list of desilike.statistics.samples.Samples, or numpy.ndarray) –
Chains for which to compute the autocorrelation time. If a numpy array, the expected shapes are as follows:
(n_steps,) if one-dimensional
(n_steps, n_dim) if two-dimensional
(n_chains, n_steps, n_dim) if three-dimensional
keys (list of str, optional) – Keys for which to compute the autocorrelation time. Only used if chains is a desilike.Samples or list thereof. If None, use all keys in the chain. Default is None.
- Returns:
tau – The estimated autocorrelation times.
dict if chains is a desilike.Samples or list thereof
float if chains is a one-dimensional array
numpy.ndarray otherwise
In all cases, the autocorrelation function (not time) for each parameter is averaged across chains, if multiple chains are provided.
- Return type:
dict, float, or numpy.ndarray
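The estimator described in the emcee tutorial can be sketched for one parameter as follows. This is a plain-Python illustration rather than the desilike code: it uses a direct O(n^2) sum instead of an FFT, the function names are hypothetical, and the window constant c=5 is an assumed default.

```python
import random


def autocorrelation(chain):
    """Normalized autocorrelation function of one chain (direct sum)."""
    n = len(chain)
    mean = sum(chain) / n
    centered = [value - mean for value in chain]
    acf = [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(n)
    ]
    return [value / acf[0] for value in acf]


def integrated_time_1d(chains, c=5.0):
    """Estimate tau for one parameter from equal-length chains."""
    functions = [autocorrelation(chain) for chain in chains]
    # Average the autocorrelation function (not the time) across chains.
    averaged = [sum(values) / len(values) for values in zip(*functions)]
    running = 0.0
    taus = []
    for rho in averaged:
        running += rho
        taus.append(2.0 * running - 1.0)
    # Automatic windowing: stop at the first lag M with M >= c * tau(M).
    for lag, tau in enumerate(taus):
        if lag >= c * tau:
            return tau
    return taus[-1]


def ar1_chain(n_steps, phi, rng):
    """AR(1) test chain; its exact tau is (1 + phi) / (1 - phi)."""
    chain, value = [], 0.0
    for _ in range(n_steps):
        value = phi * value + rng.gauss(0.0, 1.0)
        chain.append(value)
    return chain
```

For uncorrelated draws the estimate should be near 1, while an AR(1) chain with phi = 0.5 has an exact value of 3, which makes ar1_chain a convenient sanity check.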
Plotting
Module implementing plotting routines.
- desilike.statistics.plotting.gelman_rubin(chains, keys=None, colors=None, n_splits=None, threshold=None, slices=100, offset=None, fontsize=None, plot_options=None, legend_options=None, fig=None)
Plot Gelman-Rubin statistics as a function of steps.
- Parameters:
chains (desilike.Samples or list of desilike.Samples) – List of (or single) desilike.Samples instance(s).
keys (list or None, optional) – Parameters to plot the Gelman-Rubin statistic for. If None, plot all parameters. Default is None.
colors (str, list, or None, optional) – Dictionary of (or single) color(s) for parameters. Default is None.
n_splits (int or None, optional) – Number of splits for each chain. If None, a single chain will be split into 2 parts. Splitting allows computation of Gelman-Rubin statistics even with one chain. Default is None.
threshold (float, optional) – If not None, plot a horizontal line at this value. Default is None.
slices (int, optional) – Number of linearly spaced steps for which to compute the Gelman-Rubin statistic. Default is 100.
offset (float or None, optional) – Offset to apply to the Gelman-Rubin statistics, typically 0 or -1. Default is None.
fontsize (int or None, optional) – Label sizes. Default is None.
plot_options (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.plot. Default is None.
legend_options (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.legend. Default is None.
fig (matplotlib.figure.Figure or None, optional) – Figure to plot on. If None, create a new one. Default is None.
- Raises:
ValueError – If not all chains have the same length.
- Returns:
fig – Figure with plot on it.
- Return type:
matplotlib.figure.Figure
- desilike.statistics.plotting.integrated_autocorrelation_time(chains, keys=None, colors=None, slices=10, fontsize=None, plot_options=None, legend_options=None, fig=None)
Plot integrated autocorrelation time as a function of steps.
- Parameters:
chains (desilike.Samples or list of desilike.Samples) – List of (or single) desilike.Samples instance(s).
keys (list or None, optional) – Parameters to plot the integrated autocorrelation time for. If None, plot all parameters. Default is None.
colors (str, list, or None, optional) – Dictionary of (or single) color(s) for parameters. Default is None.
slices (int, optional) – Number of linearly spaced steps for which to compute the integrated autocorrelation time. Default is 10.
fontsize (int or None, optional) – Label sizes. Default is None.
plot_options (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.plot. Default is None.
legend_options (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.legend. Default is None.
fig (matplotlib.figure.Figure or None, optional) – Figure to plot on. If None, create a new one. Default is None.
- Raises:
ValueError – If not all chains have the same length.
- Returns:
fig – Figure with plot on it.
- Return type:
matplotlib.figure.Figure
- desilike.statistics.plotting.one_dimensional_profile(samples, param, ax=None, plot=True, plot_kwargs=None, scatter=False, scatter_kwargs=None)
Add 1D profile to axes.
- Parameters:
samples (desilike.Samples) – desilike.Samples instance returned from a profiler.
param (str) – Parameter to plot profile for.
ax (matplotlib.axes.Axes, default=None) – Axes where to add profile. If None, use plt.gca(). Default is None.
plot (bool, optional) – Whether to interpolate and plot the profile. Default is True.
plot_kwargs (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.plot(). Default is None.
scatter (bool, optional) – Whether to plot individual points. Default is False.
scatter_kwargs (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.scatter(). Default is None.
- Raises:
ValueError – If both or neither of the posterior and likelihood are given.
- desilike.statistics.plotting.plotter(f)
Add plotting arguments and check if matplotlib is installed.
- Parameters:
filepath (str, pathlib.Path or None, optional) – If not None, save the figure to that location. Default is None.
show (bool, optional) – If True, show the figure. Default is False.
save_options (dict or None, optional) – Additional options passed to the savefig function of matplotlib. Default is None.
- Raises:
ImportError – If matplotlib is not installed.
- desilike.statistics.plotting.trace(chains, keys=None, colors=None, fontsize=None, plot_options=None, fig=None)
Make trace plot as a function of steps, with a panel for each parameter.
- Parameters:
chains (desilike.Samples or list of desilike.Samples) – List of (or single) desilike.Samples instance(s).
keys (list or None, optional) – Parameters to plot trace for. If None, plot all parameters. Default is None.
colors (str, list, or None, optional) – List of (or single) color(s) for chains. Default is None.
fontsize (int or None, optional) – Label sizes. Default is None.
plot_options (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.plot. Default is None.
fig (matplotlib.figure.Figure or None, optional) – Figure to plot on. If None, create a new one. Default is None.
- Raises:
ValueError – If the provided figure has fewer axes than the chains have keys.
- Returns:
fig – Figure with plot on it.
- Return type:
matplotlib.figure.Figure
- desilike.statistics.plotting.triangle_posterior(samples, params=None, **kwargs)
Create a triangle posterior plot using getdist.
- Parameters:
samples (desilike.Samples or list of desilike.Samples) – List of (or single) desilike.Samples instance(s).
params (list or None, optional) – Parameters to plot posterior for. If None, plot all parameters. Default is None.
**kwargs – Optional parameters for getdist.plots.GetDistPlotter.triangle_plot().
- Raises:
ImportError – If getdist is not installed.
- desilike.statistics.plotting.triangle_profile(samples, params=None, plot=True, plot_kwargs=None, levels=[1.14, 3.0, 4.61], contour_kwargs=None, scatter=False, scatter_kwargs=None, threshold=4.5, fig=None)
Create a triangle profile plot.
- Parameters:
samples (desilike.Samples) – Samples for which to plot the profile.
params (list or None, optional) – Parameters to plot profile for. If None, plot all parameters. Default is None.
plot (bool, optional) – Whether to interpolate and plot the one-dimensional profiles. Default is True.
plot_kwargs (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.plot(). Default is None.
levels (list, optional) – Confidence levels to plot for the two-dimensional profiles, i.e., the values \(\Delta \log \mathcal{P}\) where \(\log \mathcal{P} = \max \log \mathcal{P} - \Delta \log \mathcal{P}\). Default is [1.14, 3.00, 4.61], which corresponds to the 68%, 95%, and 99% credible intervals of a two-dimensional Gaussian.
contour_kwargs (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.contour(). Default is None.
scatter (bool, optional) – Whether to plot individual points. Default is False.
scatter_kwargs (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.scatter(). Default is None.
threshold (float, optional) – Limit the ranges for each parameter to the corresponding intervals for this threshold. Default is 4.5.
fig (matplotlib.figure.Figure or None, optional) – Figure to plot on. If None, create a new one. Default is None.
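The default levels follow from the chi-squared distribution with two degrees of freedom, whose cumulative distribution is \(1 - e^{-x/2}\). A probability mass \(p\) is therefore enclosed at \(\Delta \chi^2 = -2 \ln(1 - p)\), i.e., \(\Delta \log \mathcal{P} = -\ln(1 - p)\). The short check below uses a hypothetical helper name, delta_log_p, purely for illustration.

```python
import math


def delta_log_p(probability):
    """Drop in log P that encloses the given mass of a 2D Gaussian."""
    # Two degrees of freedom: CDF(x) = 1 - exp(-x / 2) with x = Delta chi^2,
    # and Delta log P is half of Delta chi^2.
    return -math.log(1.0 - probability)


levels = [delta_log_p(p) for p in (0.68, 0.95, 0.99)]
rounded = [round(value, 2) for value in levels]
```

Rounded to two decimals this gives 1.14, 3.0, and 4.61; the same values with a negative sign are the offsets from the maximum used by two_dimensional_profile.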
- desilike.statistics.plotting.two_dimensional_profile(samples, params, ax=None, levels=[-4.61, -3.0, -1.14], contour_kwargs=None, scatter=False, scatter_kwargs=None)
Add 2D profile to axes.
- Parameters:
samples (desilike.Samples) – desilike.Samples instance returned from a profiler.
params (tuple of str) – Parameters to plot profile for.
ax (matplotlib.axes.Axes, default=None) – Axes where to add profile. If None, use plt.gca(). Default is None.
levels (list, optional) – Confidence levels to plot, i.e., the values \(z\) where \(\log \mathcal{P} = \max \log \mathcal{P} + z\). Default is [-4.61, -3.00, -1.14], which correspond to the 99%, 95%, and 68% credible intervals for a two-dimensional Gaussian.
contour_kwargs (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.contour(). Default is None.
scatter (bool, optional) – Whether to plot individual points. Default is False.
scatter_kwargs (dict or None, optional) – Optional arguments for matplotlib.axes.Axes.scatter(). Default is None.
- Raises:
ValueError – If both or neither of the posterior and likelihood are given or an incorrect number of parameters is given.