alibi.explainers.anchor_base module
-
class alibi.explainers.anchor_base.AnchorBaseBeam(samplers, **kwargs)
Bases: object
-
anchor_beam(delta=0.05, epsilon=0.1, desired_confidence=1.0, beam_size=1, epsilon_stop=0.05, min_samples_start=100, max_anchor_size=None, stop_on_first=False, batch_size=100, coverage_samples=10000, verbose=False, verbose_every=1, **kwargs)
Uses the KL-LUCB algorithm (Kaufmann and Kalyanakrishnan, 2013) together with additional sampling to search for feature sets (anchors) that guarantee the prediction made by a classifier model. The search is greedy if beam_size=1. Otherwise, at each of the max_anchor_size steps, beam_size solutions are explored. By construction, the solutions found have high precision (defined as the expected proportion of times the classifier makes the same prediction when queried with the feature subset combined with arbitrary samples drawn from a noise distribution). The algorithm maximises the coverage of the solution found, that is, the frequency of occurrence of records containing the feature subset in a set of samples.
- Parameters
delta (float) – Used to compute beta.
epsilon (float) – Precision bound tolerance for convergence.
desired_confidence (float) – Desired level of precision (tau in paper).
beam_size (int) – Beam width.
epsilon_stop (float) – Confidence bound margin around desired precision.
min_samples_start (int) – Min number of initial samples.
max_anchor_size (Optional[int]) – Max number of features in result.
stop_on_first (bool) – Stop on first valid result found.
coverage_samples (int) – Number of samples from which to build a coverage set.
batch_size (int) – Number of samples used for an arm evaluation.
verbose (bool) – Whether to print intermediate LUCB & anchor selection output.
verbose_every (int) – Print intermediate output every verbose_every steps.
- Return type
- Returns
Explanation dictionary containing anchors with metadata like coverage and precision
and examples.
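The two quantities that drive the search, precision and coverage, can be illustrated on a toy tabular problem. The sketch below is not part of alibi: the function name `anchor_precision_and_coverage`, the toy classifier and the perturbation scheme (resampling rows from the data and overwriting the anchored features) are assumptions made for illustration only.

```python
import numpy as np

def anchor_precision_and_coverage(anchor, instance, data, predict_fn,
                                  n_samples=1000, seed=0):
    """Estimate precision and coverage of a candidate anchor (hypothetical helper).

    anchor is a tuple of feature indices held fixed at the instance's values.
    """
    rng = np.random.default_rng(seed)
    idx = list(anchor)
    # Coverage: share of records whose anchored features match the instance.
    coverage = np.all(data[:, idx] == instance[idx], axis=1).mean()
    # Noise distribution (assumed): resample rows, then fix the anchored features.
    samples = data[rng.integers(0, len(data), size=n_samples)].copy()
    samples[:, idx] = instance[idx]
    # Precision: share of perturbed samples that keep the instance's prediction.
    target = predict_fn(instance[None, :])[0]
    precision = (predict_fn(samples) == target).mean()
    return precision, coverage

# Toy classifier that only looks at feature 0.
predict_fn = lambda X: (X[:, 0] > 0).astype(int)
data = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 0]])
instance = np.array([1, 0, 1])

prec_good, cov_good = anchor_precision_and_coverage((0,), instance, data, predict_fn)
prec_bad, cov_bad = anchor_precision_and_coverage((1,), instance, data, predict_fn)
```

Fixing feature 0 pins the toy classifier's output, so that anchor has perfect precision. Fixing feature 1 leaves the prediction dependent on the resampled feature 0, so its precision is much lower.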
-
static dlow_bernoulli(p, level, n_iter=17)
Update the lower precision bound for candidate anchors based on the KL divergence.
- Parameters
p (ndarray) – Precision of candidate anchors.
level (ndarray) – beta divided by the number of samples for each result.
n_iter (int) – Number of iterations during lower bound update.
- Return type
ndarray
- Returns
Updated lower precision bounds array.
-
static dup_bernoulli(p, level, n_iter=17)
Update the upper precision bound for candidate anchors based on the KL divergence.
- Parameters
p (ndarray) – Precision of candidate anchors.
level (ndarray) – beta divided by the number of samples for each result.
n_iter (int) – Number of iterations during upper bound update.
- Return type
ndarray
- Returns
Updated upper precision bounds array.
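KL-based confidence bounds of this kind are commonly computed by bisection: starting from the empirical precision p, search for the furthest value q whose Bernoulli KL divergence from p stays within level. The sketch below shows that idea under stated assumptions. The names `kl_bernoulli`, `kl_lower_bound` and `kl_upper_bound` are hypothetical, and the Pinsker-based starting interval is an illustrative choice, not necessarily what the library does.

```python
import numpy as np

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    eps = 1e-7
    p = np.clip(p, eps, 1 - eps)
    q = np.clip(q, eps, 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def kl_upper_bound(p, level, n_iter=17):
    """Approximate the largest q >= p with KL(p, q) <= level by bisection."""
    p, level = np.broadcast_arrays(np.asarray(p, float), np.asarray(level, float))
    lo = p.copy()
    # Pinsker's inequality: KL(p, q) >= 2 (p - q)^2, so the root lies in this interval.
    hi = np.minimum(1.0, p + np.sqrt(level / 2.0))
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        too_far = kl_bernoulli(p, mid) > level
        hi = np.where(too_far, mid, hi)
        lo = np.where(too_far, lo, mid)
    return hi

def kl_lower_bound(p, level, n_iter=17):
    """Approximate the smallest q <= p with KL(p, q) <= level by bisection."""
    p, level = np.broadcast_arrays(np.asarray(p, float), np.asarray(level, float))
    lo = np.maximum(0.0, p - np.sqrt(level / 2.0))
    hi = p.copy()
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        too_far = kl_bernoulli(p, mid) > level
        lo = np.where(too_far, mid, lo)
        hi = np.where(too_far, hi, mid)
    return lo

p = np.array([0.5, 0.9])
level = np.array([0.1, 0.1])
ub = kl_upper_bound(p, level)
lb = kl_lower_bound(p, level)
```

Each iteration halves the search interval, so 17 iterations narrow it by a factor of 2^17. A smaller level (more samples per candidate) pulls both bounds closer to p.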
-
get_anchor_metadata(features, success, batch_size=100)
Given the features contained in a result, retrieves metadata such as the precision and coverage of the result and of its partial anchors, along with examples where the result or partial anchors apply and yield the same prediction as on the instance to be explained (covered_true) or a different prediction (covered_false).
- Parameters
features (tuple) – Sorted indices of features in result.
success – Indicates whether an anchor satisfying precision threshold was met or not.
batch_size (int) – Number of samples among which positive and negative examples for partial anchors are selected if partial anchors have not already been explicitly sampled.
- Return type
- Returns
Anchor dictionary with result features and additional metadata.
-
get_init_stats(anchors, coverages=False)
Finds the number of samples already drawn for each result in anchors, their comparisons with the instance to be explained and, optionally, coverage.
-
kllucb(anchors, init_stats, epsilon, delta, batch_size, top_n, verbose=False, verbose_every=1)
Implements the KL-LUCB algorithm (Kaufmann and Kalyanakrishnan, 2013).
- Parameters
anchors (list) – A list of anchors from which two critical anchors are selected (see Kaufmann and Kalyanakrishnan, 2013).
init_stats (dict) – Dictionary with lists containing the number of samples used and the number of samples whose predictions equal the desired label.
epsilon (float) – Precision bound tolerance for convergence.
delta (float) – Used to compute beta.
batch_size (int) – Number of samples.
top_n (int) – Minimum of the beam width and the number of candidate anchors.
verbose (bool) – Whether to print intermediate output.
verbose_every (int) – Print intermediate output every verbose_every steps.
- Return type
ndarray
- Returns
Indices of the best result options. The number of indices equals the minimum of the beam width and the number of candidate anchors.
-
select_critical_arms(means, ub, lb, n_samples, delta, top_n, t)
Determines a set of two anchors by updating the upper bound for anchors with low empirical precision and the lower bound for anchors with high empirical precision.
- Parameters
means (ndarray) – Empirical mean result precisions.
ub (ndarray) – Upper bound on result precisions.
lb (ndarray) – Lower bound on result precisions.
n_samples (ndarray) – The number of samples drawn for each candidate result.
delta (float) – Confidence budget; candidate anchors have close to optimal precisions with probability 1 - delta.
top_n (int) – Number of arms to be selected.
t (int) – Iteration number.
- Returns
Upper and lower precision bound indices.
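The selection step of LUCB-style algorithms can be sketched as follows. This simplified version assumes the bounds have already been computed (the method documented above also updates them), and the name `pick_critical_arms` is hypothetical. It requires top_n to be at least 1 and smaller than the number of arms.

```python
import numpy as np

def pick_critical_arms(means, ub, lb, top_n):
    """Return (ut, lt): the strongest challenger outside the current top_n arms
    and the weakest arm inside them (hypothetical helper)."""
    order = np.argsort(means)
    top = order[-top_n:]    # arms with the highest empirical precision
    rest = order[:-top_n]   # all remaining arms
    ut = rest[np.argmax(ub[rest])]  # outsider with the highest upper bound
    lt = top[np.argmin(lb[top])]    # top arm with the lowest lower bound
    return ut, lt

means = np.array([0.9, 0.6, 0.8, 0.3])
ub = np.array([0.95, 0.85, 0.90, 0.50])
lb = np.array([0.80, 0.40, 0.75, 0.10])

ut, lt = pick_critical_arms(means, ub, lb, top_n=2)
gap = ub[ut] - lb[lt]  # sampling can stop once this gap falls below epsilon
```

The two returned arms are the ones whose ordering is least certain, so drawing more samples for them is the fastest way to separate the top set from the rest.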
-
static to_sample(means, ubs, lbs, desired_confidence, epsilon_stop)
Given an array of mean result precisions and their upper and lower bounds, determines for which anchors more samples need to be drawn in order to estimate the anchors' precision with desired_confidence and error tolerance.
- Parameters
means (ndarray) – Mean precisions (each element represents a different result).
ubs (ndarray) – Precisions’ upper bounds (each element represents a different result).
lbs (ndarray) – Precisions’ lower bounds (each element represents a different result).
desired_confidence (float) – Desired level of confidence for precision estimation.
epsilon_stop (float) – Tolerance around desired precision.
- Returns
Boolean array indicating whether more samples are to be drawn for that particular result.
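One plausible reading of this rule: a candidate needs more samples while its confidence interval still straddles the threshold by more than the tolerance. The sketch below encodes that reading. The function name `needs_more_samples` is hypothetical and the exact inequalities are an assumption, so the library's own rule may differ in detail.

```python
import numpy as np

def needs_more_samples(means, ubs, lbs, desired_confidence, epsilon_stop):
    """Boolean mask of candidates whose precision estimate is still inconclusive."""
    # Looks good on average, but the lower bound is too far below the threshold.
    above = (means >= desired_confidence) & (lbs < desired_confidence - epsilon_stop)
    # Looks bad on average, but the upper bound still reaches well above the threshold.
    below = (means < desired_confidence) & (ubs >= desired_confidence + epsilon_stop)
    return above | below

means = np.array([0.93, 0.93, 0.80, 0.80])
lbs = np.array([0.88, 0.70, 0.60, 0.60])
ubs = np.array([0.97, 0.97, 0.90, 0.98])

mask = needs_more_samples(means, ubs, lbs, desired_confidence=0.9, epsilon_stop=0.05)
```

Here the first candidate is confidently above the threshold and the third confidently below it, so only the second and fourth are flagged for further sampling.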
-
update_state(covered_true, covered_false, labels, samples, anchor)
Updates the explainer state (see __init__ for full state definition).
- Parameters
covered_true (ndarray) – Examples where the result applies and the prediction is the same as on the instance to be explained.
covered_false (ndarray) – Examples where the result applies and the prediction is different from that on the instance to be explained.
samples (tuple) – A tuple containing discretized data, coverage and the result sampled.
labels (ndarray) – An array indicating whether the prediction on the sample matches the label of the instance to be explained.
anchor (tuple) – The result to be updated.
- Return type
- Returns
A tuple containing the number of instances whose prediction equals the desired label of the observation to be explained, the total number of instances sampled, and the result that was sampled.
-