alibi.explainers.anchor_base module

class alibi.explainers.anchor_base.AnchorBaseBeam(samplers, **kwargs)

Bases: object

__init__(samplers, **kwargs)

Parameters: samplers (List[Callable]) – Objects that can be called with an (anchor, n_samples) tuple as argument to draw samples.
Return type: None

anchor_beam(delta=0.05, epsilon=0.1, desired_confidence=1.0, beam_size=1, epsilon_stop=0.05, min_samples_start=100, max_anchor_size=None, stop_on_first=False, batch_size=100, coverage_samples=10000, verbose=False, verbose_every=1, **kwargs)

Uses the KL-LUCB algorithm (Kaufmann and Kalyanakrishnan, 2013) together with additional sampling to search for feature sets (anchors) that guarantee the prediction made by a classifier model. The search is greedy if beam_size=1. Otherwise, at each of the max_anchor_size steps, beam_size solutions are explored. By construction, the solutions found have high precision, defined as the expected fraction of times the classifier makes the same prediction when queried with the feature subset combined with arbitrary samples drawn from a noise distribution. The algorithm maximises the coverage of the solution found, that is, the frequency of occurrence of records containing the feature subset in a set of samples.

Parameters

delta (float) – Used to compute beta.
epsilon (float) – Precision bound tolerance for convergence.
desired_confidence (float) – Desired level of precision (tau in paper).
beam_size (int) – Beam width.
epsilon_stop (float) – Confidence bound margin around desired precision.
min_samples_start (int) – Min number of initial samples.
max_anchor_size (Optional[int]) – Max number of features in result.
stop_on_first (bool) – Stop on first valid result found.
coverage_samples (int) – Number of samples from which to build a coverage set.
batch_size (int) – Number of samples used for an arm evaluation.
verbose (bool) – Whether to print intermediate LUCB & anchor selection output.
verbose_every (int) – Print intermediate output every verbose_every steps.

Return type

dict

Returns

Explanation dictionary containing the anchors together with metadata such as coverage, precision and examples.
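The precision and coverage definitions used above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: toy_predict, estimate_precision and estimate_coverage are hypothetical helpers, not part of alibi, and the noise distribution is simply uniform over three integer values.

```python
import numpy as np

def toy_predict(X):
    # Hypothetical classifier: predicts 1 when feature 0 is positive.
    return (X[:, 0] > 0).astype(int)

def estimate_precision(predict, x, anchor, noise):
    # Fix the anchor features to the values in x, keep noise elsewhere,
    # and measure how often the prediction matches the one made on x.
    idx = list(anchor)
    perturbed = noise.copy()
    perturbed[:, idx] = x[idx]
    target = predict(x[None, :])[0]
    return float(np.mean(predict(perturbed) == target))

def estimate_coverage(x, anchor, coverage_data):
    # Fraction of records whose anchor features match those of x.
    idx = list(anchor)
    return float(np.mean(np.all(coverage_data[:, idx] == x[idx], axis=1)))

rng = np.random.default_rng(0)
x = np.array([1, 0, 1])
noise = rng.integers(-1, 2, size=(1000, 3))          # values in {-1, 0, 1}
coverage_data = rng.integers(-1, 2, size=(1000, 3))

precision = estimate_precision(toy_predict, x, (0,), noise)
coverage = estimate_coverage(x, (0,), coverage_data)
```

In this toy setting the anchor (0,) fully determines the prediction, so its estimated precision is 1.0, while its coverage is roughly one third because feature 0 takes each of its three values equally often.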

static compute_beta(n_features, t, delta)

Parameters

n_features (int) – Number of candidate anchors.
t (int) – Iteration number.
delta (float) – Confidence budget; candidate anchors have close to optimal precisions with probability 1 - delta.

Return type

float

Returns

Level used to update upper and lower precision bounds.
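As a rough sketch, the level can be computed with the exploration rate described by Kaufmann and Kalyanakrishnan (2013). The constants alpha = 1.1 and k = 405.5 below are assumptions taken from that paper's experimental setup, and compute_beta_sketch is a hypothetical name; the library's own constants may differ.

```python
import numpy as np

def compute_beta_sketch(n_features: int, t: int, delta: float) -> float:
    alpha = 1.1   # assumed exponent on the iteration number
    k = 405.5     # assumed constant
    temp = np.log(k * n_features * (t ** alpha) / delta)
    # The level grows slowly (log plus log-log) with the iteration number.
    return float(temp + np.log(temp))

beta_first = compute_beta_sketch(n_features=10, t=1, delta=0.05)
beta_later = compute_beta_sketch(n_features=10, t=100, delta=0.05)
```

The level increases with t and with the number of candidates, and decreases as delta grows, so tighter confidence requirements widen the precision bounds.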

static dlow_bernoulli(p, level, n_iter=17)

Updates the lower precision bound for candidate anchors, based on the KL-divergence.

Parameters

p (ndarray) – Precision of candidate anchors.
level (ndarray) – beta divided by the number of samples drawn for each anchor.
n_iter (int) – Number of iterations during lower bound update.

Return type

ndarray

Returns

Updated lower precision bounds array.

draw_samples(anchors, batch_size)

Parameters

anchors (list) – Anchors on which samples are conditioned.
batch_size (int) – The number of samples drawn for each anchor.

Return type

Tuple[tuple, tuple]

Returns

A tuple of positive samples (for which prediction matches desired label)
and a tuple of total number of samples drawn.

static dup_bernoulli(p, level, n_iter=17)

Updates the upper precision bound for candidate anchors, based on the KL-divergence.

Parameters

p (ndarray) – Precision of candidate anchors.
level (ndarray) – beta divided by the number of samples drawn for each anchor.
n_iter (int) – Number of iterations during upper bound update.

Return type

ndarray

Returns

Updated upper precision bounds array.
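The two bound updates (dlow_bernoulli and dup_bernoulli) can be sketched as a bisection on the Bernoulli KL-divergence: the bound is the value furthest from p whose divergence from p does not exceed level. The code below is a minimal sketch under that assumption, with hypothetical function names; the bracket initialisation via Pinsker's inequality and the clipping constant are illustrative choices, not confirmed details of the library.

```python
import numpy as np

def kl_bernoulli(p, q):
    # KL-divergence between Bernoulli(p) and Bernoulli(q), clipped for stability.
    p = np.clip(p, 1e-7, 1 - 1e-7)
    q = np.clip(q, 1e-7, 1 - 1e-7)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def lower_bound_sketch(p, level, n_iter=17):
    # Pinsker's inequality guarantees the bound lies in [p - sqrt(level / 2), p].
    lo = np.maximum(p - np.sqrt(level / 2.0), 0.0)
    hi = np.array(p, dtype=float)
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        too_far = kl_bernoulli(p, mid) > level
        lo = np.where(too_far, mid, lo)
        hi = np.where(too_far, hi, mid)
    return hi  # end of the bracket that satisfies the KL constraint

def upper_bound_sketch(p, level, n_iter=17):
    lo = np.array(p, dtype=float)
    hi = np.minimum(p + np.sqrt(level / 2.0), 1.0)
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        too_far = kl_bernoulli(p, mid) > level
        hi = np.where(too_far, mid, hi)
        lo = np.where(too_far, lo, mid)
    return lo  # end of the bracket that satisfies the KL constraint

p = np.array([0.2, 0.5, 0.9])
level = np.array([0.05, 0.1, 0.02])
lb = lower_bound_sketch(p, level)
ub = upper_bound_sketch(p, level)
```

With n_iter=17 halvings the bracket shrinks by a factor of 2^17, which is why a small fixed iteration count is enough for practical purposes.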

get_anchor_metadata(features, success, batch_size=100)

Given the features contained in an anchor, retrieves metadata such as the precision and coverage of the anchor and its partial anchors, together with examples where the anchor or partial anchors apply and yield either the same prediction as on the instance to be explained (covered_true) or a different prediction (covered_false).

Parameters

features (tuple) – Sorted indices of features in the anchor.
success – Indicates whether an anchor satisfying the precision threshold was found.
batch_size (int) – Number of samples among which positive and negative examples for partial anchors are selected if partial anchors have not already been explicitly sampled.

Return type

dict

Returns

Anchor dictionary with anchor features and additional metadata.

get_init_stats(anchors, coverages=False)

Finds the number of samples already drawn for each anchor in anchors, the number of those samples whose prediction matches that of the instance to be explained and, optionally, the coverage.

Parameters

anchors (list) – Candidate anchors.
coverages – If True, the statistics returned contain the coverage of the specified anchors.

Return type

dict

Returns

Dictionary with lists containing the number of samples used and the number of samples whose predictions equal the desired label.

kllucb(anchors, init_stats, epsilon, delta, batch_size, top_n, verbose=False, verbose_every=1)

Implements the KL-LUCB algorithm (Kaufmann and Kalyanakrishnan, 2013).

Parameters

anchors (list) – A list of anchors from which two critical anchors are selected (see Kaufmann and Kalyanakrishnan, 2013).
init_stats (dict) – Dictionary with lists containing the number of samples used and the number of samples whose predictions equal the desired label.
epsilon (float) – Precision bound tolerance for convergence.
delta (float) – Used to compute beta.
batch_size (int) – Number of samples.
top_n (int) – Min of beam width size or number of candidate anchors.
verbose (bool) – Whether to print intermediate output.
verbose_every (int) – Print intermediate output every verbose_every steps.

Return type

ndarray

Returns

Indices of the best anchor options. The number of indices equals the minimum of the beam width and the number of candidate anchors.

propose_anchors(previous_best)

Parameters: previous_best (list) – List with tuples of anchor candidates.
Return type: list
Returns: List with tuples of candidate anchors with additional metadata.
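One plausible way to build new candidates in a beam search is to extend each of the previous best anchors by a single feature. The sketch below assumes exactly that and nothing more: extend_candidates is a hypothetical helper, anchors are represented as sorted tuples of feature indices, and the additional metadata mentioned above is left out.

```python
def extend_candidates(previous_best, n_features):
    # With no previous anchors, start from all single-feature anchors.
    if not previous_best:
        return [(f,) for f in range(n_features)]
    candidates = set()
    for anchor in previous_best:
        for f in range(n_features):
            if f not in anchor:
                # Sorting makes (0, 1) and (1, 0) the same candidate.
                candidates.add(tuple(sorted(anchor + (f,))))
    return sorted(candidates)

first_round = extend_candidates([], 3)
second_round = extend_candidates([(0,), (1,)], 3)
```

Storing candidates in a set removes duplicates that arise when two different parents are extended to the same feature subset.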

select_critical_arms(means, ub, lb, n_samples, delta, top_n, t)

Determines a set of two anchors by updating the upper bound for anchors with low empirical precision and the lower bound for anchors with high empirical precision.

Parameters

means (ndarray) – Empirical mean anchor precisions.
ub (ndarray) – Upper bound on anchor precisions.
lb (ndarray) – Lower bound on anchor precisions.
n_samples (ndarray) – The number of samples drawn for each candidate anchor.
delta (float) – Confidence budget, candidate anchors have close to optimal precisions with prob. 1 - delta.
top_n (int) – Number of arms to be selected.
t (int) – Iteration number.

Returns

Upper and lower precision bound indices.
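The selection step can be sketched as follows, assuming the bounds have already been updated for the current iteration (so delta, n_samples and t are not needed here) and that top_n is smaller than the number of arms. select_critical_arms_sketch is a hypothetical name: among the arms outside the current top_n it picks the one with the highest upper bound, and among the top_n arms the one with the lowest lower bound.

```python
import numpy as np

def select_critical_arms_sketch(means, ub, lb, top_n):
    order = np.argsort(means)   # ascending by empirical precision
    best = order[-top_n:]       # arms currently believed to be the best
    rest = order[:-top_n]       # all remaining arms
    ut = rest[np.argmax(ub[rest])]   # challenger most likely to belong in the top
    lt = best[np.argmin(lb[best])]   # incumbent most likely to drop out
    return int(ut), int(lt)

means = np.array([0.9, 0.6, 0.8, 0.3])
ub = np.array([0.95, 0.85, 0.9, 0.5])
lb = np.array([0.8, 0.4, 0.75, 0.1])
ut, lt = select_critical_arms_sketch(means, ub, lb, top_n=2)
```

These two arms are the ones whose confidence intervals are most likely to overlap, so sampling them further is the most informative way to separate the top_n anchors from the rest.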

static to_sample(means, ubs, lbs, desired_confidence, epsilon_stop)

Given an array of mean anchor precisions and their upper and lower bounds, determines for which anchors more samples need to be drawn in order to estimate the anchors' precision with desired_confidence and error tolerance.

Parameters

means (ndarray) – Mean precisions (each element represents a different anchor).
ubs (ndarray) – Precisions' upper bounds (each element represents a different anchor).
lbs (ndarray) – Precisions' lower bounds (each element represents a different anchor).
desired_confidence (float) – Desired level of confidence for precision estimation.
epsilon_stop (float) – Tolerance around desired precision.

Returns

Boolean array indicating, for each anchor, whether more samples are to be drawn.
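One rule consistent with this description is sketched below, as an assumption rather than a statement of the library's exact logic: an anchor needs more samples if its mean precision meets the target but its lower bound is still too far below it, or if its mean is below the target but its upper bound still reaches well above it. to_sample_sketch is a hypothetical name.

```python
import numpy as np

def to_sample_sketch(means, ubs, lbs, desired_confidence, epsilon_stop):
    # Looks good, but the lower bound does not yet confirm it.
    above_but_unsure = (means >= desired_confidence) & (lbs < desired_confidence - epsilon_stop)
    # Looks bad, but the upper bound does not yet rule it out.
    below_but_unsure = (means < desired_confidence) & (ubs >= desired_confidence + epsilon_stop)
    return above_but_unsure | below_but_unsure

means = np.array([0.95, 0.95, 0.8, 0.8])
lbs = np.array([0.9, 0.7, 0.6, 0.6])
ubs = np.array([0.99, 0.99, 0.99, 0.85])
needs_more = to_sample_sketch(means, ubs, lbs, desired_confidence=0.9, epsilon_stop=0.05)
```

Here the first anchor is confidently above the target and the last is confidently below it, so only the two middle anchors are flagged for further sampling.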

update_state(covered_true, covered_false, labels, samples, anchor)

Updates the explainer state (see __init__ for full state definition).

Parameters

covered_true (ndarray) – Examples where the anchor applies and the prediction is the same as on the instance to be explained.
covered_false (ndarray) – Examples where the anchor applies and the prediction is different from that on the instance to be explained.
samples (tuple) – A tuple containing discretized data, coverage and the anchor sampled.
labels (ndarray) – An array indicating whether the prediction on the sample matches the label of the instance to be explained.
anchor (tuple) – The anchor to be updated.

Return type

Tuple[int, int]

Returns

A tuple containing the number of sampled instances whose prediction equals the desired label of the instance to be explained, the total number of instances sampled, and the anchor that was sampled.

class alibi.explainers.anchor_base.DistributedAnchorBaseBeam(samplers, **kwargs)

Bases: alibi.explainers.anchor_base.AnchorBaseBeam

draw_samples(anchors, batch_size)

Distributes sampling requests among processes running sampling tasks.

Parameters: See superclass implementation.
Return type: Tuple[ndarray, ndarray]
Returns: Same outputs as superclass but of different types.