This page was generated from examples/anchor_text_movie.ipynb.

Anchor explanations for movie sentiment

In this example, we will explain why a certain sentence is classified by a logistic regression as having negative or positive sentiment. The logistic regression is trained on negative and positive movie reviews.

[1]:
import os
import spacy
import string
import numpy as np

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from alibi.explainers import AnchorText
from alibi.datasets import fetch_movie_sentiment
from alibi.utils.download import spacy_model
from alibi.utils.lang_model import DistilbertBaseUncased, BertBaseUncased, RobertaBase

Load movie review dataset

The fetch_movie_sentiment function returns a Bunch object containing the features, the targets and the target names for the dataset.

[2]:
movies = fetch_movie_sentiment()
movies.keys()
[2]:
dict_keys(['data', 'target', 'target_names'])
[3]:
data = movies.data
labels = movies.target
target_names = movies.target_names
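
To get a feel for the data, we can peek at the first review and its label (the data is a list of raw review strings and the labels are 0/1 integers indexed by target_names):

# quick look at one raw review and its label
print(target_names[labels[0]], '--', data[0])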

Define shuffled training, validation and test set

[4]:
train, test, train_labels, test_labels = train_test_split(data, labels, test_size=.2, random_state=42)
train, val, train_labels, val_labels = train_test_split(train, train_labels, test_size=.1, random_state=42)
train_labels = np.array(train_labels)
test_labels = np.array(test_labels)
val_labels = np.array(val_labels)
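
If desired, a quick sanity check on the sizes of the resulting splits:

print('Train: %d, validation: %d, test: %d' % (len(train), len(val), len(test)))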

Apply CountVectorizer to training set

[5]:
vectorizer = CountVectorizer(min_df=1)
vectorizer.fit(train)
[5]:
CountVectorizer()

Fit model

[6]:
np.random.seed(0)
clf = LogisticRegression(solver='liblinear')
clf.fit(vectorizer.transform(train), train_labels)
[6]:
LogisticRegression(solver='liblinear')

Define prediction function

[7]:
predict_fn = lambda x: clf.predict(vectorizer.transform(x))

Make predictions on train and test sets

[8]:
preds_train = predict_fn(train)
preds_val = predict_fn(val)
preds_test = predict_fn(test)
print('Train accuracy: %.3f' % accuracy_score(train_labels, preds_train))
print('Validation accuracy: %.3f' % accuracy_score(val_labels, preds_val))
print('Test accuracy: %.3f' % accuracy_score(test_labels, preds_test))
Train accuracy: 0.980
Validation accuracy: 0.754
Test accuracy: 0.759

Load spaCy model

English multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. Assigns word vectors, context-specific token vectors, POS tags, dependency parse and named entities.

[9]:
model = 'en_core_web_md'
spacy_model(model=model)
nlp = spacy.load(model)

Instance to be explained

[10]:
class_names = movies.target_names

# select instance to be explained
text = data[4]
print("* Text: %s" % text)

# compute class prediction
pred = class_names[predict_fn([text])[0]]
alternative = class_names[1 - predict_fn([text])[0]]
print("* Prediction: %s" % pred)
* Text: a visually flashy but narratively opaque and emotionally vapid exercise in style and mystification .
* Prediction: negative

Initialize anchor text explainer with unknown sampling

  • sampling_strategy='unknown' means we will perturb examples by replacing words with UNKs.

[11]:
explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy='unknown',
    nlp=nlp,
)

Explanation

[12]:
explanation = explainer.explain(text, threshold=0.95)

Let us now take a look at the anchor. The word flashy basically guarantees a negative prediction.

[13]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))
Anchor: flashy
Precision: 0.99

Examples where anchor applies and model predicts negative:
a UNK flashy UNK UNK opaque and emotionally vapid exercise in style UNK mystification .
a UNK flashy UNK UNK UNK and emotionally UNK exercise UNK UNK and UNK UNK
a UNK flashy UNK narratively opaque UNK UNK UNK exercise in style and UNK UNK
UNK visually flashy UNK narratively UNK and emotionally UNK UNK UNK UNK UNK mystification .
UNK UNK flashy UNK UNK opaque and emotionally UNK UNK in UNK and UNK .
a visually flashy but UNK UNK and UNK UNK UNK in style UNK mystification .
a visually flashy but UNK opaque UNK emotionally vapid UNK in UNK and mystification .
a UNK flashy but narratively UNK UNK emotionally vapid exercise in style UNK mystification UNK
a UNK flashy but narratively opaque UNK emotionally vapid exercise in style and mystification .
a visually flashy UNK UNK opaque UNK UNK UNK exercise in UNK UNK UNK .

Examples where anchor applies and model predicts positive:
UNK UNK flashy but narratively UNK and UNK UNK UNK in style and UNK UNK
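
Besides the anchor and its precision, the returned Explanation object carries further metadata; for instance, the anchor's coverage (the fraction of perturbed instances to which it applies) can be accessed in the same way, assuming the standard anchor explanation fields:

print('Coverage: %.2f' % explanation.coverage)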

Initialize anchor text explainer with word similarity sampling

Let’s try this with another perturbation distribution, namely one that replaces words by similar words instead of UNKs.

[14]:
explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy='similarity',   # replace masked words with similar words
    nlp=nlp,                          # spaCy object
    sample_proba=0.5,                 # probability of a word being masked and replaced by a similar word
)
[15]:
explanation = explainer.explain(text, threshold=0.95)

The anchor now shows that we need more words to guarantee the negative prediction:

[16]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))
Anchor: exercise AND emotionally
Precision: 0.95

Examples where anchor applies and model predicts negative:
a visually ballsy but narratively green and emotionally illogical exercise in style and hysteria .
an accurately flashy but godsend opaque and emotionally illogical exercise in flair and superstition .
a fantastically pretentious but amazingly opaque and emotionally cartoonish exercise to style and mystification .
a stunningly oddball but narratively green and emotionally uninteresting exercise in style and mystification .
a visually flashy but narratively black and emotionally ephemeral exercise among waltz and cowardice .
an importantly stylish but narratively opaque and emotionally vapid exercise in temperament and mystification .
any visually ingenious but philosophically opaque and emotionally insufferable exercise midst style and mystification .
a lovingly gaudy but narratively opaque and emotionally vapid exercise arround style and mystification .
any surprisingly mischievous but suprisingly minimal and emotionally vapid exercise since style and uselessness .
the digitally flashy but masterfully opaque and emotionally vapid exercise in style and mystification .

Examples where anchor applies and model predicts positive:
a visually brash but narratively realistic and emotionally fanciful exercise in jazz and mystification .
an lovingly glamorous but narratively expandable and emotionally extravagant exercise in style and lament .
any precisely spendy but tremendously opaque and emotionally unscientific exercise in culture and mystification .
a excellently tasteless but deliciously realistic and emotionally formulaic exercise after style and mystification .

We can make the token perturbation distribution sample words that are more similar to the ground truth word via the top_n argument. Smaller values (default=100) should result in sentences that are more coherent and thus closer to the distribution of natural language, which could influence the returned anchor. By setting use_proba to True, the sampling distribution for perturbed tokens is proportional to the similarity score between the possible perturbations and the original word. We can also put more weight on similar words via the temperature argument: lower values of temperature increase the sampling weight of more similar words. The following example will perturb tokens in the original sentence with probability equal to sample_proba. The sampling distribution for the perturbed tokens is proportional to the similarity score between the ground truth word and each of the top_n words.
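
As a schematic illustration of the temperature mechanism (temperature-scaled sampling in general, not alibi's exact implementation), dividing hypothetical similarity scores by the temperature before normalizing concentrates the probability mass on the most similar words as the temperature decreases:

import numpy as np

similarities = np.array([0.9, 0.7, 0.4, 0.1])  # hypothetical similarity scores for four candidate words

def sampling_weights(scores, temperature):
    logits = scores / temperature
    logits = logits - logits.max()  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

print(sampling_weights(similarities, temperature=1.0))  # relatively flat distribution
print(sampling_weights(similarities, temperature=0.2))  # concentrated on the most similar word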

[17]:
explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy='similarity',  # replace masked words with similar words
    nlp=nlp,                         # spaCy object
    use_proba=True,                  # sample according to the similarity distribution
    sample_proba=0.5,                # probability of a word being masked and replaced by a similar word
    top_n=20,                        # consider only the 20 most similar words
    temperature=0.2                  # lower temperature puts more weight on the most similar words
)
[18]:
explanation = explainer.explain(text, threshold=0.95)
[19]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))
Anchor: emotionally
Precision: 0.95

Examples where anchor applies and model predicts negative:
a visually flashy but narratively opaque and emotionally vapid exercise arround style and mystification .
any visually flashy but culturally translucent and emotionally ludicrous exercise in style and mystification .
a visually flashy but functionally opaque and emotionally vapid isometric near design and mystification .
an visually blocky but stylistically opaque and emotionally vapid exercise in style and immaturity .
a visually snazzy but anatomically opaque and emotionally vacuous cardio in fashion and immorality .
another aesthetically cumbersome but narratively visible and emotionally vapid weightloss arround style and mystification .
a visually gaudy but narratively reflective and emotionally vapid workout in sass and mystification .
another visually flashy but stylistically opaque and emotionally boilerplate training around style and mystification .
a graphically flashy but strikingly preferable and emotionally banal exercise within sensibility and delusion .
a visually flashy but visually visible and emotionally vapid bodybuilding in style and mystification .

Examples where anchor applies and model predicts positive:
a visually gaudy but narratively opaque and emotionally vapid excercise throughout style and delusion .
any visually gaudy but stylistically reflective and emotionally litany exercise inside stylistic and mirage .
a visually gaudy but stylistically opaque and emotionally nonsensical p90x throughout style and mystification .
a visually clunky but visually translucent and emotionally vapid training of style and immaturity .
a visually snazzy but narratively transparent and emotionally litany bodybuilding in style and illusory .
a functionally snazzy but graphically solid and emotionally vapid treadmill in dress and mystification .
a visually snazzy but graphically transparent and emotionally moronic endurance outside style and lunacy .

Initialize language model

Because the language model is computationally demanding, we can run it on a GPU. Note that this is optional; the explainer also runs on a machine without a GPU.

[20]:
# the code runs for non-GPU machines too
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="0"

We provide support for three transformer-based language models: DistilbertBaseUncased, BertBaseUncased, and RobertaBase. We initialize the language model as follows:

[21]:
# language_model = RobertaBase()
# language_model = BertBaseUncased()
language_model = DistilbertBaseUncased()
Some layers from the model checkpoint at distilbert-base-uncased were not used when initializing TFDistilBertForMaskedLM: ['activation_13']
- This IS expected if you are initializing TFDistilBertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFDistilBertForMaskedLM were initialized from the model checkpoint at distilbert-base-uncased.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForMaskedLM for predictions without further training.

Initialize anchor text explainer with language_model sampling (parallel filling)

  • sampling_strategy='language_model' means that the words will be sampled according to the output distribution predicted by the language model.

  • filling='parallel' means that only one forward pass is performed. The masked words are sampled independently of one another.

[22]:
# initialize explainer
explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy="language_model",   # use language model to predict the masked words
    language_model=language_model,        # language model to be used
    filling="parallel",                   # just one pass through the transformer
    sample_proba=0.5,                     # probability of masking a word
    frac_mask_templates=0.1,              # fraction of masking templates (smaller value -> faster, less diverse)
    use_proba=True,                       # sample words according to the predicted distribution (if False, sample uniformly)
    top_n=20,                             # consider only the 20 most likely words
    temperature=1.0,                      # higher temperature implies more randomness when sampling
    stopwords=['and', 'a', 'but', 'in'],  # these words will not be masked
    batch_size_lm=32,                     # language model maximum batch size
)
[23]:
explanation = explainer.explain(text, threshold=0.95)
WARNING:tensorflow:The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
WARNING:tensorflow:The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
[24]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))
Anchor: exercise AND emotionally AND flashy
Precision: 0.96

Examples where anchor applies and model predicts negative:
a visually flashy but critically demanding and emotionally demanding exercise in style and mystification.
a visually flashy but ultimately passionate and emotionally demanding exercise in style and mystification.
a visually flashy but socially creative and emotionally challenging exercise in style and mystification.
a visually flashy but emotionally passionate and emotionally stimulating exercise in style and mystification.
a visually flashy but emotionally emotionally and emotionally motivated exercise in style and mystification.
a visually flashy but socially challenging and emotionally inspired exercise in style and mystification.
a visually flashy but emotionally creative and emotionally intense exercise in style and mystification.
a visually flashy but emotionally conscious and emotionally inspired exercise in style and mystification.
a visually flashy but visually mentally and emotionally vapid exercise in fitness and meditation.
a visually flashy but emotionally spirited and emotionally vapid exercise in swimming and gymnastics.

Examples where anchor applies and model predicts positive:
a visually flashy but politically imaginative and emotionally challenging exercise in style and mystification.
a visually flashy but highly provocative and emotionally challenging exercise in style and mystification.
a visually flashy but narratively engaging and emotionally satisfying exercise in style and mystification.
a visually flashy but narratively challenging and emotionally engaging exercise in style and mystification.
a visually flashy but narratively accurate and emotionally engaging exercise in style and mystification.
a visually flashy but narratively challenging and emotionally satisfying exercise in style and mystification.
a visually flashy but narratively realistic and emotionally satisfying exercise in style and mystification.
a visually flashy but visually opaque and emotionally satisfying exercise in beauty and mystification.
a typically flashy but narratively accurate and emotionally stimulating exercise in style and mystification.
a generally flashy but narratively detailed and emotionally demanding exercise in style and mystification.

Initialize anchor text explainer with language_model sampling (autoregressive filling)

  • filling='autoregressive' means that the words are sampled one at a time. Thus, each word to be predicted is conditioned on the previously generated words.

  • frac_mask_templates is always 1 in this mode (setting it to any other value will be ignored).

  • This procedure is computationally expensive.

[25]:
# initialize explainer
explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy="language_model",  # use language model to predict the masked words
    language_model=language_model,       # language model to be used
    filling="autoregressive",            # just one pass through the transformer
    sample_proba=0.5,                    # probability of masking a word
    use_proba=True,                      # use words distribution when sampling (if False sample uniform)
    top_n=20,                            # consider the fist 20 most likely words
    stopwords=['and', 'a', 'but', 'in']  # those words will not be sampled
)
[26]:
explanation = explainer.explain(text, threshold=0.95, batch_size=10, coverage_samples=100)
[27]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))
Anchor: flashy AND vapid AND exercise AND opaque
Precision: 1.00

Examples where anchor applies and model predicts negative:
a visually flashy but narratively opaque and occasionally vapid exercise in reflection and mystification.
a visually flashy but narratively opaque and sometimes vapid exercise in inquiry and mystification.
a somewhat flashy but somewhat opaque and emotionally vapid exercise in meditation and mystification.
a very flashy but equally opaque and emotionally vapid exercise in style and technique.
a visually flashy but visually opaque and emotionally vapid exercise in style and personality.
a visually flashy but emotionally opaque and emotionally vapid exercise in style and mystification.
a visually flashy but narratively opaque and emotionally vapid exercise in dialogue and storytelling.
a visually flashy but narratively opaque and socially vapid exercise in style and substance.
a somewhat flashy but narratively opaque and frequently vapid exercise in style and substance.
a wholly flashy but narratively opaque and emotionally vapid exercise in experimentation and mystification.

Examples where anchor applies and model predicts positive: