This page was generated from examples/anchor_text_movie.ipynb.

Anchor explanations for movie sentiment

In this example, we will explain why a certain sentence is classified by a logistic regression as having negative or positive sentiment. The logistic regression is trained on negative and positive movie reviews.

[1]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # suppress some of the transformers/TensorFlow logging output

import spacy
import string
import numpy as np

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from alibi.explainers import AnchorText
from alibi.datasets import fetch_movie_sentiment
from alibi.utils.download import spacy_model
from alibi.utils.lang_model import DistilbertBaseUncased, BertBaseUncased, RobertaBase

Load movie review dataset

The fetch_movie_sentiment function returns a Bunch object containing the features, the targets and the target names for the dataset.

[2]:
movies = fetch_movie_sentiment()
movies.keys()
[2]:
dict_keys(['data', 'target', 'target_names'])
[3]:
data = movies.data
labels = movies.target
target_names = movies.target_names
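
For example, we can take a quick look at the first review and its label:

print(data[0])
print(target_names[labels[0]])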

Define shuffled training, validation and test sets

[4]:
train, test, train_labels, test_labels = train_test_split(data, labels, test_size=.2, random_state=42)
train, val, train_labels, val_labels = train_test_split(train, train_labels, test_size=.1, random_state=42)
train_labels = np.array(train_labels)
test_labels = np.array(test_labels)
val_labels = np.array(val_labels)

Apply CountVectorizer to training set

[5]:
vectorizer = CountVectorizer(min_df=1)
vectorizer.fit(train)
[5]:
CountVectorizer()
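
As a quick check, the fitted vectorizer maps each document to a sparse bag-of-words vector over the training vocabulary:

X_train = vectorizer.transform(train)   # sparse matrix of token counts
print(X_train.shape)                    # (number of training documents, vocabulary size)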

Fit model

[6]:
np.random.seed(0)
clf = LogisticRegression(solver='liblinear')
clf.fit(vectorizer.transform(train), train_labels)
[6]:
LogisticRegression(solver='liblinear')

Define prediction function

[7]:
predict_fn = lambda x: clf.predict(vectorizer.transform(x))
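
AnchorText expects the predictor to map a batch of raw strings to model predictions, which is exactly what predict_fn does. A quick illustrative check:

preds = predict_fn(data[:3])
print(preds)
print([target_names[i] for i in preds])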

Make predictions on the train, validation and test sets

[8]:
preds_train = predict_fn(train)
preds_val = predict_fn(val)
preds_test = predict_fn(test)
print('Train accuracy: %.3f' % accuracy_score(train_labels, preds_train))
print('Validation accuracy: %.3f' % accuracy_score(val_labels, preds_val))
print('Test accuracy: %.3f' % accuracy_score(test_labels, preds_test))
Train accuracy: 0.980
Validation accuracy: 0.754
Test accuracy: 0.759

Load spaCy model

English multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. Assigns word vectors, context-specific token vectors, POS tags, dependency parse and named entities.

[9]:
model = 'en_core_web_md'
spacy_model(model=model)
nlp = spacy.load(model)

Instance to be explained

[10]:
class_names = movies.target_names

# select instance to be explained
text = data[4]
print("* Text: %s" % text)

# compute class prediction
pred = class_names[predict_fn([text])[0]]
alternative =  class_names[1 - predict_fn([text])[0]]
print("* Prediction: %s" % pred)
* Text: a visually flashy but narratively opaque and emotionally vapid exercise in style and mystification .
* Prediction: negative

Initialize anchor text explainer with unknown sampling

  • sampling_strategy='unknown' means we will perturb examples by replacing words with UNKs.

[11]:
explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy='unknown',
    nlp=nlp,
)

Explanation

[12]:
explanation = explainer.explain(text, threshold=0.95)

Let us now take a look at the anchor. The word flashy basically guarantees a negative prediction.

[13]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))
Anchor: flashy
Precision: 0.99

Examples where anchor applies and model predicts negative:
a UNK flashy UNK UNK opaque and emotionally vapid exercise in style UNK mystification .
a UNK flashy UNK UNK UNK and emotionally UNK exercise UNK UNK and UNK UNK
a UNK flashy UNK narratively opaque UNK UNK UNK exercise in style and UNK UNK
UNK visually flashy UNK narratively UNK and emotionally UNK UNK UNK UNK UNK mystification .
UNK UNK flashy UNK UNK opaque and emotionally UNK UNK in UNK and UNK .
a visually flashy but UNK UNK and UNK UNK UNK in style UNK mystification .
a visually flashy but UNK opaque UNK emotionally vapid UNK in UNK and mystification .
a UNK flashy but narratively UNK UNK emotionally vapid exercise in style UNK mystification UNK
a UNK flashy but narratively opaque UNK emotionally vapid exercise in style and mystification .
a visually flashy UNK UNK opaque UNK UNK UNK exercise in UNK UNK UNK .

Examples where anchor applies and model predicts positive:
UNK UNK flashy but narratively UNK and UNK UNK UNK in style and UNK UNK
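
Besides the anchor and its precision, the returned Explanation object also exposes other fields, for instance the anchor's coverage, i.e. the fraction of perturbed instances to which the anchor applies:

print('Coverage: %.2f' % explanation.coverage)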

Initialize anchor text explainer with word similarity sampling

Let’s try this with another perturbation distribution, namely one that replaces words by similar words instead of UNKs.

[14]:
explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy='similarity',   # replace masked words by similar words
    nlp=nlp,                          # spacy object
    sample_proba=0.5,                 # probability of a word being masked and replaced by a similar word
)
[15]:
explanation = explainer.explain(text, threshold=0.95)

The anchor now shows that more words are needed to guarantee the negative prediction:

[16]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))
Anchor: exercise AND emotionally
Precision: 0.96

Examples where anchor applies and model predicts negative:
both oddly flashy but eerily opaque and emotionally silly exercise in gown and mystification .
a visually quirky but narratively opaque and emotionally vapid exercise outside style and obfuscation .
a visually artsy but narratively rigid and emotionally caricature exercise without tone and dysphoria .
some unsurprisingly brash but narratively opaque and emotionally vapid exercise into style and inactivity .
any visually pricy but narratively opaque and emotionally unscientific exercise in ballroom and mystification .
a visually flashy but narratively yellowish and emotionally vapid exercise over style and mystification .
any visually mischievous but oddly opaque and emotionally vapid exercise in style and helplessness .
a visually mischievous but narratively opaque and emotionally laughable exercise in faux and mystification .
that visually pricy but narratively reflective and emotionally banal exercise in style and mystification .
this visually shiny but oddly opaque and emotionally vapid exercise in style and mystification .

Examples where anchor applies and model predicts positive:
a perfectly outlandish but tactically opaque and emotionally irrelevant exercise in style and mystification .
a visually outlandish but uniquely responsive and emotionally unconvincing exercise towards style and mystification .
any elegantly elegance but narratively realistic and emotionally vapid exercise in style and mystification .
an vividly flashy but enormously opaque and emotionally litany exercise under style and procrastination .
each visually flowery but narratively movable and emotionally melodramatic exercise towards approach and gluttony .
a suitably goof but narratively useable and emotionally trivial exercise inside style and mystification .
that visually elegance but narratively sensitive and emotionally indiscriminate exercise within style and hysteria .

We can make the token perturbation distribution sample words that are more similar to the ground truth word via the top_n argument. Smaller values (the default is 100) should result in more coherent sentences that lie closer to the distribution of natural language, which can influence the returned anchor. By setting use_proba to True, the sampling distribution for perturbed tokens becomes proportional to the similarity score between each candidate word and the original word. We can put even more weight on similar words via the temperature argument: lower values of temperature increase the sampling weight of the most similar words. The following example perturbs tokens in the original sentence with probability equal to sample_proba, and samples replacements proportionally to the similarity score between the ground truth word and each of the top_n candidate words.
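
To build some intuition for the effect of temperature, here is a rough sketch (illustrative only, not alibi's exact internals) in which hypothetical similarity scores are turned into sampling probabilities via a temperature-scaled softmax:

similarities = np.array([0.9, 0.7, 0.4])  # hypothetical similarity scores for three candidate words
for temperature in (1.0, 0.2):
    weights = np.exp(similarities / temperature)
    weights /= weights.sum()
    print('temperature=%.1f -> sampling probabilities %s' % (temperature, weights.round(3)))
# lower temperature concentrates probability mass on the most similar candidates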

[17]:
explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy='similarity',  # replace masked words by similar words
    nlp=nlp,                         # spacy object
    use_proba=True,                  # sample according to the similarity distribution
    sample_proba=0.5,                # probability of a word being masked and replaced by a similar word
    top_n=20,                        # consider only the 20 most similar words
    temperature=0.2                  # higher temperature implies more randomness when sampling
)
[18]:
explanation = explainer.explain(text, threshold=0.95)
[19]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))
Anchor: emotionally
Precision: 0.97

Examples where anchor applies and model predicts negative:
this artistically glamorous but narratively opaque and emotionally jumble workout in style and materialism .
an visually gaudy but narratively flexible and emotionally litany exercise through style and mystification .
a graphically flashy but graphically opaque and emotionally vapid hypertrophy of style and mystification .
some classically flashy but artistically opaque and emotionally jumble exercise during style and mystification .
a thematically flashy but graphically opaque and emotionally vapid exercise in style and mystification .
an artistically classy but artistically opaque and emotionally vapid excercise in style and ignorance .
a graphically blocky but narratively opaque and emotionally litany exercise in style and mystification .
a visually chic but narratively transparent and emotionally vapid endurance in charm and mystification .
a visually flashy but stylistically reflective and emotionally vapid exercise into sensibility and mystification .
a visually gaudy but thematically opaque and emotionally incomprehensible p90x in style and mystification .

Examples where anchor applies and model predicts positive:
another visually fancy but narratively soft and emotionally vapid workout inside shape and cowardice .
each visually gaudy but narratively opaque and emotionally boilerplate stress in style and illusory .
that graphically gaudy but graphically translucent and emotionally asinine dieting in style and mystification .

Initialize language model

Because the language model is computationally demanding, we can run it on a GPU. Note that this is optional: the explainer also runs on a machine without a GPU.

[20]:
# the code runs for non-GPU machines too
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="0"

We provide support for three transformer-based language models: DistilbertBaseUncased, BertBaseUncased, and RobertaBase. We initialize the language model as follows:

[21]:
# language_model = RobertaBase()
# language_model = BertBaseUncased()
language_model = DistilbertBaseUncased()
Some layers from the model checkpoint at distilbert-base-uncased were not used when initializing TFDistilBertForMaskedLM: ['activation_13']
- This IS expected if you are initializing TFDistilBertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFDistilBertForMaskedLM were initialized from the model checkpoint at distilbert-base-uncased.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForMaskedLM for predictions without further training.

Initialize anchor text explainer with language_model sampling (parallel filling)

  • sampling_strategy='language_model' means that the words will be sampled according to the output distribution predicted by the language model (see the sketch after this list).

  • filling='parallel' means that only one forward pass is performed. The masked words are sampled independently of one another.
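
To get a feel for what sampling from a masked language model looks like, here is a small sketch using the Hugging Face fill-mask pipeline (illustrative only; AnchorText performs the masking and sampling internally):

from transformers import pipeline

unmasker = pipeline('fill-mask', model='distilbert-base-uncased')  # masked language model
unmasker('a visually [MASK] but narratively opaque and emotionally vapid exercise in style .', top_k=5)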

[22]:
# initialize explainer
explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy="language_model",   # use language model to predict the masked words
    language_model=language_model,        # language model to be used
    filling="parallel",                   # just one pass through the transformer
    sample_proba=0.5,                     # probability of masking a word
    frac_mask_templates=0.1,              # fraction of masking templates (smaller value -> faster, less diverse)
    use_proba=True,                       # sample words according to the predicted distribution (if False, sample uniformly)
    top_n=20,                             # consider only the 20 most likely words
    temperature=1.0,                      # higher temperature implies more randomness when sampling
    stopwords=['and', 'a', 'but', 'in'],  # these words will not be masked
    batch_size_lm=32,                     # language model maximum batch size
)
[23]:
explanation = explainer.explain(text, threshold=0.95)
[24]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))
Anchor: emotionally AND exercise AND flashy AND vapid
Precision: 0.98

Examples where anchor applies and model predicts negative:
a visually flashy but narratively stimulating and emotionally vapid exercise in action and drama.
a fairly flashy but narratively funny and emotionally vapid exercise in imagination and improvisation.
a fairly flashy but narratively driven and emotionally vapid exercise in wit and meditation.
a surprisingly flashy but narratively adventurous and emotionally vapid exercise in logic and politics.
a typically flashy but narratively challenging and emotionally vapid exercise in comedy and meditation.
a seemingly flashy but narratively vivid and emotionally vapid exercise in storytelling and passion.
a highly flashy but narratively detailed and emotionally vapid exercise in action and humor.
a somewhat flashy but narratively accessible and emotionally vapid exercise in spirituality and storytelling.
a relatively flashy but narratively adventurous and emotionally vapid exercise in humor and imagination.
a highly flashy but narratively adventurous and emotionally vapid exercise in storytelling and drama.

Examples where anchor applies and model predicts positive:
a surprisingly flashy but visually entertaining and emotionally vapid exercise in style and mystification.

Initialize anchor text explainer with language_model sampling (autoregressive filling)

  • filling='autoregressive' means that the words are sampled one at a time. Each word to be predicted is thus conditioned on the previously generated words.

  • frac_mask_templates=1 in this mode (setting it to any other value is ignored).

  • This procedure is computationally expensive.

[25]:
# initialize explainer
explainer = AnchorText(
    predictor=predict_fn,
    sampling_strategy="language_model",  # use language model to predict the masked words
    language_model=language_model,       # language model to be used
    filling="autoregressive",            # just one pass through the transformer
    sample_proba=0.5,                    # probability of masking a word
    use_proba=True,                      # use words distribution when sampling (if False sample uniform)
    top_n=20,                            # consider only the 20 most likely words
    stopwords=['and', 'a', 'but', 'in']  # these words will not be masked
)
[26]:
explanation = explainer.explain(text, threshold=0.95, batch_size=10, coverage_samples=100)
[27]:
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation.raw['examples'][-1]['covered_false']]))
Anchor: flashy AND vapid AND exercise AND emotionally
Precision: 1.00

Examples where anchor applies and model predicts negative:
a mostly flashy but narratively structured and emotionally vapid exercise in style and content.
a visually flashy but visually opaque and emotionally vapid exercise in rhythm and action.
a visually flashy but narratively focused and emotionally vapid exercise in meditation and mystification.
a visually flashy but narratively entertaining and emotionally vapid exercise in style and mystification.
a sometimes flashy but narratively opaque and emotionally vapid exercise in style and mystification.
a visually flashy but socially opaque and emotionally vapid exercise in style and spirit.
a fairly flashy but extremely opaque and emotionally vapid exercise in style and mystification.
a visually flashy but narratively opaque and emotionally vapid exercise in meditation and mystification.
a slightly flashy but often spirited and emotionally vapid exercise in style and humour.
a visually flashy but narratively coherent and emotionally vapid exercise in style and imagery.

Examples where anchor applies and model predicts positive: