This page was generated from examples/anchor_text_movie.ipynb.

Anchor explanations for movie sentiment

In this example, we will explain why a certain sentence is classified by a logistic regression as having negative or positive sentiment. The logistic regression is trained on negative and positive movie reviews.

[2]:
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.ERROR)  # suppress deprecation messages
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import spacy
from alibi.explainers import AnchorText
from alibi.datasets import fetch_movie_sentiment
from alibi.utils.download import spacy_model

Load movie review dataset

The fetch_movie_sentiment function returns a Bunch object containing the features, the targets and the target names for the dataset.

[2]:
movies = fetch_movie_sentiment()
movies.keys()
[2]:
dict_keys(['data', 'target', 'target_names'])
[3]:
data = movies.data
labels = movies.target
target_names = movies.target_names
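
As a quick sanity check (a minimal sketch, not part of the original notebook; the index 0 is arbitrary), we can inspect one review together with its human-readable label:

# Print an arbitrary review and map its integer label via target_names.
print(data[0])
print('Label:', target_names[labels[0]])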

Define shuffled training, validation and test sets

[4]:
train, test, train_labels, test_labels = train_test_split(data, labels, test_size=.2, random_state=42)
train, val, train_labels, val_labels = train_test_split(train, train_labels, test_size=.1, random_state=42)
train_labels = np.array(train_labels)
test_labels = np.array(test_labels)
val_labels = np.array(val_labels)

Apply CountVectorizer to training set

[5]:
vectorizer = CountVectorizer(min_df=1)
vectorizer.fit(train)
[5]:
CountVectorizer(analyzer='word', binary=False, decode_error='strict',
                dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',
                lowercase=True, max_df=1.0, max_features=None, min_df=1,
                ngram_range=(1, 1), preprocessor=None, stop_words=None,
                strip_accents=None, token_pattern='(?u)\\b\\w\\w+\\b',
                tokenizer=None, vocabulary=None)

Fit model

[6]:
np.random.seed(0)
clf = LogisticRegression(solver='liblinear')
clf.fit(vectorizer.transform(train), train_labels)
[6]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='warn', n_jobs=None, penalty='l2',
                   random_state=None, solver='liblinear', tol=0.0001, verbose=0,
                   warm_start=False)

Define prediction function

[7]:
predict_fn = lambda x: clf.predict(vectorizer.transform(x))

Make predictions on train and test sets

[8]:
preds_train = predict_fn(train)
preds_val = predict_fn(val)
preds_test = predict_fn(test)
print('Train accuracy', accuracy_score(train_labels, preds_train))
print('Validation accuracy', accuracy_score(val_labels, preds_val))
print('Test accuracy', accuracy_score(test_labels, preds_test))
Train accuracy 0.9801624284382905
Validation accuracy 0.7544910179640718
Test accuracy 0.7589841878294202

Load spaCy model

English multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. Assigns word vectors, context-specific token vectors, POS tags, dependency parse and named entities.

[9]:
model = 'en_core_web_md'
spacy_model(model=model)
nlp = spacy.load(model)
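
As a quick, hedged check (not part of the original notebook; the token 'exercise' is arbitrary) that the model ships with word vectors, which the similarity-based perturbations later in this example rely on:

# Verify that the spaCy model assigns word vectors by inspecting one token.
token = nlp('exercise')[0]
print(token.has_vector, token.vector.shape)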

Initialize anchor text explainer

[10]:
explainer = AnchorText(nlp, predict_fn)

Explain a prediction

[11]:
class_names = movies.target_names
[12]:
text = data[4]
print(text)
a visually flashy but narratively opaque and emotionally vapid exercise in style and mystification .

Prediction:

[13]:
pred = class_names[predict_fn([text])[0]]
alternative = class_names[1 - predict_fn([text])[0]]
print('Prediction: %s' % pred)
Prediction: negative

Explanation:

[14]:
np.random.seed(0)
explanation = explainer.explain(text, threshold=0.95, use_unk=True)

use_unk=True means we will perturb examples by replacing words with the UNK token. Let us now take a look at the anchor. The word ‘exercise’ basically guarantees a negative prediction.

[16]:
print('Anchor: %s' % (' AND '.join(explanation['names'])))
print('Precision: %.2f' % explanation['precision'])
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation['raw']['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation['raw']['examples'][-1]['covered_false']]))
Anchor: exercise
Precision: 0.99

Examples where anchor applies and model predicts negative:
a UNK flashy UNK UNK UNK and emotionally vapid exercise in UNK and mystification UNK
a visually UNK but UNK opaque UNK UNK vapid exercise UNK style UNK UNK UNK
a UNK flashy UNK UNK UNK UNK UNK UNK exercise UNK UNK UNK mystification .
UNK UNK UNK but narratively opaque UNK UNK vapid exercise UNK style UNK mystification UNK
a UNK flashy UNK UNK UNK UNK UNK UNK exercise UNK style and UNK UNK
a visually flashy UNK narratively opaque and UNK UNK exercise in style and UNK .
a visually flashy UNK narratively opaque UNK emotionally vapid exercise in style UNK UNK UNK
a visually UNK UNK UNK UNK UNK UNK vapid exercise UNK style and mystification UNK
a UNK flashy UNK narratively opaque and UNK UNK exercise UNK UNK and UNK UNK
a visually UNK UNK UNK opaque and UNK vapid exercise in UNK and mystification UNK

Examples where anchor applies and model predicts positive:
UNK visually UNK UNK narratively UNK and UNK UNK exercise UNK style and UNK UNK
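
As a rough sanity check of this claim (a minimal sketch, not part of the original notebook; whitespace tokenization is a simplification of the spaCy tokenization used by the explainer), we can replace every word except 'exercise' with UNK and see what the classifier predicts:

# Mask every token except 'exercise' and classify the resulting sentence.
masked = ' '.join(w if w == 'exercise' else 'UNK' for w in text.split())
print(masked)
print('Prediction:', class_names[predict_fn([masked])[0]])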

Changing the perturbation distribution

Let’s try this with another perturbation distribution, namely one that replaces words with similar words instead of UNKs.
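
To get a feel for what “similar words” means here, a minimal sketch (not part of the original notebook; the candidate words are hand-picked for illustration) using the spaCy vectors loaded above:

# Rank a few hand-picked candidates by vector similarity to 'opaque';
# the explainer replaces tokens with similar words, as described above.
doc = nlp('opaque translucent transparent flashy')
ref = doc[0]
for token in doc[1:]:
    print(ref.text, '->', token.text, round(ref.similarity(token), 3))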

Explanation:

[17]:
np.random.seed(0)
explanation = explainer.explain(text, threshold=0.95, use_unk=False, sample_proba=0.5)

The anchor now shows that more than one word is needed to guarantee the negative prediction:

[18]:
print('Anchor: %s' % (' AND '.join(explanation['names'])))
print('Precision: %.2f' % explanation['precision'])
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation['raw']['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation['raw']['examples'][-1]['covered_false']]))
Anchor: exercise AND flashy
Precision: 0.96

Examples where anchor applies and model predicts negative:
a visually flashy but immensely opaque and intensely counterintuitive exercise in style and mirage .
an accurately flashy but narratively black and emotionally counterintuitive exercise in flair and dogmatic .
a fantastically flashy but inexplicably outer and emotionally concocted exercise before style and mystification .
a amazingly flashy but majorly opaque and intimately monotone exercise in style and mystification .
a visually flashy but narratively opaque and consciously minutiae exercise against flowery and badness .
another uniformly flashy but brilliantly opaque and emotionally vapid exercise in oval and mystification .
some visually flashy but exceedingly responsive and emotionally insufferable exercise amidst style and mystification .
a masterfully flashy but stylistically opaque and emotionally vapid exercise of style and mystification .
some suprisingly flashy but suprisingly detachable and severely vapid exercise than style and foolishness .
the digitally flashy but narratively yellow and emotionally vapid exercise in style and mystification .

Examples where anchor applies and model predicts positive:
a visually flashy but oddly consistent and overtly vapid exercise from style and orthodoxy .
an visually flashy but musically intelligible and horrendously untenable exercise in style and mayhem .
an mechanically flashy but innately vivid and similarly illogical exercise in style and wallow .
an tastefully flashy but technologically opaque and similarly shortsighted exercise in style and despair .
the vividly flashy but physically opaque and somehow pushy exercise through style and mystification .
a wonderfully flashy but lovingly straightforward and fiscally vapid exercise in gown and mystification .

We can make the token perturbation distribution sample words that are more similar to the ground truth word via the top_n argument. Smaller values (default=100) should result in sentences that are more coherent and thus more in the distribution of natural language, which could influence the returned anchor. By setting use_similarity_proba to True, the sampling distribution for perturbed tokens is proportional to the similarity score between the possible perturbations and the original word. We can also put more weight on similar words via the temperature argument: lower values of temperature increase the sampling weight of more similar words. The following example perturbs tokens in the original sentence with probability equal to sample_proba and samples the replacement for each perturbed token proportionally to its similarity to the ground truth word among the top_n candidates.
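
The effect of temperature can be illustrated with a small sketch (a softmax over made-up similarity scores; illustrative only, not necessarily alibi's exact internal computation): lower temperature concentrates the probability mass on the most similar candidate.

import numpy as np

sims = np.array([0.9, 0.7, 0.4])  # made-up similarity scores for three candidate words
for temperature in (1.0, 0.2):
    logits = sims / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    print('temperature =', temperature, '->', probs.round(3))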

[19]:
np.random.seed(0)
explanation = explainer.explain(text, threshold=0.95, use_similarity_proba=True, sample_proba=0.5,
                                use_unk=False, top_n=20, temperature=.2)

print('Anchor: %s' % (' AND '.join(explanation['names'])))
print('Precision: %.2f' % explanation['precision'])
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x for x in explanation['raw']['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x for x in explanation['raw']['examples'][-1]['covered_false']]))
Anchor: exercise AND flashy
Precision: 0.97

Examples where anchor applies and model predicts negative:
every graphically flashy but aesthetically opaque and strangely vapid exercise until taste and mystification .
the aesthetically flashy but narratively opaque and tactically jumble exercise on type and mystification .
another visually flashy but narratively opaque and ultimately vacuous exercise into style and fear .
every visually flashy but narratively opaque and powerfully insufferable exercise in type and immorality .
a suprisingly flashy but aesthetically translucent and emotionally vapid exercise arround style and hopelessness .
another visually flashy but aesthetically translucent and tactically vapid exercise near way and mystification .
a remarkably flashy but visually opaque and emotionally unfun exercise in flair and mystification .
another visually flashy but brilliantly transparent and intensely monotone exercise in strapless and mystification .
this visually flashy but anatomically opaque and emotionally vapid exercise under style and mystification .
a visually flashy but fantastically opaque and tactically vapid exercise inside style and materialism .

Examples where anchor applies and model predicts positive:
a deliciously flashy but deliciously opaque and fiscally evasive exercise in charm and mystification .
another remarkably flashy but narratively opaque and emotionally monotone exercise inside culture and ignorance .