This page was generated from examples/anchor_text_movie.ipynb.

Anchor explanations for movie sentiment

In this example, we explain why a logistic regression model classifies a given sentence as having negative or positive sentiment. The model is trained on negative and positive movie reviews.

[2]:
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import spacy
from alibi.explainers import AnchorText
from alibi.datasets import fetch_movie_sentiment
from alibi.utils.download import spacy_model

Load movie review dataset

The fetch_movie_sentiment function returns a Bunch object containing the features, the targets and the target names for the dataset.

[3]:
movies = fetch_movie_sentiment()
movies.keys()
[3]:
dict_keys(['data', 'target', 'target_names'])
[4]:
data = movies.data
labels = movies.target
target_names = movies.target_names
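
As a quick sanity check (not part of the original notebook), we can inspect the size of the dataset and the class balance; the exact counts depend on the fetched dataset version:

print('Number of reviews: %d' % len(data))
print('Class names:', target_names)
print('Class counts:', np.bincount(labels))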

Define shuffled training, validation and test sets

[5]:
train, test, train_labels, test_labels = train_test_split(data, labels, test_size=.2, random_state=42)
train, val, train_labels, val_labels = train_test_split(train, train_labels, test_size=.1, random_state=42)
train_labels = np.array(train_labels)
test_labels = np.array(test_labels)
val_labels = np.array(val_labels)

Apply CountVectorizer to training set

[6]:
vectorizer = CountVectorizer(min_df=1)
vectorizer.fit(train)
[6]:
CountVectorizer(analyzer='word', binary=False, decode_error='strict',
                dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',
                lowercase=True, max_df=1.0, max_features=None, min_df=1,
                ngram_range=(1, 1), preprocessor=None, stop_words=None,
                strip_accents=None, token_pattern='(?u)\\b\\w\\w+\\b',
                tokenizer=None, vocabulary=None)

Fit model

[7]:
np.random.seed(0)
clf = LogisticRegression(solver='liblinear')
clf.fit(vectorizer.transform(train), train_labels)
[7]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='warn', n_jobs=None, penalty='l2',
                   random_state=None, solver='liblinear', tol=0.0001, verbose=0,
                   warm_start=False)

Define prediction function

[8]:
predict_fn = lambda x: clf.predict(vectorizer.transform(x))
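
The prediction function takes a list of raw strings and returns one class label per input, which is the format used by the anchor explainer below. A minimal check on a made-up review (not part of the original notebook):

print(predict_fn(['a tedious and lifeless film .']))  # one predicted label per input text; target_names maps labels to class names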

Make predictions on train and test sets

[9]:
preds_train = predict_fn(train)
preds_val = predict_fn(val)
preds_test = predict_fn(test)
print('Train accuracy', accuracy_score(train_labels, preds_train))
print('Validation accuracy', accuracy_score(val_labels, preds_val))
print('Test accuracy', accuracy_score(test_labels, preds_test))
Train accuracy 0.9801624284382905
Validation accuracy 0.7544910179640718
Test accuracy 0.7589841878294202

Load spaCy model

English multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. Assigns word vectors, context-specific token vectors, POS tags, dependency parse and named entities.

[10]:
model = 'en_core_web_md'
spacy_model(model=model)
nlp = spacy.load(model)
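
As a quick, optional check (not in the original notebook) that the model ships with word vectors, which the similarity-based perturbations used later rely on:

token = nlp('exercise')[0]
print(token.has_vector)    # True if the token has a pretrained vector
print(token.vector.shape)  # expected to be (300,) for en_core_web_md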

Initialize anchor text explainer

[11]:
explainer = AnchorText(nlp, predict_fn)

Explain a prediction

[12]:
class_names = movies.target_names
[13]:
text = data[4]
print(text)
a visually flashy but narratively opaque and emotionally vapid exercise in style and mystification .

Prediction:

[14]:
pred = class_names[predict_fn([text])[0]]
alternative = class_names[1 - predict_fn([text])[0]]
print('Prediction: %s' % pred)
Prediction: negative

Explanation:

[15]:
np.random.seed(0)
explanation = explainer.explain(text, threshold=0.95, use_unk=True)

use_unk=True means that perturbed examples are generated by replacing words with UNK tokens (a rough illustrative sketch of this perturbation follows the examples below). Let us now take a look at the anchor. The word ‘exercise’ by itself practically guarantees a negative prediction.

[16]:
print('Anchor: %s' % (' AND '.join(explanation['names'])))
print('Precision: %.2f' % explanation['precision'])
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x[0] for x in explanation['raw']['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x[0] for x in explanation['raw']['examples'][-1]['covered_false']]))
Anchor: exercise
Precision: 0.99

Examples where anchor applies and model predicts negative:
a visually UNK UNK UNK opaque and emotionally vapid exercise UNK UNK and mystification UNK
UNK visually UNK UNK narratively opaque UNK emotionally UNK exercise UNK style UNK mystification .
UNK UNK UNK but UNK UNK and emotionally UNK exercise UNK UNK and UNK UNK
UNK UNK flashy but narratively UNK UNK emotionally vapid exercise UNK style and mystification .
UNK UNK UNK UNK narratively opaque and UNK UNK exercise UNK UNK and UNK .
a UNK flashy UNK UNK UNK and emotionally vapid exercise in UNK and mystification UNK
a visually UNK but UNK opaque UNK UNK vapid exercise UNK style and mystification UNK
a visually UNK but UNK opaque UNK UNK vapid exercise UNK UNK and mystification UNK
a UNK flashy but UNK UNK UNK emotionally UNK exercise in UNK and mystification .
a UNK UNK but UNK opaque UNK UNK UNK exercise UNK UNK UNK mystification .

Examples where anchor applies and model predicts positive:
UNK visually UNK UNK narratively UNK and UNK UNK exercise UNK style and UNK UNK
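
For intuition, the UNK perturbation can be sketched roughly as follows: every word outside the anchor is independently replaced by the token ‘UNK’ with some probability. This is only an illustrative approximation, not the sampling code the explainer actually uses internally:

words = text.split()
anchor_words = set(explanation['names'])
np.random.seed(0)
# keep anchor words; replace any other word with 'UNK' with probability 0.5
perturbed = [w if w in anchor_words or np.random.rand() > 0.5 else 'UNK' for w in words]
print(' '.join(perturbed))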

Changing the perturbation distribution

Let’s try this with another perturbation distribution, namely one that replaces words with similar words instead of UNKs.

Explanation:

[17]:
np.random.seed(0)
explanation = explainer.explain(text, threshold=0.95, use_unk=False, sample_proba=0.5)

The anchor now shows that a combination of words is needed to guarantee the negative prediction:

[18]:
print('Anchor: %s' % (' AND '.join(explanation['names'])))
print('Precision: %.2f' % explanation['precision'])
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x[0] for x in explanation['raw']['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x[0] for x in explanation['raw']['examples'][-1]['covered_false']]))
Anchor: flashy AND emotionally
Precision: 0.96

Examples where anchor applies and model predicts negative:
a visually flashy but psychologically preferable and emotionally unintelligible action behind style and laziness .
the anatomically flashy but narratively renderer and emotionally silly exercise in type and madness .
a properly flashy but narratively greenish and emotionally ludicrous blueprint although style and anathema .
a impressively flashy but extraordinarily bright and emotionally unfunny exercise in accent and anathema .
a visually flashy but supremely opaque and emotionally vapid exercise except bossa and empiricism .
a visually flashy but narratively opaque and emotionally vapid exercise in style and mystification .
a visually flashy but similarly readable and emotionally vapid exercise beyond practicality and bravado .
both incredibly flashy but narratively greenish and emotionally vapid meditation than style and falsehood .
every visually flashy but gloriously opaque and emotionally vapid exercise in custom and mystification .
a visually flashy but narratively opaque and emotionally vapid exercise of custom and negation .

Examples where anchor applies and model predicts positive:
each visually flashy but masterfully scalable and emotionally snooty exercise into style and nastiness .
both vividly flashy but musically delicacy and emotionally vapid hypertrophy for style and misfortune .
both beautifully flashy but extremely voxel and emotionally unoriginal strenght in choice and immaturity .
a visually flashy but narratively opaque and emotionally pushy vitality to choice and fear .
a easily flashy but unnaturally readable and emotionally vapid gym in beauty and rationality .
each spectacularly flashy but narratively solid and emotionally vapid exercise in character and ego .
each visually flashy but narratively solid and emotionally inoffensive gymnast about style and mystification .
each technologically flashy but narratively bright and emotionally vapid vitality in style and arrogance .
a vividly flashy but narratively delicate and emotionally vapid risk from style and ego .
a visually flashy but delightfully opaque and emotionally vapid crossfit among style and xenophobia .

We can make the token perturbation distribution sample words that are more similar to the ground truth word via the top_n argument. Smaller values (default=100) should result in sentences that are more coherent and thus closer to the distribution of natural language, which could influence the returned anchor. By setting use_similarity_proba to True, the sampling distribution for perturbed tokens becomes proportional to the similarity score between the possible perturbations and the original word. We can also put more weight on similar words via the temperature argument: lower values of temperature increase the sampling weight of more similar words. The following example perturbs tokens in the original sentence with probability equal to sample_proba. The sampling distribution for the perturbed tokens is proportional to the similarity score between the ground truth word and each of the top_n words.

[19]:
np.random.seed(0)
explanation = explainer.explain(text, threshold=0.95, use_similarity_proba=True, sample_proba=0.5,
                                use_unk=False, top_n=20, temperature=.2)

print('Anchor: %s' % (' AND '.join(explanation['names'])))
print('Precision: %.2f' % explanation['precision'])
print('\nExamples where anchor applies and model predicts %s:' % pred)
print('\n'.join([x[0] for x in explanation['raw']['examples'][-1]['covered_true']]))
print('\nExamples where anchor applies and model predicts %s:' % alternative)
print('\n'.join([x[0] for x in explanation['raw']['examples'][-1]['covered_false']]))
Anchor: flashy
Precision: 0.99

Examples where anchor applies and model predicts negative:
this visually flashy but stylistically thin and emotionally vapid exercise into fashion and delusion .
another visually flashy but narratively opaque and intellectually vapid exercise into style and mystification .
another visually flashy but artistically translucent and mentally vapid exercise in style and mystification .
a stylistically flashy but artistically opaque and psychologically vapid exercise in style and selfishness .
a visually flashy but narratively transparent and emotionally vapid exercise to style and mystification .
another functionally flashy but thematically opaque and emotionally vapid exercise in style and delusion .
a artistically flashy but narratively resilient and emotionally classless exercise in retro and delusion .
a concurrently flashy but narratively translucent and emotionally vapid exercise in style and mystification .
a visually flashy but narratively translucent and emotionally vapid exercise across charm and mystification .
a stylistically flashy but aesthetically transparent and emotionally classless exercise near simplicity and mystification .

Examples where anchor applies and model predicts positive:
each visually flashy but narratively opaque and psychologically asinine diet while style and mystification .
any visually flashy but aesthetically transparent and spiritually vapid excercise throughout sensibility and ignorance .
a visually flashy but spiritually opaque and psychologically vacuous excercise into style and selfishness .
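
As a further usage sketch (not part of the original notebook), the same explainer can be reused on any other review from the dataset; here we simply take the first review, whatever its predicted class happens to be:

text2 = data[0]
pred2 = class_names[predict_fn([text2])[0]]
np.random.seed(0)
explanation2 = explainer.explain(text2, threshold=0.95, use_unk=True)
print('Prediction: %s' % pred2)
print('Anchor: %s' % (' AND '.join(explanation2['names'])))
print('Precision: %.2f' % explanation2['precision'])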