IBA.tensorflow_v1¶
As IBA restricts the flow of information by adding noise to an intermediate feature map, we have to modify the existing model. You can add the IBALayer as a layer directly in your model. During training, IBALayer is the identity; the noise is only added later to estimate the relevance values.
For some models, you might not be able to add a layer, for example, when using pretrained keras models. In this case, you can use the IBACopy class. It adds the noise operation to the graph by copying it partially (using tf.import_graph_def under the hood).
If you have existing code for the innvestigate package, the IBACopyInnvestigate class implements the innvestigate API.
For examples, see also the notebook directory.
Table: Overview of the classes. (Task) type of task (i.e. regression, classification, unsupervised). (Layer) requires you to add a layer to the explained model. (Copy) copies the tensorflow graph.

| Class | Task | Layer | Copy | Note |
|---|---|---|---|---|
| IBALayer | Any | ✅ | ❌ | Recommended |
| IBACopy | Any | ❌ | ✅ | Very flexible |
| IBACopyInnvestigate | Classification | ❌ | ✅ | Nice API for classification |
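For orientation, a minimal import sketch of the three entry points (assuming the package is importable as IBA):

from IBA.tensorflow_v1 import IBALayer, IBACopy, IBACopyInnvestigate

# IBALayer: add it as a layer when defining your own model (recommended)
# IBACopy: wrap an existing model whose definition you cannot alter
# IBACopyInnvestigate: drop-in analyzer for code written against innvestigate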
class TFWelfordEstimator(feature_name, graph=None)[source]¶
Bases: IBA.utils.WelfordEstimator

Estimates the mean and standard deviation. For the algorithm, see wikipedia.
- Parameters
feature_name (str) – name of the feature tensor
graph (tf.Graph) – graph which holds the feature tensor. If None, uses the default graph.
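A minimal usage sketch; the feature tensor name and the data loader are hypothetical:

from IBA.tensorflow_v1 import TFWelfordEstimator

# name of the feature tensor in the default graph (hypothetical name)
estimator = TFWelfordEstimator('conv4_block1_out/Relu:0')

# accumulate the running mean and std over a few batches
for imgs, _ in data_generator():  # hypothetical data loader
    estimator.fit({model.input: imgs})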
fit(feed_dict, session=None, run_kwargs={})[source]¶
Estimates the mean and std given the inputs in feed_dict.

Warning: Ensure that your model is in eval mode. If you use keras, call K.set_learning_phase(0).
fit_generator(generator, session=None, progbar=True, run_kwargs={})[source]¶
Estimates the mean and std from the feed_dict generator.

Warning: Ensure that your model is in eval mode. If you use keras, call K.set_learning_phase(0).
state_dict() → dict[source]¶
Returns the estimator's internal state. Can be loaded with load_state_dict().

Example:

state = estimator.state_dict()
with open('estimator_state.pickle', 'wb') as f:
    pickle.dump(state, f)

# load it
estimator = TFWelfordEstimator(feature_name=None)
with open('estimator_state.pickle', 'rb') as f:
    state = pickle.load(f)
estimator.load_state_dict(state)
to_saliency_map(capacity, shape=None, data_format=None)[source]¶
Converts the layer capacity (in nats) to a saliency map (in bits) of the given shape.
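For example, the capacity returned by analyze() can be converted into an image-sized map; the input size 224x224 is illustrative:

from IBA.tensorflow_v1 import to_saliency_map

capacity = iba.analyze({model.input: ex_image, target: ex_target})
# scale the per-feature capacity (nats) to a 224x224 saliency map (bits)
saliency_map = to_saliency_map(capacity, shape=(224, 224))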
get_imagenet_generator(path, target_size=(256, 256), crop_size=(224, 224), batch_size=50, seed=0, preprocess_input=None, **kwargs)[source]¶
Yields (image_batch, targets) from the given imagenet directory.

- Parameters
path (str) – ImageNet directory. Must point to the train or validation directory.
target_size (tuple) – Scale image to this size.
batch_size (int) – Batch size.
seed (int) – Random seed.
preprocess_input (function) – model preprocessing function. Default: keras.applications.resnet50, which is used by most keras models.
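A usage sketch, assuming a local copy of the ImageNet validation split (the path is hypothetical):

from IBA.tensorflow_v1 import get_imagenet_generator

val_gen = get_imagenet_generator('/data/imagenet/validation', batch_size=50)

for image_batch, targets in val_gen:
    ...  # e.g. feed the batch to fit() to estimate the feature statistics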
model_wo_softmax(model: keras.engine.training.Model)[source]¶
Creates a new model w/o the final softmax activation. model must be a keras model.
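For example, to obtain logits from a pretrained classifier, as required by set_classification_loss() (a sketch using keras' ResNet50):

from keras.applications.resnet50 import ResNet50
from IBA.tensorflow_v1 import model_wo_softmax

# returns logits instead of softmax probabilities
logits_model = model_wo_softmax(ResNet50())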
class IBALayer(estimator=None, feature_mean=None, feature_std=None, feature_active=None, batch_size=10, steps=10, beta=10, learning_rate=1, min_std=0.01, smooth_std=1.0, normalize_beta=True, **kwargs)[source]¶
Bases: keras.engine.base_layer.Layer

A keras layer that can be included in your model. This class should work with any model and does not copy the tensorflow graph. Although it is a keras layer, it should be possible to use it from other libraries. If you cannot alter your model definition, you have to copy the graph (use IBACopy or IBACopyInnvestigate).

Example:

model = keras.Sequential()

# add some layers
model.add(Conv2D(64, 3, 3))
model.add(BatchNorm())
model.add(Activation('relu'))
# ... more layers

# add iba in between
iba = IBALayer()
model.add(iba)

# ... more layers
model.add(Conv2D(64, 3, 3))
model.add(Flatten())
model.add(Dense(10))

# set classification cross-entropy loss
target = iba.set_classification_loss(model.output)

# estimate the feature mean and std
for imgs, _ in data_generator():
    iba.fit({model.input: imgs})

# explain target for image
ex_image, ex_target = get_explained_image()
saliency_map = iba.analyze({model.input: ex_image, target: ex_target})
- Hyperparameters:
The information bottleneck attribution has a few hyperparameters. The most important one is beta, which controls the trade-off between model loss and information loss. Generally, beta = 10 works well. Other hyperparameters are the number of optimization steps, the learning_rate of the optimizer, the smoothing of the feature map, and the minimum feature standard deviation. All hyperparameters set in the constructor can be overridden in the analyze() method or via the set_default() method.
- Parameters
estimator (TFWelfordEstimator) – an already fitted estimator.
feature_mean – estimated feature mean. Do not provide both the feature statistics and an estimator.
feature_std – estimated feature std.
feature_active – estimated active neurons. If feature_active[i] = 0, the i-th neuron will be set to zero and no information is added for this neuron.
batch_size (int) – Default number of samples to average the gradient.
steps (int) – Default number of iterations to optimize.
beta (int) – Default value for the trade-off between model loss and information loss.
learning_rate (float) – Default learning rate of the Adam optimizer.
min_std (float) – Default minimum feature standard deviation.
smooth_std (float) – Default smoothing of the lambda parameter. Set to 0 to disable.
normalize_beta (bool) – Default flag to divide beta by the number of feature neurons (default: True).
**kwargs – keras layer kwargs, see keras.layers.Layer
set_default(batch_size=None, steps=None, beta=None, learning_rate=None, min_std=None, smooth_std=None, normalize_beta=None)[source]¶
Updates the default hyperparameter values.
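For example, to use a smaller beta and more optimization steps for all following analyze() calls (the values are illustrative):

iba.set_default(beta=5, steps=20)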
collect(*var_names)[source]¶
Mark *var_names to be collected for the report. See available_report_variables() for all variable names.
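A sketch: since the available variable names depend on the model, we first list them and then collect an illustrative subset:

# list everything that can be collected
names = iba.available_report_variables()
print(names)

# collect a subset (illustrative), or use collect_all() for everything
iba.collect(*list(names)[:2])

capacity = iba.analyze({model.input: ex_image, target: ex_target})
report = iba.get_report()  # the collected variables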
collect_all()[source]¶
Mark all variables to be collected for the report. If all variables are collected, the optimization can slow down.
available_report_variables()[source]¶
Returns all variables that can be collected for get_report().
call(inputs) → tensorflow.python.framework.ops.Tensor[source]¶
Returns the output tensor. You can enable the restriction of the information flow with restrict_flow().
restrict_flow(session=None)[source]¶
Context manager to restrict the flow of the layer. Useful to estimate the model output when noise is added. If the flow restriction is enabled, you can only call the model with a single sample (batch size = 1).

Example:

capacity = iba.analyze({model.input: x})

# computes logits using all information
logits = model.predict(x)

with iba.restrict_flow():
    # computes logits using only a subset of all information
    logits_restricted = model.predict(x)
set_classification_loss(logits, optimizer_cls=<class 'tensorflow.python.training.adam.AdamOptimizer'>) → tensorflow.python.framework.ops.Tensor[source]¶
Creates a cross-entropy loss from the logit tensors. Returns the target tensor.

Example:

iba.set_classification_loss(model.output)

You have to ensure that the final layer of model does not apply a softmax. For keras models, you can remove a softmax activation using model_wo_softmax().
set_model_loss(model_loss, optimizer_cls=<class 'tensorflow.python.training.adam.AdamOptimizer'>)[source]¶
Sets the model loss for the final objective model_loss + beta * capacity_mean. When building the model_loss, ensure you are using the copied graph.

Example:

with iba.copied_session_and_graph_as_default():
    iba.get_copied_outputs()
fit(feed_dict, session=None, run_kwargs={})[source]¶
Estimates the feature mean and std from the given feed_dict.

Warning: Ensure that your model is in eval mode. If you use keras, call K.set_learning_phase(0).

- Parameters
feed_dict – feed_dict with all model inputs.
session – use this session. If None, use the default session.
run_kwargs – additional kwargs to session.run.

Example:

# input is a tensorflow placeholder of your model
input = tf.placeholder(tf.float32, name='input')

X, y = load_data_batch()
iba.fit({input: X})

Where input is a tensorflow placeholder and X an input numpy array.
fit_generator(generator, n_samples=5000, progbar=True, session=None, run_kwargs={})[source]¶
Estimates the feature mean and std from the generator.

Warning: Ensure that your model is in eval mode. If you use keras, call K.set_learning_phase(0).

- Parameters
generator – Yields feed_dicts with all model inputs.
n_samples – Stop after n_samples.
session – use this session. If None, use the default session.
run_kwargs – additional kwargs to session.run.
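A sketch, assuming a generator over feed_dicts built from a hypothetical data loader:

def feed_dicts():
    for imgs, _ in data_generator():  # hypothetical data loader
        yield {model.input: imgs}

# estimate the feature statistics from roughly 5000 samples
iba.fit_generator(feed_dicts(), n_samples=5000)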
analyze(feed_dict, batch_size=None, steps=None, beta=None, learning_rate=None, min_std=None, smooth_std=None, normalize_beta=None, session=None, pass_mask=None, progbar=False) → numpy.ndarray[source]¶
Returns the transmitted information per feature. See to_saliency_map() to convert the intermediate capacities to a visual saliency map.

- Parameters
feed_dict (dict) – TensorFlow feed_dict providing your model inputs.
batch_size (int) – number of samples to average the gradient.
steps (int) – number of iterations to optimize.
beta (int) – trade-off parameter between model loss and information loss.
learning_rate (float) – Learning rate of the Adam optimizer.
min_std (float) – Minimum feature standard deviation.
smooth_std (float) – Smoothing of the lambda. Set to 0 to disable.
normalize_beta (bool) – Divide beta by the number of feature neurons (default: True).
session (tf.Session) – TensorFlow session to run the optimization.
pass_mask (np.array) – same shape as the feature map. pass_mask masks neurons which are always passed to the next layer. No noise is added if pass_mask == 0. For example, this might be useful if a variable-length sequence is zero-padded.
progbar (bool) – Flag to display progressbar.
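All hyperparameters can also be overridden per call; the values below are illustrative:

# stronger compression and more optimization steps for this explanation only
capacity = iba.analyze({model.input: ex_image, target: ex_target},
                       beta=20, steps=30, progbar=True)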
class IBACopy(feature, outputs, estimator=None, feature_mean=None, feature_std=None, feature_active=None, graph=None, session=None, copy_session_config=None, batch_size=10, steps=10, beta=10, learning_rate=1, min_std=0.01, smooth_std=1, normalize_beta=True, **keras_kwargs)[source]¶
Bases: IBA.tensorflow_v1.IBALayer

Injects an IBALayer into an existing model by partially copying the model. IBACopy is useful for pretrained models whose definition you cannot alter. As tensorflow graphs are immutable, this class copies the original graph partially (using tf.import_graph_def).

Warning: Changes to your model after calling IBACopy have no effect on the explanations. You need to call update_variables() to update the variable values. Copying the graph might also require more memory than adding IBALayer to your model directly. We recommend using IBALayer whenever you can add it as a layer to your model.

- Parameters
if you can add it as a layer to your model.- Parameters
feature (tf.tensor or str) – tensor or name for the feature tensor to replace.
output_names – list of tensors or tensor names for the model outputs. Useful to specify your model loss (see
set_model_loss()
).estimator (TFWelfordEstimator) – use this estimator.
feature_mean_std (tuple) – tuple of estimated feature
(mean, std)
.graph – Graph of the
feature
andoutputs
tensor. IfNone
, then the default graph is used.session – TensorFlow session corresponding to the
feature
tensor. IfNone
, the default session is used.copy_session_config – Session config for the newly created session.
batch_size (int) – Default number of samples to average the gradient.
steps (int) – Default number of iterations to optimize.
beta (int) – Default value for trade-off between model loss and information loss.
learning_rate (float) – Default learning rate of the Adam optimizer.
min_std (float) – Default minimum feature standard derivation.
smooth_std (float) – Default smoothing of the lambda parameter. Set to
0
to disable.normalize_beta (bool) – Default flag to devide beta by the nubmer of feature neurons (default:
True
).**keras_kwargs – layer kwargs, see
keras.layers.Layer
.
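A construction sketch for a pretrained keras model; the intermediate layer name is hypothetical:

from keras.applications.resnet50 import ResNet50
from IBA.tensorflow_v1 import IBACopy

model = ResNet50()
# replace the output of an intermediate layer (hypothetical layer name)
feature = model.get_layer('activation_40').output
iba = IBACopy(feature, [model.output])

# estimate the feature statistics, then explain as with IBALayer
for imgs, _ in data_generator():  # hypothetical data loader
    iba.fit({model.input: imgs})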
get_copied_outputs()[source]¶
Returns the copied symbolic model outputs provided in the constructor.
assert_variables_equal()[source]¶
Asserts that all variables in the original graph and the copied graph have the same value.
update_variables()[source]¶
Copies the variable values from the original graph to the new copied graph. Call this function after you have modified your model and want the changes to affect the saliency map.
copied_session_and_graph_as_default()[source]¶
Context manager that sets the copied graph and session as default.
set_classification_loss()[source]¶
Sets a softmax cross-entropy loss. Uses the first outputs tensor as logits.
analyze(feature_feed_dict, copy_feed_dict, batch_size=None, steps=None, beta=None, learning_rate=None, min_std=None, smooth_std=None, normalize_beta=None, session=None, pass_mask=None, progbar=False)[source]¶
Returns the saliency map. This method executes an optimization to remove information while retaining a low model loss.

- Parameters
feature_feed_dict (dict) – TensorFlow feed_dict with all inputs to compute the feature map. Placeholders must come from the original graph.
copy_feed_dict (dict) – TensorFlow feed_dict with all inputs to compute the final model output given the disturbed feature map. Placeholders must correspond to the copied graph.
batch_size (int) – number of samples to average the gradient.
steps (int) – number of iterations to optimize.
beta (int) – trade-off parameter between model loss and information loss.
learning_rate (float) – Learning rate of the Adam optimizer.
min_std (float) – Minimum feature standard deviation.
smooth_std (float) – Smoothing of the lambda.
normalize_beta (bool) – Divide beta by the number of neurons.
session (tf.Session) – TensorFlow session to run the optimization.
pass_mask (np.array) – same shape as the feature map. pass_mask masks neurons which are always passed to the next layer. No noise is added if pass_mask == 0. For example, this might be useful if a variable-length sequence is zero-padded.
progbar (bool) – Flag to display progressbar.
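A sketch of the two feed_dicts; that set_classification_loss() provides a target tensor in the copied graph is assumed here:

# placeholders from the original graph produce the feature map
feature_feed = {model.input: ex_image}

# inputs for the copied graph, e.g. the explained target
copy_feed = {target: ex_target}  # 'target' lives in the copied graph (assumed)

saliency_map = iba.analyze(feature_feed, copy_feed)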
class IBACopyInnvestigate(model, neuron_selection_mode='max_activation', feature_name=None, estimator=None, feature_mean=None, feature_std=None, feature_active=None, batch_size=10, steps=10, beta=10.0, learning_rate=1, min_std=0.01, smooth_std=1, normalize_beta=True, session=None, copy_session_config=None, disable_model_checks=False, **keras_kwargs)[source]¶
Bases: IBA.tensorflow_v1.IBACopy, IBA.tensorflow_v1._InnvestigateAPI

This analyzer implements the innvestigate API. It is handy if you have existing code written for the innvestigate package. The innvestigate API has some limitations: it assumes your model is a keras.Model, and it only works with classification tasks. For more flexibility, see IBACopy.

Warning: Changes to your model after calling IBACopyInnvestigate have no effect on the explanations. You need to call update_variables() to update the variable values. Copying the graph might also require more memory than adding IBALayer to your model directly. We recommend using IBALayer whenever you can add it as a layer to your model.

- Hyperparameters:
The innvestigate API requires that the hyperparameters are set in the constructor. They can be overridden using the set_default() method.
- Parameters
model (keras.Model) – the explained model.
neuron_selection_mode (str) – Mode to select the explained neuron. Must be one of "max_activation", "index", "all".
estimator (TFWelfordEstimator) – feature mean and std estimator.
feature_mean_std (tuple) – tuple of estimated feature (mean, std).
batch_size (int) – Default number of samples to average the gradient.
steps (int) – Default number of iterations to optimize.
beta (int) – Default value for the trade-off between model loss and information loss.
learning_rate (float) – Default learning rate of the Adam optimizer.
min_std (float) – Default minimum feature standard deviation.
smooth_std (float) – Default smoothing of the lambda parameter. Set to 0 to disable.
normalize_beta (bool) – Default flag to divide beta by the number of feature neurons (default: True).
session – TensorFlow session corresponding to the model. If None, the default session is used.
copy_session_config (dict) – Session config for the newly created session.
disable_model_checks – Not used by IBA.
**keras_kwargs – layer kwargs, see keras.layers.Layer.
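A usage sketch in the innvestigate style; the analyze(x) call follows the innvestigate convention and is assumed here:

from keras.applications.resnet50 import ResNet50
from IBA.tensorflow_v1 import IBACopyInnvestigate, model_wo_softmax

# the analyzer expects logits, so strip the final softmax
model = model_wo_softmax(ResNet50())

analyzer = IBACopyInnvestigate(model, neuron_selection_mode='max_activation')
analyzer.fit(X_estimation)          # roughly 5000 samples are recommended
saliency_map = analyzer.analyze(x)  # innvestigate-style call (assumed)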
fit(X, session=None, run_kwargs={})[source]¶
Estimates the feature mean and std from the samples X. Generally, we recommend running the estimation on about 5000 samples.

Warning: Ensure that your model is in eval mode. If you use keras, call K.set_learning_phase(0).
fit_generator(generator, steps_per_epoch=None, epochs=1, verbose=1, session=None, progbar=True)[source]¶
Estimates the feature mean and std from the generator. Generally, we recommend running the estimation on about 5000 samples.

Warning: Ensure that your model is in eval mode. If you use keras, call K.set_learning_phase(0).

- Parameters