Batch Normalization

Implementing batch normalization turns out to require fairly advanced TensorFlow programming. We'll go over our current implementation below.

Train/Test Behavior

Typically we want the following behavior during training and testing (the NumPy sketch in the next section makes this concrete):

  • Train:
    • compute and use the batch moments
    • update the exponential moving averages
  • Validation:
    • use the saved exponential moving averages (EMAs); no updates or computation of batch statistics
  • Final/Test:
    • ideally, once the model is trained, do one more pass through the data and replace the exponential moving averages with true averages over the whole dataset (I think most people are lazy here and just use the last EMAs)

Computational Graph

You have to understand the TensorFlow computational graph to some degree. If this were just Python with NumPy arrays instead of TensorFlow tensors, you could do something like this in the code:

...
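As a rough sketch of what that NumPy version could look like (the class shape, names, and momentum value here are illustrative assumptions, not the original snippet):

    import numpy as np

    class BatchNorm:
        """Plain-NumPy sketch of the train/validation behavior described above."""
        def __init__(self, dim, momentum=0.99, eps=1e-5):
            self.gamma, self.beta = np.ones(dim), np.zeros(dim)
            self.running_mean, self.running_var = np.zeros(dim), np.ones(dim)
            self.momentum, self.eps = momentum, eps

        def forward(self, x, training):
            if training:
                # train: compute and use the batch moments ...
                mean, var = x.mean(axis=0), x.var(axis=0)
                # ... and update the exponential moving averages
                m = self.momentum
                self.running_mean = m * self.running_mean + (1 - m) * mean
                self.running_var = m * self.running_var + (1 - m) * var
            else:
                # validation/test: use the saved EMAs, no updates
                mean, var = self.running_mean, self.running_var
            return self.gamma * (x - mean) / np.sqrt(var + self.eps) + self.beta

With TensorFlow we don't get to execute Python like this at training time; instead we build a computational graph up front and then run particular ops through a session, for example: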

  • the training op, to minimize the loss via the optimizer
  • the predict op, to make predictions with a trained model

Boolean Logic and the Computational Graph

We don't get to write Python if/else boolean logic that executes while the training ops run; all we can do is add TensorFlow ops to the computational graph. TensorFlow has a

...

op that one can use instead. Here, one can use a TensorFlow placeholder for the bool, and then feed in True or False when running the training or prediction operations through the session.
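Presumably the op in question is tf.cond, the standard TF 1.x conditional (that identification is my assumption). A minimal sketch of the placeholder pattern:

    import numpy as np
    import tensorflow as tf  # TF 1.x API

    is_training = tf.placeholder(tf.bool, name="is_training")
    x = tf.placeholder(tf.float32, [None, 4])

    batch_mean, batch_var = tf.nn.moments(x, axes=[0])
    running_mean = tf.Variable(tf.zeros([4]), trainable=False)

    # tf.cond adds both branches to the graph; the fed boolean decides
    # at session.run time which branch actually executes.
    mean = tf.cond(is_training,
                   lambda: batch_mean,    # train: use batch statistics
                   lambda: running_mean)  # test: use the stored average

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        data = np.random.randn(8, 4).astype(np.float32)
        print(sess.run(mean, feed_dict={x: data, is_training: True}))
        print(sess.run(mean, feed_dict={x: data, is_training: False}))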

Control Dependencies

One way to do batch normalization is to make the training op depend on, or trigger, the updates to the running mean, while using the batch mean for the computation. Creating the op to update the running_mean as above, but not tying it to the training op, would mean it never gets executed. Following some Stack Overflow threads, I have tried this using TensorFlow's 'control_dependencies', but have not yet got it to work; we can look at the broken code if we like: bn_broke_BatchNormalization.py. Things to note (a sketch of the intended pattern follows this list):

  • I'm trying to emulate the Keras BatchNormalization 'modes', but I actually mixed up modes 1 and 3 (Keras mode 1 is what I called mode 3).
  • The problem is that in test mode, the running mean and std still get updated. Here is the TensorFlow documentation on control dependencies.
  • Note the use of tf.control_dependencies(boolVar, callableA, callableB) in the UseBatchAndUpdateAvg class: it ties the 'assign' ops for the running mean/stddev to an op used during training.
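For reference, the basic pattern being attempted (a minimal sketch under my own naming, not the broken file itself) is to create the training op inside a tf.control_dependencies scope so that running it also triggers the assign op:

    import tensorflow as tf  # TF 1.x API

    x = tf.placeholder(tf.float32, [None, 4])
    batch_mean, batch_var = tf.nn.moments(x, axes=[0])

    running_mean = tf.Variable(tf.zeros([4]), trainable=False)
    momentum = 0.99  # illustrative value
    update_mean = tf.assign(running_mean,
                            momentum * running_mean + (1 - momentum) * batch_mean)

    w = tf.Variable(tf.ones([4]))             # stand-in trainable parameter
    loss = tf.reduce_mean(tf.square(x * w))   # stand-in loss

    # Created inside the scope, train_op triggers update_mean each time it
    # runs; created outside the scope, update_mean would never execute.
    with tf.control_dependencies([update_mean]):
        train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)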

Adding Ops to Training

Another way to do this is

...

If you look at the Stack Overflow posts (links below), there appears to be a better, more automatic way to do some of this: the graph keeps a collection of training-time update ops, and you should be able to add the running mean/std updates to that collection by accessing the default computational graph. Then you can just run the normal training op that minimizes the loss. The computational graph is a sort of global variable that is always in scope. This appears to be the real TensorFlow way to do it; not there yet! (smile)
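The collection those posts describe is presumably tf.GraphKeys.UPDATE_OPS (my reading of them, not verified against this code); the usual pattern looks like:

    import tensorflow as tf  # TF 1.x API

    x = tf.placeholder(tf.float32, [None, 4])
    batch_mean, _ = tf.nn.moments(x, axes=[0])
    running_mean = tf.Variable(tf.zeros([4]), trainable=False)
    update_mean = tf.assign(running_mean,
                            0.99 * running_mean + 0.01 * batch_mean)

    # Register the EMA update in the graph-level UPDATE_OPS collection.
    tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, update_mean)

    w = tf.Variable(tf.ones([4]))             # stand-in trainable parameter
    loss = tf.reduce_mean(tf.square(x * w))   # stand-in loss

    # At training-op creation time, pull in everything registered so far.
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)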

Code

...


...

...

  • use of a new boolean flag placeholder
  • the model class keeps a list of the batch normalization instances
  • the model returns trainOps by querying all the batch normalization instances
  • the trainOps are added to the ops run during training (a sketch of this structure follows)
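A minimal sketch of that structure (class and attribute names here are my own, not necessarily those in the actual code):

    import tensorflow as tf  # TF 1.x API

    class BatchNormLayer:
        """One BN layer; exposes its EMA-update ops as trainOps."""
        def __init__(self, dim, momentum=0.99):
            self.running_mean = tf.Variable(tf.zeros([dim]), trainable=False)
            self.momentum = momentum
            self.train_ops = []

        def __call__(self, x, is_training):
            batch_mean, _ = tf.nn.moments(x, axes=[0])
            m = self.momentum
            self.train_ops.append(tf.assign(
                self.running_mean,
                m * self.running_mean + (1 - m) * batch_mean))
            mean = tf.cond(is_training,
                           lambda: batch_mean,
                           lambda: self.running_mean)
            return x - mean  # (scale/offset omitted for brevity)

    class Model:
        def __init__(self, dim=4):
            self.is_training = tf.placeholder(tf.bool)  # the new boolean flag
            self.x = tf.placeholder(tf.float32, [None, dim])
            self.bn_layers = [BatchNormLayer(dim)]  # model keeps BN instances
            self.output = self.bn_layers[0](self.x, self.is_training)

        def get_train_ops(self):
            # query every batch normalization instance for its update ops
            return [op for bn in self.bn_layers for op in bn.train_ops]

At training time these extra ops get run alongside the usual optimizer step, e.g. sess.run([optimizer_step] + model.get_train_ops(), feed_dict={model.is_training: True, ...}).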

Stack overflow activity:

Saving/Restoring Models in TensorFlow

This code also saves/restores the model like we were doing before. Save/restore is straightforward and reasonably well documented in TensorFlow. With TensorFlow, the steps seem to be (a sketch follows the list):

  • create the session
  • create the op to initialize all the variables
  • create the tf.train.Saver() object; we'll call it saver
    • the saver automatically ties ops to the computational graph to save the variables
  • run the init op
  • call saver.save()
  • For restoring, create the saver between creating the initialize-variables op and running it:
    • create the init op
    • create the saver
    • run the init op
    • call saver.restore()
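A minimal sketch of both directions (the variable and checkpoint path are illustrative):

    import tensorflow as tf  # TF 1.x API

    v = tf.Variable(tf.zeros([4]), name="v")
    init_op = tf.global_variables_initializer()  # op to initialize all variables
    saver = tf.train.Saver()  # ties save/restore ops to the computational graph

    # Saving
    with tf.Session() as sess:
        sess.run(init_op)
        saver.save(sess, "/tmp/model.ckpt")  # illustrative path

    # Restoring: the saver was created between the init op's creation and
    # its run; restore() then overwrites the variables with the saved values.
    with tf.Session() as sess:
        sess.run(init_op)
        saver.restore(sess, "/tmp/model.ckpt")
        print(sess.run(v))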

Stack overflow activity:

Code