We'll print the confusion matrix and accuracy.
Much of the new code is boilerplate to format the confusion matrix.
To keep training fast, we truncate the validation set to a small subset of the 500 samples we read.
At present we do not go through the validation set in batches; we pass all of it to the model.predict() function at once.
model.predict() returns the softmax outputs, which look like a probability distribution over the labels 0 and 1.
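As a rough illustration of how the softmax outputs turn into a confusion matrix and an accuracy, here is a minimal NumPy sketch. The arrays `softmax_out` and `validation_Y` are made-up stand-ins for the result of model.predict() and for the validation labels, the labels are assumed to be one-hot encoded, and the helper `confusion_matrix` is a hypothetical name, not necessarily what the tutorial script uses.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, num_classes=2):
    """Count samples per (true, predicted) pair: rows are truth, columns are predictions."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(true_labels, predicted_labels):
        cm[t, p] += 1
    return cm

# Made-up stand-in for model.predict(validation_X): one softmax row per sample.
softmax_out = np.array([[0.9, 0.1],
                        [0.2, 0.8],
                        [0.6, 0.4],
                        [0.3, 0.7]])

# Made-up stand-in for the validation labels (assumed one-hot).
validation_Y = np.array([[1, 0],
                         [0, 1],
                         [0, 1],
                         [0, 1]])

# The predicted label is the class with the largest softmax score.
predicted = np.argmax(softmax_out, axis=1)
truth = np.argmax(validation_Y, axis=1)

cm = confusion_matrix(truth, predicted)
# Correct predictions sit on the diagonal of the confusion matrix.
accuracy = np.trace(cm) / float(cm.sum())

print(cm)
print("accuracy: %.2f" % accuracy)
```

With these four made-up samples, one label-1 sample is predicted as 0, so three of four predictions fall on the diagonal.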
Code
Run
This model should train fairly quickly.
Discuss
What kind of confusion matrix would be ideal, given that not all 'lasing' images lased?
...
Big swings in the accuracy? What might help?
Explore Model
A fun way to explore the model is to embed IPython at some point in the training loop. Make the following change, then run on an interactive node (not in batch), so that training stops in IPython at a chosen step:
for epoch in range(3):
    ex01.shuffle_data(training_X, training_Y)
    next_sample_idx = -minibatch_size
    for batch in range(batches_per_epoch):
        # drop into an interactive IPython shell once, at step 4
        if step==4:
            import IPython
            IPython.embed()
        step += 1
Set up Plotting
Execute:
%pylab
Look at the model:
model.summary()
The output should be:
In [2]: model.summary()
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
convolution2d_1 (Convolution2D)  (None, 2, 363, 284)   32          convolution2d_input_1[0][0]
____________________________________________________________________________________________________
batchnormalization_1 (BatchNormal(None, 2, 363, 284)   4           convolution2d_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation)        (None, 2, 363, 284)   0           batchnormalization_1[0][0]
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 2, 90, 71)     0           activation_1[0][0]
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 6, 90, 71)     192         maxpooling2d_1[0][0]
____________________________________________________________________________________________________
batchnormalization_2 (BatchNormal(None, 6, 90, 71)     12          convolution2d_2[0][0]
____________________________________________________________________________________________________
activation_2 (Activation)        (None, 6, 90, 71)     0           batchnormalization_2[0][0]
____________________________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D)    (None, 6, 22, 17)     0           activation_2[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 2244)          0           maxpooling2d_2[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 40)            89760       flatten_1[0][0]
____________________________________________________________________________________________________
batchnormalization_3 (BatchNormal(None, 40)            80          dense_1[0][0]
____________________________________________________________________________________________________
activation_3 (Activation)        (None, 40)            0           batchnormalization_3[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 10)            400         activation_3[0][0]
____________________________________________________________________________________________________
batchnormalization_4 (BatchNormal(None, 10)            20          dense_2[0][0]
____________________________________________________________________________________________________
activation_4 (Activation)        (None, 10)            0           batchnormalization_4[0][0]
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 2)             22          activation_4[0][0]
____________________________________________________________________________________________________
activation_5 (Activation)        (None, 2)             0           dense_3[0][0]
====================================================================================================
Total params: 90522
____________________________________________________________________________________________________
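The "Param #" column can be checked by hand, which is a useful way to match the summary to the architecture. The sketch below is plain arithmetic, not Keras code. It assumes 4x4 kernels (consistent with the first kernel shape (2,1,4,4) shown further down this page), a single input channel, no bias terms on the convolutions and the first two dense layers, and two listed parameters per channel or unit for batch normalization (presumably the learned scale and shift). These assumptions are inferred from the reported counts rather than taken from the model definition.

```python
def conv_params(in_channels, out_channels, kernel_h, kernel_w, bias=False):
    """Weights in a 2D convolution, optionally with one bias per output channel."""
    n = in_channels * out_channels * kernel_h * kernel_w
    return n + (out_channels if bias else 0)

def dense_params(n_in, n_out, bias=False):
    """Weights in a fully connected layer, optionally with one bias per output."""
    return n_in * n_out + (n_out if bias else 0)

def batchnorm_params(n_features):
    """Two parameters per feature, as listed in the summary (assumed scale and shift)."""
    return 2 * n_features

# Flatten size: 6 channels of 22 x 17 after the second max pooling.
flat_size = 6 * 22 * 17

counts = [
    ("convolution2d_1", conv_params(1, 2, 4, 4)),
    ("batchnormalization_1", batchnorm_params(2)),
    ("convolution2d_2", conv_params(2, 6, 4, 4)),
    ("batchnormalization_2", batchnorm_params(6)),
    ("dense_1", dense_params(flat_size, 40)),
    ("batchnormalization_3", batchnorm_params(40)),
    ("dense_2", dense_params(40, 10)),
    ("batchnormalization_4", batchnorm_params(10)),
    ("dense_3", dense_params(10, 2, bias=True)),
]

total = sum(n for _, n in counts)
for name, n in counts:
    print("%-22s %6d" % (name, n))
print("Total params: %d" % total)
```

Under these assumptions the per-layer numbers and the total agree with the summary above; nearly all of the parameters sit in dense_1, right after the flatten.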
Plot the first image and label from the last batch:
imshow(X[0,0,:,:], interpolation='none', origin='lower')
Y[0,:]
Plot output of internal layers
Following the Keras FAQ at http://keras.io/getting-started/faq/, do:
from keras import backend as K
# build a function that maps the model input to the output of the first layer
conv0 = K.function([model.layers[0].input], [model.layers[0].output])
out0 = conv0([X])[0]
out0.shape
Out[15]: (24, 2, 363, 284)
# this is a batch of 24 samples; the first convolution produces 2 output channels (feature maps)
# take a look at the two channels
imshow(out0[0,0,:,:])
imshow(out0[0,1,:,:])
# take a look at the final output of the convnet layers.
# since we use batch normalization, which behaves differently between training and test,
# we must make the learning phase an argument and pass a value when we use the function:
l7fn = K.function([model.layers[0].input, K.learning_phase()], [model.layers[7].output])
convout = l7fn([X, False])[0]  # False selects test-mode behavior
convout.shape
Out[24]: (24, 6, 22, 17)
# look at one of the output channels:
imshow(convout[0,0,:,:])
# take a look at the weights so far; get_weights() returns a list of NumPy arrays
weights = model.get_weights()
# need to match up with model.summary() to see what is what
# kernel for first convolutional layer
weights[0].shape
(2,1,4,4)
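The comment above about batch normalization behaving differently in training and test can be made concrete with a small NumPy sketch. This is not the Keras implementation. It only illustrates the general idea that training mode normalizes with the statistics of the current batch, while test mode uses running statistics accumulated earlier. The function names and the running values below are made up for illustration.

```python
import numpy as np

def batchnorm_train(x, gamma, beta, eps=1e-5):
    """Training mode: normalize each feature with the mean/variance of this batch."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def batchnorm_test(x, gamma, beta, running_mean, running_var, eps=1e-5):
    """Test mode: normalize with fixed running statistics instead of batch statistics."""
    return gamma * (x - running_mean) / np.sqrt(running_var + eps) + beta

rng = np.random.RandomState(0)
# A made-up batch of 24 samples with 10 features each.
batch = rng.normal(loc=5.0, scale=2.0, size=(24, 10))

gamma = np.ones(10)   # scale
beta = np.zeros(10)   # shift

# Hypothetical running statistics, deliberately different from this batch's own.
running_mean = np.full(10, 4.0)
running_var = np.full(10, 3.0)

train_out = batchnorm_train(batch, gamma, beta)
test_out = batchnorm_test(batch, gamma, beta, running_mean, running_var)

# In training mode every feature is centered on zero for this batch.
print(np.abs(train_out.mean(axis=0)).max())
# The same input gives different outputs in the two modes.
print(np.allclose(train_out, test_out))
```

Because the two modes give different outputs for the same input, a function that reads a layer downstream of batch normalization needs to be told which mode to use, which is the role of the K.learning_phase() argument.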
Press Ctrl+D when done to quit IPython; the model will then keep training.
To revert the changes, do
git checkout ex02_keras_train.py
Exercises
Explore more of the model
Try different hyperparameter selections in the model:
- learning rate
- optimizer momentum
- minibatch size
- validation size
- ReLU vs. other activation functions