We'll take a look at what t-SNE does with the codewords produced by VGGNet on the XTCAV images.
- Grabbed 4000 codeword2 vectors
- About 2000 lasing, 2000 no-lasing
- Grabbed the t-SNE Python code from https://lvdmaaten.github.io/tsne/
- Another good reference for t-SNE: http://colah.github.io/posts/2014-10-Visualizing-MNIST/
- First eliminated the dead neurons: there were 1023 dead neurons (at least among those 4000 samples; should check against all samples)
- Ran t-SNE with default parameters (perplexity=30.0)
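The dead-neuron step above could look roughly like the sketch below. It assumes a "dead" neuron is a codeword component that is exactly zero for every sample in the batch, which is a guess at the criterion used here. The function name `drop_dead_neurons` and the tiny synthetic array are hypothetical stand-ins for the real 4000 codeword2 vectors.

```python
import numpy as np

def drop_dead_neurons(codewords, tol=0.0):
    """Remove columns (neurons) that never fire in this batch.

    codewords: array of shape (n_samples, n_neurons).
    A neuron counts as dead when |activation| <= tol for every sample
    (assumed definition, not taken from the original notes).
    Returns the filtered array and the boolean mask of surviving columns.
    """
    codewords = np.asarray(codewords)
    alive = np.any(np.abs(codewords) > tol, axis=0)
    return codewords[:, alive], alive

# Synthetic stand-in: 6 samples, 5 neurons, with neurons 1 and 3 forced dead.
rng = np.random.default_rng(0)
demo = rng.random((6, 5)) + 0.1
demo[:, [1, 3]] = 0.0

filtered, alive = drop_dead_neurons(demo)
print("kept shape:", filtered.shape, "dead neurons:", int((~alive).sum()))
```

Keeping the `alive` mask around makes it possible to apply the same column selection to the full data set later, or to check whether neurons that look dead in 4000 samples stay dead on more data.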
We get this image:
It is interesting to see how well separated the two classes are. Two things are odd: a couple of yellow no-lasing samples show up in the blue lasing area, and there is a patch of blue lasing samples sitting far away, inside the no-lasing area.
Things to investigate
- Different choices for perplexity
- More data
- t-SNE with the enPeaksLabels (the 0, 1, 2, 3 values)
- Running t-SNE from different points in the vgg16 net, i.e., try:
  - both codeword1 and codeword2, the 8192 final values
  - just codeword1
  - just the output of the convolutions
- Running t-SNE directly on the input images
- Look up the input images: what do the lasing images in the blue blob that lives with the no-lasing samples have in common? Recall that 'lasing' was filtered: we have to measure an e1 or e2 in the enPeaksLabel. Are those just mislabelings, with no lasing to notice? Or is there some other structure in them?
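One way to pull out the samples worth looking up is to flag points whose neighbors in the 2-D embedding mostly carry the other label. The sketch below is only an illustration: `find_label_outliers`, the neighbor count `k`, and the threshold `min_frac` are hypothetical choices, and the synthetic two-cluster embedding stands in for the real t-SNE output.

```python
import numpy as np

def find_label_outliers(embedding, labels, k=10, min_frac=0.8):
    """Return indices of points whose k nearest embedding neighbors
    mostly (fraction >= min_frac) have a different label.

    embedding: (n_samples, 2) array, e.g. t-SNE output.
    labels:    (n_samples,) array, e.g. 0 = no-lasing, 1 = lasing.
    """
    embedding = np.asarray(embedding, dtype=float)
    labels = np.asarray(labels)
    # Full pairwise squared distances; fine for a few thousand points.
    sq = np.sum(embedding ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * embedding @ embedding.T
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    nearest = np.argsort(d2, axis=1)[:, :k]
    disagree = (labels[nearest] != labels[:, None]).mean(axis=1)
    return np.flatnonzero(disagree >= min_frac)

# Synthetic stand-in: two tight clusters with one planted "wrong" label.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=(0.0, 0.0), scale=0.1, size=(20, 2))
cluster_b = rng.normal(loc=(5.0, 5.0), scale=0.1, size=(20, 2))
emb = np.vstack([cluster_a, cluster_b])
lab = np.array([0] * 20 + [1] * 20)
lab[3] = 1  # a 'lasing' label sitting inside the no-lasing cluster

suspects = find_label_outliers(emb, lab, k=5)
print("samples to inspect:", suspects)
```

The returned indices can be mapped back to the original images and their enPeaksLabel entries. Note that an entire patch of lasing samples would not be flagged by this neighbor test, since its members agree with each other, so the far-away blue blob would still need to be selected by its position in the embedding.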