debugging keras

Hi,

I really appreciate your efforts and persistence, but a minimal reproducible example must be minimal :white_check_mark:, reproducible :white_check_mark:, and still an example of the issue you're having :sweat_smile: Your current code doesn't have any bug, or at least it runs perfectly on my instance, so it looks like, in the effort to minimize it, you also got rid of whatever bug you were having.

Anyway, to avoid further back and forth, I'll give you a list of generic suggestions for "debugging" deep neural network code in Keras. I hope this is what you were looking for. If not, sorry, I tried my best :slight_smile:

First of all, when trying to fix a model which doesn't give satisfactory results, or which doesn't run at all, one should always fix the random seed to ensure reproducible results. As per the use_session_with_seed() documentation,

ensuring fully reproducible results implies that both GPU execution and CPU parallelism will be turned off. This means that model fitting will be really slow, so it's better to test on a small sample of the training set rather than on the full training set.
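For instance, a minimal sketch (x_train and y_train are placeholders for your own data):

```r
library(keras)
library(tensorflow)

# fix the seed before building/fitting the model; this disables GPU execution
# and CPU parallelism, so only use it while debugging
use_session_with_seed(42)

# debug on a small, fixed subset of the training data to keep runs fast
idx     <- sample(nrow(x_train), 500)
x_small <- x_train[idx, , drop = FALSE]
y_small <- y_train[idx, , drop = FALSE]
```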

Secondly, you would actually need different strategies to fix a neural network which doesn't train (i.e., cannot reduce the training error as much as desired) and one which doesn't generalize (i.e., cannot reduce the test error as much as desired). However, for the sake of brevity, I'll only give generic suggestions which should help in both cases.

Unit tests

In the first case, one should start by writing unit tests for each function used in the code; see e.g. here (TensorFlow code, but the principle applies to Keras too). Loading the dataset, initializing the weights, defining the architecture, fitting the model, etc.: each of these steps should have its own function, and its own unit test(s).
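For example, a minimal sketch with the testthat package (standardize() is a hypothetical data-preparation helper):

```r
library(testthat)

# hypothetical helper: standardize a numeric matrix column-wise
standardize <- function(x) scale(x, center = TRUE, scale = TRUE)

test_that("standardize() centers and scales each column", {
  x <- matrix(rnorm(1000 * 32, mean = 5, sd = 3), nrow = 1000, ncol = 32)
  z <- standardize(x)
  expect_equal(dim(z), dim(x))
  expect_true(all(abs(colMeans(z)) < 1e-8))
  expect_true(all(abs(apply(z, 2, sd) - 1) < 1e-8))
})
```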

Check the data set

```r
data   <- matrix(rnorm(1000 * 32), nrow = 1000, ncol = 32)   # random inputs
labels <- matrix(rnorm(1000 * 10), nrow = 1000, ncol = 10)   # random "labels"
```

In your example, X (the sample matrix) and y (the labels vector) are random, and you don't have a test set (only a training set), thus there's not much to check. In general, however, you may want to take a (small) random sample of the examples your NN classified correctly, and a (small) random sample of the examples it classified incorrectly, and verify that the labels are correct. Sometimes even the best datasets have label noise!

Also, check that the normalization of your data set has been done correctly: again, in your example the training set is by definition normalized (or rather, standardized, in your case), but this is not always true. Try reshuffling the order in which the training samples are shown to the NN, and see if that affects the training error.
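A minimal sketch of the label check (model, x_train and y_train are placeholders; it assumes a softmax output and one-hot labels):

```r
# predicted class vs. true class for a handful of misclassified examples
prob  <- predict(model, x_train)
pred  <- apply(prob, 1, which.max)
true  <- apply(y_train, 1, which.max)
wrong <- which(pred != true)

inspect <- sample(wrong, min(5, length(wrong)))
print(cbind(index = inspect, true = true[inspect], predicted = pred[inspect]))

# reshuffle the order in which samples are shown to the network
perm    <- sample(nrow(x_train))
x_train <- x_train[perm, , drop = FALSE]
y_train <- y_train[perm, , drop = FALSE]
```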

Randomization tests

There are two tests which are very useful to check whether there are bugs in your NN. The first is to train on a single minibatch: the training error should go to 0 very quickly, while the validation error should quickly become very large, since the network is just memorizing a handful of examples. The second is to train on the whole dataset, but with shuffled labels: this time, the training error should slowly reach 0 (if it doesn't, the NN is not able to overfit the training set, which is a bad sign: you probably need a bigger NN), while the test error should sit at the random-chance level, since there's no association anymore between inputs and labels.
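A minimal sketch of both tests (build_model() is a hypothetical helper returning a freshly compiled Keras model; x_train / y_train are placeholders with one-hot labels):

```r
library(keras)

# 1) overfit a single minibatch: the training loss should drop to ~0 very fast
mb <- sample(nrow(x_train), 32)
model_mb <- build_model()
model_mb %>% fit(x_train[mb, , drop = FALSE], y_train[mb, , drop = FALSE],
                 epochs = 200, batch_size = 32)

# 2) train on the full data with shuffled labels: the training loss should
#    still (slowly) decrease, while validation accuracy stays at chance level
y_shuffled <- y_train[sample(nrow(y_train)), , drop = FALSE]
model_shuf <- build_model()
model_shuf %>% fit(x_train, y_shuffled, epochs = 50, validation_split = 0.2)
```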

Check the initialization

It has become increasingly clear that a large part of the success of neural networks is due to good initialization of the network weights (e.g., [1901.09321] Fixup Initialization: Residual Learning Without Normalization). Thus, you must make sure that the initialization of the weights is correct. Here you can see how to check the network weights before training.
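For example, a minimal sketch of inspecting the weights of a model that has been defined and compiled but not yet fitted:

```r
library(keras)

w <- get_weights(model)   # list of weight/bias arrays, one per layer tensor
str(w, max.level = 1)     # check shapes

# rough sanity check: weight matrices should have mean ~0 and a small non-zero
# sd; bias vectors are usually initialized to all zeros
sapply(w, function(x) c(mean = mean(x), sd = sd(x)))

hist(w[[1]], main = "First layer weights before training")
```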

Check individual layers

TensorFlow allows you to visualize the activations of individual layers: this can be incredibly useful for catching buggy units, especially if you're using custom layers. Look here for tutorials on how to use TensorBoard from RStudio (a small sketch follows the links):

https://tensorflow.rstudio.com/tools/tensorboard.html

https://tensorflow.rstudio.com/tensorflow/articles/howto_summaries_and_tensorboard.html
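A minimal sketch of logging a Keras run to TensorBoard and launching the viewer (model, x_train and y_train are placeholders):

```r
library(keras)
library(tensorflow)

# write logs for this run, then inspect them in the TensorBoard dashboard
model %>% fit(
  x_train, y_train,
  epochs = 30,
  validation_split = 0.2,
  callbacks = callback_tensorboard(log_dir = "logs/run_1")
)

tensorboard("logs")   # opens TensorBoard on all runs under logs/
```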

Check the effect of regularization

Sometimes regularization can prevent (or slow down) the training loss from blowing up, thus masking important issues with your code. It's therefore always good practice to switch off regularization (i.e., comment out layer_batch_normalization, set all layer_dropout rates to 0, set the L1/L2 regularization factors to 0, etc.) and verify that your NN is able to overfit the training set.
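A minimal sketch of a regularization-free version of a model, assuming a simple dense architecture (the layer sizes and 32-feature input are placeholders):

```r
library(keras)

model_noreg <- keras_model_sequential() %>%
  layer_dense(units = 64, activation = "relu", input_shape = c(32)) %>%
  # layer_batch_normalization() %>%   # commented out while debugging
  layer_dropout(rate = 0) %>%         # dropout effectively disabled
  layer_dense(units = 10, activation = "softmax")
  # no kernel_regularizer anywhere, i.e. no L1/L2 penalty

model_noreg %>% compile(
  optimizer = "adam",
  loss = "categorical_crossentropy",
  metrics = "accuracy"
)
```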

Perform numerical experiments, and take note of them

If all else fails, then it's time for the most dreaded and most useful NN "debugging" technique: modify the various hyperparameters (learning rate, number of layers, number of units per layer, activation function, etc.) and record the results of each experiment. The tfruns package is your friend here.
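A minimal sketch with tfruns, assuming a training script train.R that declares its hyperparameters with flags():

```r
library(tfruns)

# inside train.R you would have something like:
# FLAGS <- flags(
#   flag_numeric("lr", 0.001),
#   flag_integer("units", 64)
# )
# and use FLAGS$lr / FLAGS$units when building and compiling the model

# run the script over a small grid of hyperparameters; every run is recorded
tuning_run("train.R", flags = list(
  lr    = c(0.01, 0.001, 0.0001),
  units = c(32, 64, 128)
))

ls_runs()   # compare the recorded runs (metrics, flags, timings)
```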

Further reading

Other resources which you may consult if you're stuck training a NN:

(from yours truly :grin:)

Hope this helped!
