Residual model from scratch with TensorFlow.js [Part 2]
In Part 1, I described how to design a custom network in tfjs. Now it’s time to make it production-ready.
First, let’s take a look at the ResNet topology and try to understand what to do and how to proceed.
These are examples of the ResNet architecture. If you understand this picture and have a clear vision of how to implement it, then you are a pro in ML. In this article I will not be able to tell you anything you don’t already know, so you can stop reading here :)
If you are still reading, you might be asking: where are the residual connections? To answer that, I’ve got another picture, this time of ResNet-18.
You might notice the same numbers here as under the “18-layer” column in the previous table. Moreover, here we see arrows on the right side that represent the residual connections. At the end of the article, I will show what this looks like in code.
Here you see mysterious symbols like “64, /2”. At first I didn’t get what they mean. I was also curious how downsampling happens here: when I started with machine learning back in 2016, we used MaxPool and AvgPool for that, like here:
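The answer to both questions is the same: instead of pooling, it’s suggested to use stride=2 in the first convolution of the residual blocks marked /2, and this is exactly what /2 means: the width and height of the feature map are halved. The 64 is the number of filters in that layer. Cool, huh?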
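To make the /2 arithmetic concrete, here is a minimal sketch in plain JavaScript, assuming 'same' padding (so the output size is the input size divided by the stride, rounded up) and a 56x56 feature map entering the first /2 block. The helper name `sameOutputSize` and the starting size are my own illustration, not part of tfjs. In tfjs itself this would correspond to passing `strides: 2` and `padding: 'same'` to `tf.layers.conv2d`.

```javascript
// Output length along one spatial axis for a convolution with 'same' padding.
// Hypothetical helper for illustration only; it is not a tfjs function.
function sameOutputSize(inputSize, stride) {
  return Math.ceil(inputSize / stride);
}

// Assumed setup: three "/2" blocks in a row, starting from a 56x56 feature map.
const stageStrides = [2, 2, 2];
const sizes = [56];
for (const stride of stageStrides) {
  sizes.push(sameOutputSize(sizes[sizes.length - 1], stride));
}

console.log(sizes.join(' -> ')); // 56 -> 28 -> 14 -> 7
```

Each /2 block does the job a 2x2 pooling layer would have done, but with learned weights.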
OK, now we know which convolutional layers we need. But that’s not enough for a production-ready model. You’ve probably already heard about techniques like batch normalisation, dropout, and ReLU. Here is a picture answering the question of which combinations of them you might want to use in a residual network:
Note that you first apply ReLU and only then sum the matrices. Otherwise, the result of ReLU will be a matrix of ones :)
Just imagine: [1,1,1,1,1,1,1…] and your model consistently gives 50% accuracy. Stable is good, but here it’s like Russian stable, so you probably don’t want it. That’s why, once again, the rule is: ReLU first, residual connection after it.
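The two orderings are easy to compare on plain arrays. This is only a sketch with made-up numbers: `x` stands for the block input (the skip path), `fx` for a hypothetical output of the convolution branch, and `relu` and `add` are tiny helpers written for this example rather than tfjs calls. It only illustrates that the two orderings are not equivalent; it says nothing about accuracy.

```javascript
// Element-wise helpers on plain arrays (illustration only, not tfjs functions).
const relu = (v) => v.map((a) => Math.max(0, a));
const add = (a, b) => a.map((value, i) => value + b[i]);

const x = [-1, 2];    // block input, carried over by the residual connection
const fx = [0.5, -3]; // made-up output of the convolution branch

// ReLU first, residual connection after it: the skip path passes through untouched.
const reluThenAdd = add(relu(fx), x);

// Sum first, ReLU after: the skip path is clipped together with the branch.
const addThenRelu = relu(add(fx, x));

console.log(reluThenAdd); // [ -0.5, 2 ]
console.log(addThenRelu); // [ 0, 0 ]
```

With ReLU applied first, whatever comes through the residual connection reaches the next block unchanged, including negative values.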
And here we put it all together in code: