As a noob on this subject, I have found the following two books very helpful for understanding ANNs:
1) Neural Networks for Applied Sciences and Engineering: From Fundamentals to Complex Pattern Recognition by Sandhya Samarasinghe. - A very nice, intuitive and comprehensible book with lots of illustrations.
2) Practical Neural Network Recipes in C++ by Timothy Masters - An old classic with step-by-step code you can follow.
The second link in TFA for the neural network visualization leads to a Heroku 404 page. Is this a victim of Salesforce's recent policy changes taking away free-tier apps?
Suppose a single input of 0.2 passes through a single weight w1 = 0.2, so the output is 0.2 * 0.2 = 0.04. Now let's say the expected output was 0.06 instead of 0.04.
What change would you make to w1?
The answer is that you would make w1 = 0.3, because (0.2 * 0.3) = 0.06
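In code, this toy setup is just one multiplication (a throwaway Python sketch; the name "forward" is mine, not from any library):

    # One input, one weight: the output is simply input * w1.
    def forward(x, w1):
        return x * w1

    print(forward(0.2, 0.2))  # ~0.04, what we currently get
    print(forward(0.2, 0.3))  # ~0.06, what we wanted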
Now consider a slightly more complicated situation: the same input of 0.2 passes through two weights in series, so the output is 0.2 * w1 * w2.
What changes would you make to w1 and w2 to get 0.06 as a result?
You have multiple options here...
- w1 = 0.3 and w2 = 1.0
- w1 = 1.0 and w2 = 0.3
- w1 = 0.5 and w2 = 0.6
- w1 = 0.6 and w2 = 0.5
- many others
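To confirm, here is a quick check, assuming (as above) that the two weights are applied in series, so the output is 0.2 * w1 * w2:

    # Each pair gives ~0.06, because w1 * w2 = 0.3 in every case.
    for w1, w2 in [(0.3, 1.0), (1.0, 0.3), (0.5, 0.6), (0.6, 0.5)]:
        print(w1, w2, 0.2 * w1 * w2)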
As you can see, you have multiple ways of arriving at the same result. How do you go about finding these numbers?
Let's say you try to find w1 and w2 via brute force, i.e. trying every number until you get the right result.
How many attempts would you need? A lot, considering the weights are floating point numbers and there is an enormous number of possible values to try.
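To make that concrete, here is a minimal sketch of the brute-force idea, restricted to a coarse grid of 0.01 steps between 0 and 1. Real weights aren't confined to that range or resolution, so the true search space is vastly larger:

    # Try every (w1, w2) pair on a 101 x 101 grid until 0.2 * w1 * w2 ~ 0.06.
    x, target = 0.2, 0.06
    attempts = 0
    for i in range(101):
        for j in range(101):
            w1, w2 = i / 100, j / 100
            attempts += 1
            if abs(x * w1 * w2 - target) < 1e-9:
                print(f"w1={w1}, w2={w2} works, found after {attempts} attempts")
    # Thousands of attempts for just two weights on a coarse grid; real
    # networks have millions of weights and no convenient grid.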
And for a slightly larger network (that's still tiny), it would simply take too many attempts. Fortunately, you don't have to guess blindly, because after every attempt you know two things:
- how far you are from the result (the difference between the output and expected value)
- the sign of that difference
Therefore you know whether you have to go higher or lower. Now you can create a loop where you make adjustments to the weights, going higher or lower depending on what your difference was... great.
It's a little bit like playing golf. You know how hard you have to hit the ball and in which direction. And as you get closer to the hole, you start hitting softer.
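Here is a minimal sketch of that loop for the single-weight case; the step size of 0.5 is an arbitrary choice (the usual "learning rate"):

    # Nudge w1 up or down in proportion to the error; the adjustments
    # naturally shrink as the output approaches the target ("hitting softer").
    x, target = 0.2, 0.06
    w1 = 0.2
    learning_rate = 0.5
    for _ in range(100):
        error = x * w1 - target   # how far off we are, and in which direction
        w1 -= learning_rate * error
    print(w1, x * w1)             # w1 ends up very close to 0.3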
But how do you do this when you have multiple layers? Well, you have to compute how much each node contributed to the difference from the expected result, using the values of the weights. Once you have that, you can adjust each weight up or down depending on where you have to go, and do this for each layer, propagating the blame for the difference backwards ("backpropagation").
This would be a little bit like playing pool instead of golf, where one ball hits another and then another.
The actual situation is more complicated, since you have multiple inputs and expected outputs, and the same weights have to work for all of them.
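Putting it together, a minimal backpropagation sketch under the same toy assumptions: two weights in series (output = x * w1 * w2) trained on several input/expected-output pairs at once, with the chain rule assigning each weight its share of the blame:

    # Toy dataset: every pair satisfies target = 0.3 * x, so w1 * w2
    # should converge to 0.3.
    data = [(0.2, 0.06), (0.5, 0.15), (1.0, 0.30)]
    w1, w2 = 0.9, 0.9
    learning_rate = 0.1
    for _ in range(2000):
        grad_w1 = grad_w2 = 0.0
        for x, target in data:
            hidden = x * w1           # forward through "layer" 1
            output = hidden * w2      # forward through "layer" 2
            error = output - target
            # Backward pass: blame each weight by how much it shaped the output.
            grad_w2 += error * hidden     # w2 scaled the hidden value directly
            grad_w1 += error * w2 * x     # w1's effect, passed back through w2
        w1 -= learning_rate * grad_w1
        w2 -= learning_rate * grad_w2
    print(w1, w2, w1 * w2)  # w1 * w2 ends up ~0.3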
The spelled-out intro to neural networks and backpropagation: https://www.youtube.com/watch?v=VMj-3S1tku0
Becoming a backprop ninja: https://www.youtube.com/watch?v=q8SA3rM6ckI