Why Do Neural Networks Need the Chain Rule? How do we apply it?

— Originally published at dev.to

Hello, I'm Ganesh. I'm building git-lrc, an AI code reviewer that runs on every commit. It is free, unlimited, and source-available on GitHub. Star git-lrc on GitHub to help more developers discover the project. Do give it a try and share your feedback for improving the product.

In the previous article, we introduced backpropagation and learned that neural networks improve by reducing prediction errors.

We also saw that backpropagation relies on two fundamental ideas:

  1. The Chain Rule
  2. Gradient Descent

But we haven't yet answered an important question:

How do we calculate the weights and biases that decrease the error?

To answer that, let's look at a very small neural network.

A Simple Neural Network

Imagine a neural network, similar to the previous example, with:

  • One input neuron
  • Two hidden neurons
  • One output neuron

Calculating the Last Bias in the Last Layer

Let's assume we already have the weights and biases of the hidden layer, and we only want to find the last bias, b3.

From gradient descent, we can update the last bias b3 using the partial derivative of the loss with respect to b3.

The error is measured using residuals.
Residual = Observed - Predicted

SSR = Sum of (Observed - Predicted)^2

We take 3 samples for training: the starting, middle, and ending values.

Finally, we calculate the SSR over these 3 samples.
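As a minimal sketch of the residual and SSR formulas, the following Python computes both for 3 samples. The observed and predicted values are made-up numbers for illustration, not the training data used in this article.

```python
# Hypothetical values for 3 samples (not the article's training data).
observed = [0.0, 1.0, 0.0]
predicted = [0.5, 0.5, 0.5]


def residuals(observed, predicted):
    """Residual = Observed - Predicted, one value per sample."""
    return [o - p for o, p in zip(observed, predicted)]


def ssr(observed, predicted):
    """SSR = sum of the squared residuals."""
    return sum(r ** 2 for r in residuals(observed, predicted))


print(residuals(observed, predicted))  # [-0.5, 0.5, -0.5]
print(ssr(observed, predicted))        # 0.75
```

A smaller SSR means the predictions sit closer to the observed values, which is what training tries to achieve.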

Use of Chain Rule

Gradient descent tells us how to update b3, but it needs the derivative of SSR with respect to b3. This is where the chain rule comes in.

The predicted value is generated from the weights and biases of the previous layers:

Predicted = Top Layer + Bottom Layer + Bias (b3)
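To make this concrete, here is a small sketch of how the prediction could be computed. The softplus activation and every weight and bias value (w1, b1, w2, b2, w3, w4) are assumptions chosen only for illustration; the article does not specify them.

```python
import math


def softplus(x):
    """Assumed activation function: log(1 + e^x)."""
    return math.log1p(math.exp(x))


# Hypothetical, already-known parameters (not from the article).
w1, b1 = 2.0, -1.0   # input -> top hidden neuron
w2, b2 = -3.0, 0.5   # input -> bottom hidden neuron
w3, w4 = -1.5, 2.0   # hidden neurons -> output


def predict(x, b3):
    top = softplus(x * w1 + b1) * w3      # "Top Layer" term
    bottom = softplus(x * w2 + b2) * w4   # "Bottom Layer" term
    return top + bottom + b3              # Predicted


# Changing b3 shifts the prediction by the same amount,
# because top and bottom do not depend on b3.
shift = predict(0.5, b3=1.0) - predict(0.5, b3=0.0)
print(round(shift, 6))  # 1.0
```

Whatever the hidden layer computes, b3 is simply added at the end, so a change in b3 moves the prediction one-for-one.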

Using the chain rule, we can write the derivative of SSR with respect to b3 as:

dssr/db3 = dssr/dpredicted * dpredicted/db3

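For a single sample, SSR = (Observed - Predicted)^2, so:

dssr/dpredicted = d/dpredicted (Observed - Predicted)^2

Predicted is not a constant here. It is the variable we are differentiating with respect to, so we apply the chain rule to the inner term:

dssr/dpredicted = 2*(Observed - Predicted) * d(Observed - Predicted)/dpredicted

dssr/dpredicted = 2*(Observed - Predicted)*(-1)
dssr/dpredicted = -2*(Observed - Predicted)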

For dpredicted/db3:

Predicted = Top Layer + Bottom Layer + Bias (b3)
Both Top Layer and Bottom Layer are constant for this calculation, so:
dpredicted/db3 = 1

Finally, dssr/db3 = -2*(Observed - Predicted) * 1
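One way to sanity-check this result is to compare it with a numerical estimate of the slope. The sketch below does that for a single sample; the observed value and the Top Layer and Bottom Layer outputs are hypothetical numbers, and the function names are illustrative.

```python
def squared_residual(observed, top, bottom, b3):
    predicted = top + bottom + b3
    return (observed - predicted) ** 2


def slope_chain_rule(observed, top, bottom, b3):
    predicted = top + bottom + b3
    d_ssr_d_predicted = -2 * (observed - predicted)
    d_predicted_d_b3 = 1
    return d_ssr_d_predicted * d_predicted_d_b3


def slope_numeric(observed, top, bottom, b3, eps=1e-6):
    # Central difference: nudge b3 slightly and measure the change.
    up = squared_residual(observed, top, bottom, b3 + eps)
    down = squared_residual(observed, top, bottom, b3 - eps)
    return (up - down) / (2 * eps)


# Hypothetical single sample (not from the article).
sample = dict(observed=1.0, top=0.3, bottom=-0.1, b3=0.0)
print(slope_chain_rule(**sample))
print(slope_numeric(**sample))
```

For this sample the residual is about 0.8, so both values should come out close to -1.6.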

Slope Calculation and Learning

Now we have 3 predicted values, one for each of the 3 samples, so we sum the derivative over all of them:

dssr/db3 = Σ(-2*(Observed-Predicted))

dssr/db3 = -2 * [(Observed1 - Predicted1) * 1 + (Observed2 - Predicted2) * 1 + (Observed3 - Predicted3) * 1]

dssr/db3 = -2 * [(Residual1) + (Residual2) + (Residual3)]

dssr/db3 = -2 * (ResidualSum)
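In code, the slope is just -2 times the sum of the residuals. The sketch below uses hypothetical observed and predicted values, so it will not reproduce the article's numbers.

```python
def slope_b3(observed, predicted):
    """dssr/db3 = -2 * (sum of residuals)."""
    residual_sum = sum(o - p for o, p in zip(observed, predicted))
    return -2 * residual_sum


# Hypothetical values for 3 samples (not the article's training data).
observed = [1.0, 2.0, 3.0]
predicted = [0.5, 1.5, 2.0]

print(slope_b3(observed, predicted))  # -4.0
```

A negative slope means the predictions are too low overall, so increasing b3 will reduce the SSR.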

For our training data, I got slope = -15.7

step size = slope x learning rate

step size = -15.7 x 0.1 = -1.57

Gradient descent moves against the slope, so we subtract the step size:

new b3 = old b3 - step size

new b3 = 0 - (-1.57) = 1.57

Then, recalculating the SSR with the new b3, we get a new slope.

slope = -6.26

step size = -6.26 x 0.1 = -0.626

new b3 = 1.57 - (-0.626) = 2.196

Similarly, we repeat this calculation multiple times until the step size is close to 0.
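The whole procedure can be sketched as a short loop. This version uses the standard gradient descent update, which subtracts the step size from the old value. The learning rate (0.1) and the starting value (b3 = 0) match the article, but the observed values and hidden-layer outputs are hypothetical, so the result differs from the article's.

```python
# Hypothetical data: observed values and the fixed
# (Top Layer + Bottom Layer) output for each of 3 samples.
observed = [0.0, 1.0, 0.0]
hidden_sum = [-2.0, -1.0, -2.0]

learning_rate = 0.1  # same learning rate as in the article
b3 = 0.0             # same starting value as in the article

for _ in range(1000):
    predicted = [h + b3 for h in hidden_sum]
    slope = -2 * sum(o - p for o, p in zip(observed, predicted))
    step_size = slope * learning_rate
    if abs(step_size) < 0.001:   # step size close to 0: stop
        break
    b3 = b3 - step_size          # move against the slope

print(round(b3, 2))  # 2.0
```

With this made-up data every residual becomes 0 at b3 = 2, and the loop stops once the step size drops below 0.001.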

Final Result
We found the optimal b3 ≈ 2.6, the value at which the slope is close to 0.

Conclusion

We were able to apply the chain rule, gradient descent, and backpropagation to a very small neural network.

In the next article, we will discuss how to calculate the weights and biases in the same neural network.

git-lrc

Any feedback or contributors are welcome! It’s online, source-available, and ready for anyone to use.

Star git-lrc on GitHub
