Gradient descent – why subtract gradient to update m and b
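These are the gradient descent formulas:

$$\frac{\partial}{\partial m} = \frac{2}{n}\sum_{i=1}^{n} -x_i\,(y_i - (mx_i + b)), \qquad \frac{\partial}{\partial b} = \frac{2}{n}\sum_{i=1}^{n} -(y_i - (mx_i + b))$$

My understanding is that they come from first taking the positive gradient, i.e. the partial derivatives of the function $(y - (mx + b))^2$. This leads to

$$\frac{\partial J}{\partial m} = 2x\,(y - (mx + b)), \qquad \frac{\partial J}{\partial b} = 2\,(y - (mx + b)) \times 1$$

Then, to get the descent, we just add negatives to each partial derivative, so we are already descending. But translating gradient descent into code, …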
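For reference, here is a minimal sketch of how the formulas quoted in the question are commonly turned into code. It is not the code the question goes on to discuss. It assumes the cost is the mean squared error with residual y - (m*x + b). The names `mse`, `gradients`, `step`, the learning rate `lr`, and the toy data are all hypothetical. The finite-difference check compares the quoted formula for m against a numerical slope of the cost, which is one way to see whether the formula is the gradient itself or its negative.

```python
def mse(m, b, xs, ys):
    """Mean squared error of the line y = m*x + b on the data."""
    n = len(xs)
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys)) / n


def gradients(m, b, xs, ys):
    """The two formulas from the question, term for term."""
    n = len(xs)
    dm = (2 / n) * sum(-x * (y - (m * x + b)) for x, y in zip(xs, ys))
    db = (2 / n) * sum(-(y - (m * x + b)) for x, y in zip(xs, ys))
    return dm, db


def step(m, b, xs, ys, lr):
    """One update: move m and b against the gradient (hence the subtraction)."""
    dm, db = gradients(m, b, xs, ys)
    return m - lr * dm, b - lr * db


# Hypothetical toy data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# Numerical slope of the cost with respect to m at (m, b) = (0, 0),
# to compare against the first formula.
eps = 1e-6
numeric_dm = (mse(eps, 0.0, xs, ys) - mse(-eps, 0.0, xs, ys)) / (2 * eps)
formula_dm, _ = gradients(0.0, 0.0, xs, ys)

# Run plain gradient descent from (0, 0); lr is an assumed value.
m, b = 0.0, 0.0
for _ in range(5000):
    m, b = step(m, b, xs, ys, lr=0.05)
```

On this toy data, `numeric_dm` and `formula_dm` agree. That suggests the minus signs inside the sums come from the chain rule applied to the squared residual, rather than being an extra negation added to make the formula "descend". Under that reading, the subtraction in `step` is the only place where descent happens.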