Could anyone explain in simple words (and maybe with an example) what the difference between the gradient and the Jacobian is?

The gradient is a vector with the partial derivatives, right?

**Answer**

These are two particular matrix representations of the derivative $Df(x)$ of a differentiable function $f$, used in two cases:

- When $f:\mathbb{R}^n\to\mathbb{R}$, then for $x$ in $\mathbb{R}^n$,
$$\operatorname{grad}_x(f) := \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \dots & \dfrac{\partial f}{\partial x_n} \end{bmatrix}\Bigg|_x$$
is the $1\times n$ matrix of the linear map $Df(x)$ expressed from the canonical basis of $\mathbb{R}^n$ to the canonical basis of $\mathbb{R}$ (namely $(1)$). Because this matrix has only one row, you can think of it as the vector
$$\nabla f(x) := \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n}\right)\Bigg|_x \in \mathbb{R}^n.$$
This vector $\nabla f(x)$ is the unique vector of $\mathbb{R}^n$ such that $Df(x)(y) = \langle \nabla f(x), y \rangle$ for all $y\in\mathbb{R}^n$ (see the Riesz representation theorem), where $\langle\cdot,\cdot\rangle$ is the usual scalar product
$$\langle (x_1,\dots,x_n), (y_1,\dots,y_n)\rangle = x_1 y_1 + \dots + x_n y_n.$$

- When $f:\mathbb{R}^n\to\mathbb{R}^m$, then for $x$ in $\mathbb{R}^n$,
$$\operatorname{Jac}_x(f) = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \dots & \frac{\partial f_1}{\partial x_n} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \dots & \frac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial f_m}{\partial x_1} & \frac{\partial f_m}{\partial x_2} & \dots & \frac{\partial f_m}{\partial x_n} \end{bmatrix}\Bigg|_x$$
is the $m\times n$ matrix of the linear map $Df(x)$ expressed from the canonical basis of $\mathbb{R}^n$ to the canonical basis of $\mathbb{R}^m$.
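These definitions can be checked numerically: approximating each partial derivative with a forward difference column by column recovers the Jacobian matrix (and, when $m=1$, its single row is the gradient). A minimal sketch in Python with NumPy; the helper name `numerical_jacobian` is my own, not from the answer:

```python
import numpy as np

def numerical_jacobian(f, x, h=1e-6):
    """Approximate the m-by-n Jacobian of f at x by forward differences.

    Column j holds (f(x + h*e_j) - f(x)) / h, where e_j is the j-th
    canonical basis vector of R^n.
    """
    x = np.asarray(x, dtype=float)
    fx = np.atleast_1d(np.asarray(f(x), dtype=float))
    m, n = fx.size, x.size
    J = np.empty((m, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (np.atleast_1d(np.asarray(f(x + e), dtype=float)) - fx) / h
    return J
```

For a scalar-valued `f`, `numerical_jacobian` returns a $1\times n$ matrix, matching the point above that the gradient is the one-row Jacobian.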

For example, with $f:\mathbb{R}^2\to\mathbb{R}$ such that $f(x,y) = x^2 + y$ you get $\operatorname{grad}_{(x,y)}(f) = \begin{bmatrix} 2x & 1 \end{bmatrix}$ (or $\nabla f(x,y) = (2x, 1)$), and with $f:\mathbb{R}^2\to\mathbb{R}^2$ such that $f(x,y) = (x^2 + y,\; y^3)$ you get $\operatorname{Jac}_{(x,y)}(f) = \begin{bmatrix} 2x & 1 \\ 0 & 3y^2 \end{bmatrix}$.
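The second worked example can be verified directly: compare the analytic Jacobian $\begin{bmatrix} 2x & 1 \\ 0 & 3y^2 \end{bmatrix}$ against a forward-difference approximation at a sample point. A sketch in Python with NumPy (the test point $(1.5, 2)$ is arbitrary):

```python
import numpy as np

# f(x, y) = (x^2 + y, y^3), the second example above
f = lambda v: np.array([v[0]**2 + v[1], v[1]**3])

# Analytic Jacobian at (x, y): [[2x, 1], [0, 3y^2]]
x, y = 1.5, 2.0
J_exact = np.array([[2 * x, 1.0], [0.0, 3 * y**2]])

# Forward-difference approximation, one column per input variable
h = 1e-6
v = np.array([x, y])
J_num = np.column_stack([(f(v + h * np.eye(2)[j]) - f(v)) / h
                         for j in range(2)])

assert np.allclose(J_num, J_exact, atol=1e-4)
```

The first row of `J_num` is also the gradient of the scalar function $(x,y)\mapsto x^2+y$, which illustrates how the gradient sits inside the Jacobian as a row.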

**Attribution**

*Source: Link, Question Author: Math_reald, Answer Author: Paul Wintz*