Geometrical Meaning of the Gradient

If $f: \R^n \to \R$ is differentiable, then $\nabla f: \R^n \to \R^n$, which is called the gradient of $f$, is defined by

$$
\begin{aligned} \nabla f(x) = \left(\begin{array}{c} \frac{\partial f(x)}{\partial x_1} \\ \vdots \\ \frac{\partial f(x)}{\partial x_n} \end{array}\right) \end{aligned}
$$

1. The gradient points in the direction in which $f$ increases.

By Taylor's theorem, for $x + s$ near a given $x$,

$$
\begin{aligned} f(x + s) \approx f(x) + \nabla f(x)^t s \end{aligned}
$$
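The first-order approximation can be checked numerically. The sketch below is only an illustration: it assumes the sample function $f(x, y) = x^2 + y^2$, and the helper names `linear_approx` and `approximation_error` are hypothetical, chosen for this example.

```python
def f(p):
    """Sample function f(x, y) = x^2 + y^2 (an assumption for this sketch)."""
    x, y = p
    return x * x + y * y


def grad_f(p):
    """Analytic gradient of the sample function: (2x, 2y)."""
    x, y = p
    return (2.0 * x, 2.0 * y)


def linear_approx(p, s):
    """First-order estimate f(p) + grad_f(p)^t s."""
    g = grad_f(p)
    return f(p) + g[0] * s[0] + g[1] * s[1]


def approximation_error(p, s):
    """Absolute gap between f(p + s) and its first-order estimate."""
    moved = (p[0] + s[0], p[1] + s[1])
    return abs(f(moved) - linear_approx(p, s))


p = (1.0, 2.0)
for h in (1e-1, 1e-2, 1e-3):
    s = (h, -h)  # a step whose length shrinks with h
    print(f"h = {h:g}: error = {approximation_error(p, s):.2e}")
```

For this particular $f$, the gap is exactly $\|s\|^2$, so it shrinks quadratically as the step gets shorter, which is why the linear term dominates near $x$.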

To maximize $f$, we want to choose a good step $s$, that is, to move $x$ in a direction in which $f$ increases. To first order, $f(x + s)$ is maximized when $\nabla f(x)^t s$ is maximized. Since $\nabla f(x)^t s$ is the inner product of two vectors,

$$
\begin{aligned} \nabla f(x)^t s = \left\| \nabla f(x) \right\| \left\| s \right\| \cos\theta \end{aligned}
$$

where $\theta$ is the angle between $\nabla f(x)$ and $s$. For a fixed step length $\left\| s \right\|$, the inner product is maximized when $\theta = 0$, that is, when $\nabla f(x)$ and $s$ point in the same direction. Therefore, $x$ should be moved in the direction of $\nabla f(x)$ to locally maximize $f$. For example, consider $f(x) = x^2$ and $f(x, y) = x^2 + y^2$ for $x$, $y \in \mathbb{R}$. Then their gradients are $\nabla f(x) = 2x$ and $\nabla f(x, y) = (2x, 2y)^t$.


Each gradient points in the direction in which the corresponding $f$ increases at the point $x$. Moreover, $-\nabla f(x)$ points in the direction in which $f$ decreases.
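As a rough numerical check of this claim, the following sketch samples unit directions $s$ around a point and picks the one that increases $f(x, y) = x^2 + y^2$ the most. The step size, the number of samples, and the helper name `best_direction` are assumptions made for illustration only.

```python
import math


def f(x, y):
    """Example from the text: f(x, y) = x^2 + y^2."""
    return x * x + y * y


def grad_f(x, y):
    """Its gradient: (2x, 2y)."""
    return (2.0 * x, 2.0 * y)


def best_direction(x, y, step=1e-3, samples=3600):
    """Among uniformly sampled unit vectors s, return the one
    that gives the largest value of f at (x, y) + step * s."""
    best_gain, best_s = -math.inf, None
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        s = (math.cos(angle), math.sin(angle))
        gain = f(x + step * s[0], y + step * s[1]) - f(x, y)
        if gain > best_gain:
            best_gain, best_s = gain, s
    return best_s


x0, y0 = 1.0, 2.0
g = grad_f(x0, y0)
norm = math.hypot(g[0], g[1])
unit_grad = (g[0] / norm, g[1] / norm)
s_best = best_direction(x0, y0)

print("normalized gradient:   ", unit_grad)
print("best sampled direction:", s_best)
```

The two printed vectors should agree up to the angular resolution of the sampling. Replacing the maximum by a minimum in `best_direction` would instead recover a direction close to $-\nabla f(x)$.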

2. The gradient is perpendicular to the tangent plane of an implicit function.

The gradient has a different meaning for explicit and implicit functions.

  • The gradient of an explicit function $y = f(x)$ gives the slope of the tangent at $x$.
  • The gradient of an implicit function $f(x, y) = 0$ is the normal vector of the tangent plane (the tangent line, in two dimensions) at $(x, y)^t$.

For instance, consider $f(x, y) = x^2 - y = 0$. Then its gradient is $\nabla f = (2x, -1)^t$. The total derivative of $f$ is $2x \, dx - dy = 0$, so $\nabla f^t (dx, dy)^t = 0$. Since $(dx, dy)^t$ is tangent to the curve $f(x, y) = 0$, $\nabla f$ is perpendicular to it.
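The same conclusion can be checked with a small sketch. It assumes the parametrization $(t, t^2)$ of the curve $x^2 - y = 0$, whose tangent vector is $(1, 2t)^t$; the function names are hypothetical and chosen for this example.

```python
def grad_f(x, y):
    """Gradient of f(x, y) = x^2 - y, which does not depend on y."""
    return (2.0 * x, -1.0)


def tangent(t):
    """Tangent of the curve (t, t^2), every point of which satisfies f = 0."""
    return (1.0, 2.0 * t)


def dot(a, b):
    """Inner product of two 2-vectors."""
    return a[0] * b[0] + a[1] * b[1]


for t in (-2.0, 0.0, 0.5, 3.0):
    point = (t, t * t)  # a point on the curve x^2 - y = 0
    print(f"t = {t}: gradient . tangent = {dot(grad_f(*point), tangent(t))}")
```

At every sampled point the inner product is $2t - 2t = 0$, matching $\nabla f^t (dx, dy)^t = 0$.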


For another example, consider $f(x, y, z) = x^2 + y^2 - z = 0$. Then its gradient is $\nabla f = (2x, 2y, -1)^t$. The total derivative of $f$ is $2x \, dx + 2y \, dy - dz = 0$, so $\nabla f^t (dx, dy, dz)^t = 0$. Since $(dx, dy, dz)^t$ is tangent to the surface $f(x, y, z) = 0$, $\nabla f$ is perpendicular to it.
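For the surface, a similar sketch can be written with two tangent vectors that span the tangent plane. It assumes the parametrization $(x, y, x^2 + y^2)$ of the surface $x^2 + y^2 - z = 0$, which gives the tangent vectors $(1, 0, 2x)^t$ and $(0, 1, 2y)^t$; the helper names are again hypothetical.

```python
def grad_f(x, y, z):
    """Gradient of f(x, y, z) = x^2 + y^2 - z, which does not depend on z."""
    return (2.0 * x, 2.0 * y, -1.0)


def tangents(x, y):
    """Two tangent vectors of the surface (x, y, x^2 + y^2):
    its partial derivatives with respect to x and to y."""
    return (1.0, 0.0, 2.0 * x), (0.0, 1.0, 2.0 * y)


def dot(a, b):
    """Inner product of two 3-vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


x0, y0 = 1.0, 2.0
g = grad_f(x0, y0, x0 * x0 + y0 * y0)
tx, ty = tangents(x0, y0)

print("gradient . tangent_x =", dot(g, tx))
print("gradient . tangent_y =", dot(g, ty))
print("tangent_x x tangent_y =", cross(tx, ty), "vs gradient", g)
```

Both inner products vanish, and the cross product of the two tangent vectors is $(-2x, -2y, 1)^t = -\nabla f$, so the gradient lies along the normal of the tangent plane.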



[1] Michael T. Heath, Scientific Computing: An Introductory Survey. 2nd Edition, McGraw-Hill Higher Education.
