Derivation of Projection Matrix

Perspective

[Figure: ECToNDC]

The perspective projection matrix transforms points from Eye Coordinates (EC) to Normalized Device Coordinates (NDC). Because perspective projection is not an affine transformation and is non-linear, deriving the matrix takes two steps: move the origin to infinity, then normalize the coordinates. The meaning of the variables is shown in the picture above. Note that $n$ and $f$ are the distances from the camera to the near and far planes.

1. $x_{nd}$ and $y_{nd}$

First, observe the relationship between the $y$-axis and the $z$-axis. If the origin in EC is moved to $(0 \; 0 \; 1 \; 0)^t$, which is a point at infinity, all the projected lines in EC become parallel to $z_*$, as above. Let $(y_*, -f)$ be the intersection between a projected line and $z = -f$ in EC. By similarity,

$$
\frac{y_e}{z_e} = \frac{y_*}{-f} \implies y_* = -f \frac{y_e}{z_e}
$$

Note that $z_e$ is a negative value and $-y'_* \leq y_* \leq y'_*$ because $y_*$ is inside the view volume. Considering that $y'_* = f \tan (\text{fovy} / 2)$ and $-1 \leq y_{nd} \leq 1$,

$$
y_{nd} = \frac{y_*}{y'_*} = \cfrac{\cot \left( \frac{\text{fovy}}{2} \right) y_e}{-z_e}
$$
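The formula for $y_{nd}$ can be checked numerically. The following is a minimal sketch, assuming `fovy` is given in radians; the function name `y_ndc` and the sample values are illustrative only.

```python
import math

def y_ndc(y_e: float, z_e: float, fovy: float) -> float:
    """Evaluate y_nd = cot(fovy / 2) * y_e / (-z_e) for a point with z_e < 0."""
    if z_e >= 0:
        raise ValueError("z_e must be negative (point in front of the camera)")
    return y_e / (math.tan(fovy / 2.0) * -z_e)

# A point on the top edge of the view volume at depth 5 (assumed sample values).
fovy = math.radians(90.0)
top_edge_y = 5.0 * math.tan(fovy / 2.0)
print(y_ndc(top_edge_y, -5.0, fovy))   # expected to be very close to 1
print(y_ndc(0.0, -5.0, fovy))          # a point on the z-axis stays at 0
```

A point on the top edge of the view volume should land on $y_{nd} = 1$ regardless of its depth, which is the normalization the derivation relies on.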

[Figure: ASP]

Similarly, $x_{nd}$ can also be derived. Let $\text{asp}$ be the aspect ratio. Then,

$$
\tan \left( \frac{\text{fovx}}{2} \right) = \cfrac{w}{2f}, \quad \tan \left( \frac{\text{fovy}}{2} \right) = \cfrac{h}{2f} \implies \text{asp} = \cfrac{w}{h} = \cfrac{\tan \left( \frac{\text{fovx}}{2} \right)}{\tan \left( \frac{\text{fovy}}{2} \right)}
$$

Note that $w$ and $h$ are the scaled width and height of the camera. Using these scaled values is fine because they only enter through the ratio $\text{asp}$. Therefore,

$$
x'_* = f \tan \left( \frac{\text{fovx}}{2} \right) = f \, \text{asp} \tan \left( \frac{\text{fovy}}{2} \right) \implies x_{nd} = \frac{x_*}{x'_*} = \cfrac{\cot \left( \frac{\text{fovy}}{2} \right) x_e}{-\text{asp} \; z_e}
$$
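Both normalized coordinates can be evaluated together. This is a small sketch under the assumptions that `fovy` is in radians and `asp` is width divided by height; `xy_ndc` is a hypothetical helper name.

```python
import math

def xy_ndc(x_e, y_e, z_e, fovy, asp):
    """Return (x_nd, y_nd) for an eye-space point with z_e < 0."""
    if z_e >= 0:
        raise ValueError("z_e must be negative (point in front of the camera)")
    cot = 1.0 / math.tan(fovy / 2.0)
    x_nd = cot * x_e / (-asp * z_e)
    y_nd = cot * y_e / (-z_e)
    return x_nd, y_nd

# Assumed sample camera: 60 degree vertical field of view, 16:9 aspect ratio.
fovy = math.radians(60.0)
asp = 16.0 / 9.0
depth = 4.0
# Top-right corner of the view volume cross-section at this depth.
y_max = depth * math.tan(fovy / 2.0)
x_max = asp * y_max
print(xy_ndc(x_max, y_max, -depth, fovy, asp))  # both values should be close to 1
```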

2. $z_{nd}$

Looking at the previous results, $x_{nd}$ and $y_{nd}$ have $z_e$ in their denominators, which is the source of the non-linearity. So, the derivation of $z_{nd}$ is not as simple. From a different viewpoint, however, this non-linearity in Euclidean space is linear in projective space. That is, $z_{nd}$ can be written in the following form.

$$
z_{nd} = \alpha + \frac{\beta}{z_e}
$$

Now, considering that $[-n, -f]$ in EC is mapped to $[-1, 1]$ in NDC, so that the direction of the $z$-axis is reversed,

$$
1 = \alpha + \frac{\beta}{-f}, \quad -1 = \alpha + \frac{\beta}{-n} \implies \alpha = \frac{f + n}{f - n}, \quad \beta = \frac{2nf}{f - n}
$$

Therefore,

$$
z_{nd} = \frac{f + n}{f - n} + \frac{2nf}{(f - n)z_e} = \frac{-\frac{f + n}{f - n} z_e - \frac{2nf}{f - n}}{-z_e}
$$
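The depth mapping can be verified at the two boundary planes. The sketch below assumes $0 < n < f$; the name `z_ndc` and the sample plane distances are illustrative.

```python
def z_ndc(z_e: float, n: float, f: float) -> float:
    """Evaluate z_nd = alpha + beta / z_e with the alpha and beta derived above."""
    alpha = (f + n) / (f - n)
    beta = 2.0 * n * f / (f - n)
    return alpha + beta / z_e

n, f = 0.1, 100.0  # assumed near and far distances
print(z_ndc(-n, n, f))              # near plane, expected close to -1
print(z_ndc(-f, n, f))              # far plane, expected close to 1
print(z_ndc(-(n + f) / 2.0, n, f))  # halfway in eye space is not 0 in NDC
```

The last value illustrates the non-linearity: the midpoint between the planes in EC lands much closer to $1$ than to $0$, because $z_{nd}$ depends on $1 / z_e$.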

3. Projection Matrix

Observing that $x_{nd}$, $y_{nd}$, and $z_{nd}$ have $-z_e$ in their denominators, the projection matrix $M$ can be derived as follows.

$$
M \left(\begin{array}{c} x_e \\ y_e \\ z_e \\ 1 \end{array}\right) \equiv \left(\begin{array}{cccc} \frac{\cot \left( \frac{\text{fovy}}{2} \right)}{\text{asp}} & 0 & 0 & 0 \\ 0 & \cot \left( \frac{\text{fovy}}{2} \right) & 0 & 0 \\ 0 & 0 & -\frac{f+n}{f-n} & -\frac{2nf}{f-n} \\ 0 & 0 & -1 & 0 \end{array}\right) \left(\begin{array}{c} x_e \\ y_e \\ z_e \\ 1 \end{array}\right)
$$
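As a rough illustration, the matrix can be built and applied with a homogeneous divide. This is a sketch in plain Python with nested lists, assuming `fovy` in radians; `perspective` and `project` are hypothetical helper names, not functions from any particular library.

```python
import math

def perspective(fovy, asp, n, f):
    """Build the 4x4 perspective projection matrix M derived above (row-major)."""
    c = 1.0 / math.tan(fovy / 2.0)
    return [
        [c / asp, 0.0, 0.0, 0.0],
        [0.0, c, 0.0, 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2.0 * n * f / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def project(m, p_e):
    """Multiply (x_e, y_e, z_e, 1) by m, then divide by the fourth component."""
    v = (p_e[0], p_e[1], p_e[2], 1.0)
    clip = [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]
    w = clip[3]  # equals -z_e, the shared denominator
    return tuple(c / w for c in clip[:3])

# Assumed sample camera parameters.
m = perspective(math.radians(90.0), 1.0, 1.0, 10.0)
print(project(m, (0.0, 0.0, -1.0)))    # center of the near plane
print(project(m, (0.0, 0.0, -10.0)))   # center of the far plane
```

The fourth row $(0 \; 0 \; -1 \; 0)$ copies $-z_e$ into the last homogeneous component, so the division by `w` reproduces the shared denominator of the three formulas.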

Frustum

The perspective projection produces a special frustum whose near and far planes are symmetric. That is, the line $L$ through the origin of EC and the centers of these planes is exactly the $z$-axis of EC. This frustum can be generalized so that $L$ is not necessarily the $z$-axis of EC. Note that $L$ still passes through the origin of EC, and the near and far planes remain perpendicular to the $z$-axis of EC. Such a possibly sheared frustum can be mapped into NDC in the same way the perspective projection matrix was derived. The main difference is that a shearing matrix must first be found that brings $L$ onto the $z$-axis of EC. A shear parallel to the $xy$-plane, which leaves $z$ unchanged, has the form

$$
\left(\begin{array}{cccc} 1 & 0 & p & 0 \\ 0 & 1 & q & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right) \left(\begin{array}{c} x \\ y \\ z \\ 1 \end{array}\right) = \left(\begin{array}{c} x + pz \\ y + qz \\ z \\ 1 \end{array}\right)
$$

The frustum matrix is calculated from the top $t$, bottom $b$, left $l$, and right $r$ values on the near plane. Here $t$ is the maximum $y_e$ on the near plane, $b$ is the minimum $y_e$, $l$ is the minimum $x_e$, and $r$ is the maximum $x_e$. Since the center of the near plane should lie on the $z$-axis of EC after shearing,

$$
\left(\begin{array}{cccc} 1 & 0 & p & 0 \\ 0 & 1 & q & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right) \left(\begin{array}{c} \frac{r + l}{2} \\ \frac{t + b}{2} \\ -n \\ 1 \end{array}\right) = \left(\begin{array}{c} \frac{r + l}{2} - pn \\ \frac{t + b}{2} - qn \\ -n \\ 1 \end{array}\right) = \left(\begin{array}{c} 0 \\ 0 \\ -n \\ 1 \end{array}\right)
$$

So, $p = (r + l) / (2n)$ and $q = (t + b) / (2n)$. Therefore, the frustum matrix is

$$
M \left(\begin{array}{cccc} 1 & 0 & \frac{r + l}{2n} & 0 \\ 0 & 1 & \frac{t + b}{2n} & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right)
$$
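The shear coefficients can be checked by applying the shear to the center of the near plane. The sketch below uses hypothetical helper names and an assumed off-center near-plane rectangle.

```python
def shear_coefficients(l, r, b, t, n):
    """Return (p, q) such that the near-plane center is moved onto the z-axis."""
    return (r + l) / (2.0 * n), (t + b) / (2.0 * n)

def shear(p, q, point):
    """Apply the shear (x, y, z) -> (x + p z, y + q z, z)."""
    x, y, z = point
    return (x + p * z, y + q * z, z)

# Assumed asymmetric near-plane extents.
l, r, b, t, n = -1.0, 3.0, -2.0, 1.0, 2.0
p, q = shear_coefficients(l, r, b, t, n)
center = ((r + l) / 2.0, (t + b) / 2.0, -n)
print(shear(p, q, center))  # x and y should become 0 while z stays at -n
```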

where $M$ is the perspective projection derived above. Multiplying the two matrices gives

$$
\begin{aligned} &\left(\begin{array}{cccc} \frac{\cot \left( \frac{\text{fovy}}{2} \right)}{\text{asp}} & 0 & 0 & 0 \\ 0 & \cot \left( \frac{\text{fovy}}{2} \right) & 0 & 0 \\ 0 & 0 & -\frac{f+n}{f-n} & -\frac{2nf}{f-n} \\ 0 & 0 & -1 & 0 \end{array}\right) \left(\begin{array}{cccc} 1 & 0 & \frac{r + l}{2n} & 0 \\ 0 & 1 & \frac{t + b}{2n} & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right) \\\\ = &\left(\begin{array}{cccc} \frac{\cot \left( \frac{\text{fovy}}{2} \right)}{\text{asp}} & 0 & \frac{\cot \left( \frac{\text{fovy}}{2} \right) (r + l)}{\text{asp} \; 2n} & 0 \\ 0 & \cot \left( \frac{\text{fovy}}{2} \right) & \frac{\cot \left( \frac{\text{fovy}}{2} \right) (t + b)}{2n} & 0 \\ 0 & 0 & -\frac{f+n}{f-n} & -\frac{2nf}{f-n} \\ 0 & 0 & -1 & 0 \end{array}\right) \end{aligned}
$$

In this case, $\text{asp}$ is $(r - l) / (t - b)$ because shearing does not change the aspect ratio. Also, $\tan (\text{fovy} / 2)$ is $(t - b) / (2n)$ because $t - b$, the height of the near plane, is not changed by shearing either. Accordingly,

$$
\cot \left( \frac{\text{fovy}}{2} \right) = \frac{2n}{t - b}, \qquad \frac{\cot \left( \frac{\text{fovy}}{2} \right)}{\text{asp}} = \frac{2n}{t - b} \frac{t - b}{r - l} = \frac{2n}{r - l}
$$
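These two identities can be confirmed numerically by recovering $\text{fovy}$ and $\text{asp}$ from the near-plane extents. This is a small sketch; `fovy_and_aspect` is a hypothetical helper and the extents are assumed sample values.

```python
import math

def fovy_and_aspect(l, r, b, t, n):
    """Recover (fovy in radians, asp) from the near-plane rectangle and distance n."""
    asp = (r - l) / (t - b)
    fovy = 2.0 * math.atan((t - b) / (2.0 * n))
    return fovy, asp

l, r, b, t, n = -1.0, 3.0, -0.5, 1.5, 2.0
fovy, asp = fovy_and_aspect(l, r, b, t, n)
cot = 1.0 / math.tan(fovy / 2.0)
print(cot, 2.0 * n / (t - b))        # the two values should agree
print(cot / asp, 2.0 * n / (r - l))  # and so should these
```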

This result simplifies the frustum matrix as follows.

$$
\left(\begin{array}{cccc} \frac{2n}{r - l} & 0 & \frac{r + l}{r - l} & 0 \\ 0 & \frac{2n}{t - b} & \frac{t + b}{t - b} & 0 \\ 0 & 0 & -\frac{f+n}{f-n} & -\frac{2nf}{f-n} \\ 0 & 0 & -1 & 0 \end{array}\right)
$$
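One way to sanity-check the final matrix is to project the corners of the frustum and confirm that they land on the corners of the NDC cube. The following sketch uses plain Python lists; `frustum` and `to_ndc` are hypothetical names, and the extents are assumed sample values.

```python
def frustum(l, r, b, t, n, f):
    """Build the simplified 4x4 frustum matrix (row-major)."""
    return [
        [2.0 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2.0 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2.0 * n * f / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def to_ndc(m, point):
    """Multiply (x, y, z, 1) by m and divide by the fourth component."""
    v = (point[0], point[1], point[2], 1.0)
    clip = [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]
    return tuple(c / clip[3] for c in clip[:3])

# Assumed asymmetric frustum.
l, r, b, t, n, f = -1.0, 3.0, -2.0, 1.0, 2.0, 20.0
m = frustum(l, r, b, t, n, f)
print(to_ndc(m, (r, t, -n)))                  # near top-right, expected near (1, 1, -1)
print(to_ndc(m, (l, b, -n)))                  # near bottom-left, expected near (-1, -1, -1)
print(to_ndc(m, (r * f / n, t * f / n, -f)))  # far top-right, expected near (1, 1, 1)
```

The far-plane corner is obtained by scaling the near-plane corner by $f / n$, since every edge of the frustum passes through the origin of EC.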

Reference

[1] 3D Graphics Programming Using OpenGL: Introduction


© 2024. All rights reserved.