# Derivation of Projection Matrix

## Perspective

The perspective projection matrix transforms points from Eye Coordinates (EC) to Normalized Device Coordinates (NDC). Because perspective projection is not an affine transformation and is non-linear in Euclidean space, the matrix is derived in two steps: move the origin to infinity, then normalize the coordinates. The meaning of the variables is shown in the picture above. Note that $n$ and $f$ are the distances to the near and far planes of the camera.

### 1. $x_{nd}$ and $y_{nd}$

First, observe the relationship between the $y$-axis and the $z$-axis. If the origin in EC is moved to $(0 \; 0 \; 1 \; 0)^t$, which is the point at infinity, all the projected lines in EC become parallel to $z_*$ as above. Let $(y_*, -f)$ be the intersection between a projected line and the plane $z = -f$ in EC. By similar triangles, $\begin{aligned} \frac{y_e}{z_e} = \frac{y_*}{-f} \implies y_* = -f \frac{y_e}{z_e} \end{aligned}$

Note that $z_e$ is negative and $-y'_* \leq y_* \leq y'_*$ because $y_*$ is inside the view volume. Considering that $y'_* = f \tan (\text{fovy} / 2)$ and $-1 \leq y_{nd} \leq 1$, $\begin{aligned} y_{nd} = \frac{y_*}{y'_*} = \cfrac{-f \frac{y_e}{z_e}}{f \tan \left( \frac{\text{fovy}}{2} \right)} = \cfrac{\cot \left( \frac{\text{fovy}}{2} \right) y_e}{-z_e} \end{aligned}$

Similarly, $x_{nd}$ can be derived in the same way. Let $\text{asp}$ be the aspect ratio. Then, $\begin{aligned} \tan \left( \frac{\text{fovx}}{2} \right) = \cfrac{w}{2f}, \quad \tan \left( \frac{\text{fovy}}{2} \right) = \cfrac{h}{2f} \implies \text{asp} = \cfrac{w}{h} = \cfrac{\tan \left( \frac{\text{fovx}}{2} \right)}{\tan \left( \frac{\text{fovy}}{2} \right)} \end{aligned}$

Note that $w$ and $h$ are the *scaled* width and height of the camera. Using these values is fine because they only enter through the ratio $\text{asp}$. Therefore, $\begin{aligned} x'_* = f \tan \left( \frac{\text{fovx}}{2} \right) = f \text{asp} \tan \left( \frac{\text{fovy}}{2} \right) \implies x_{nd} = \frac{x_*}{x'_*} = \cfrac{\cot \left( \frac{\text{fovy}}{2} \right) x_e}{-\text{asp} \; z_e} \end{aligned}$
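The two formulas for $x_{nd}$ and $y_{nd}$ can be checked numerically. The following is a minimal sketch, assuming `fovy` is given in radians and $z_e < 0$; the function name `ndc_xy` and the sample values are illustrative and not part of the derivation.

```python
import math

def ndc_xy(x_e, y_e, z_e, fovy, asp):
    """Map an eye-space point to (x_nd, y_nd) using the formulas above.

    Assumes fovy is in radians and z_e is negative (in front of the camera).
    """
    cot = 1.0 / math.tan(fovy / 2.0)
    x_nd = cot * x_e / (-asp * z_e)
    y_nd = cot * y_e / (-z_e)
    return x_nd, y_nd

# Hypothetical sample values: a point on the top-right edge of the view
# volume at depth z_e = -5 should land at the NDC corner.
fovy, asp, z_e = math.radians(60.0), 16.0 / 9.0, -5.0
y_top = -z_e * math.tan(fovy / 2.0)   # largest visible y_e at this depth
x_right = asp * y_top                 # largest visible x_e at this depth
print(ndc_xy(x_right, y_top, z_e, fovy, asp))  # approximately (1.0, 1.0)
```

Points on the boundary of the view volume map to $\pm 1$ regardless of depth, which is exactly what the division by $-z_e$ achieves.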

### 2. $z_{nd}$

Looking at the previous results, $x_{nd}$ and $y_{nd}$ have $z_e$ in their denominators, which makes the mapping non-linear. So, the derivation of $z_{nd}$ is not that simple. **Changing the viewpoint, this non-linearity in Euclidean space can be viewed as linearity in projective space.** That is, since $z_{nd}$ must share the same denominator and depends only on $z_e$, it can be written in the following form. $\begin{aligned} z_{nd} = \alpha + \frac{\beta}{z_e} \end{aligned}$

Now, considering that $[-n, -f]$ in EC is mapped to $[-1, 1]$ in NDC and *the direction of the $z$-axis is reversed*, $\begin{aligned} 1 = \alpha + \frac{\beta}{-f}, \quad -1 = \alpha + \frac{\beta}{-n} \implies \alpha = \frac{f + n}{f - n}, \quad \beta = \frac{2nf}{f - n} \end{aligned}$

Therefore, $\begin{aligned} z_{nd} = \frac{f + n}{f - n} + \frac{2nf}{(f - n)z_e} = \frac{-\frac{f + n}{f - n} z_e - \frac{2nf}{f - n}}{-z_e} \end{aligned}$
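As a quick sanity check of $\alpha$ and $\beta$, the sketch below evaluates $z_{nd}$ at the two planes and at one point in between. The function names and the values of `n` and `f` are assumptions chosen for illustration.

```python
def depth_coefficients(n, f):
    """Return (alpha, beta) such that z_nd = alpha + beta / z_e."""
    alpha = (f + n) / (f - n)
    beta = 2.0 * n * f / (f - n)
    return alpha, beta

def ndc_z(z_e, n, f):
    """Map eye-space depth z_e (negative) to NDC depth."""
    alpha, beta = depth_coefficients(n, f)
    return alpha + beta / z_e

# Hypothetical near and far distances.
n, f = 0.1, 100.0
print(ndc_z(-n, n, f))    # close to -1 (near plane)
print(ndc_z(-f, n, f))    # close to  1 (far plane)
print(ndc_z(-1.0, n, f))  # roughly 0.80, although z_e = -1 is very close to the camera
```

The last value illustrates the non-linearity: with these sample planes, most of the NDC depth range is spent on the region near the camera.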

### 3. Projection Matrix

Observing that $x_{nd}$, $y_{nd}$, and $z_{nd}$ all have $-z_e$ in their denominators, the projection matrix $M$ can be written so that its last row produces the homogeneous coordinate $-z_e$; dividing the first three components by it gives the NDC values. $\begin{aligned} M \left(\begin{array}{c} x_e \\ y_e \\ z_e \\ 1 \end{array}\right) \equiv \left(\begin{array}{cccc} \frac{\cot \left( \frac{\text{fovy}}{2} \right)}{\text{asp}} & 0 & 0 & 0 \\ 0 & \cot \left( \frac{\text{fovy}}{2} \right) & 0 & 0 \\ 0 & 0 & -\frac{f+n}{f-n} & -\frac{2nf}{f-n} \\ 0 & 0 & -1 & 0 \end{array}\right) \left(\begin{array}{c} x_e \\ y_e \\ z_e \\ 1 \end{array}\right) \end{aligned}$
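The matrix can be exercised with a small script. This is only a sketch, assuming row-major nested lists, column vectors, and `fovy` in radians; `perspective` and `project` are illustrative names rather than a library API.

```python
import math

def perspective(fovy, asp, n, f):
    """Build the 4x4 perspective matrix M derived above (row-major lists)."""
    c = 1.0 / math.tan(fovy / 2.0)
    return [
        [c / asp, 0.0, 0.0, 0.0],
        [0.0, c, 0.0, 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2.0 * n * f / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def project(m, p_e):
    """Multiply M by a homogeneous point, then divide by w = -z_e."""
    x, y, z, w = (sum(m[i][j] * p_e[j] for j in range(4)) for i in range(4))
    return (x / w, y / w, z / w)

# Hypothetical camera parameters.
fovy, asp, n, f = math.radians(90.0), 2.0, 1.0, 10.0
m = perspective(fovy, asp, n, f)

# The top-right corner of the far plane should map close to (1, 1, 1).
y_far = f * math.tan(fovy / 2.0)
print(project(m, [asp * y_far, y_far, -f, 1.0]))

# The center of the near plane should map close to (0, 0, -1).
print(project(m, [0.0, 0.0, -n, 1.0]))
```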

## Frustum

The perspective projection produces a special frustum whose near and far planes are symmetric. That is, the line $L$ through the origin of EC and the centers of these planes is exactly the z-axis of EC. This frustum can be generalized so that $L$ is not necessarily the z-axis of EC. Note that $L$ still passes through the origin of EC, and the near and far planes remain perpendicular to the z-axis of EC. This possibly *sheared* frustum can also be mapped into NDC in the same way the perspective projection matrix was derived. The main difference is that a shearing matrix must first be found that brings $L$ onto the z-axis of EC. A matrix that shears parallel to the $xy$-plane, which leaves $z$ unchanged, looks like $\begin{aligned} \left(\begin{array}{cccc} 1 & 0 & p & 0 \\ 0 & 1 & q & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right) \left(\begin{array}{c} x \\ y \\ z \\ 1 \end{array}\right) = \left(\begin{array}{c} x + pz \\ y + qz \\ z \\ 1 \end{array}\right) \end{aligned}$

The frustum matrix is calculated from the top $t$, bottom $b$, left $l$, and right $r$ values on the *near* plane. Here $t$ is the maximum $y_e$ on the near plane, $b$ is the minimum $y_e$, $l$ is the minimum $x_e$, and $r$ is the maximum $x_e$. Since the center of the near plane should lie on the z-axis of EC after shearing, $\begin{aligned} \left(\begin{array}{cccc} 1 & 0 & p & 0 \\ 0 & 1 & q & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right) \left(\begin{array}{c} \frac{r + l}{2} \\ \frac{t + b}{2} \\ -n \\ 1 \end{array}\right) = \left(\begin{array}{c} \frac{r + l}{2} - pn \\ \frac{t + b}{2} - qn \\ -n \\ 1 \end{array}\right) = \left(\begin{array}{c} 0 \\ 0 \\ -n \\ 1 \end{array}\right) \end{aligned}$

So, $p = (r + l) / (2n)$ and $q = (t + b) / (2n)$. Therefore, the frustum matrix is $\begin{aligned} M \left(\begin{array}{cccc} 1 & 0 & \frac{r + l}{2n} & 0 \\ 0 & 1 & \frac{t + b}{2n} & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right) \end{aligned}$

where $M$ is the perspective projection matrix derived above. Expanding the product gives $\begin{aligned} &\left(\begin{array}{cccc} \frac{\cot \left( \frac{\text{fovy}}{2} \right)}{\text{asp}} & 0 & 0 & 0 \\ 0 & \cot \left( \frac{\text{fovy}}{2} \right) & 0 & 0 \\ 0 & 0 & -\frac{f+n}{f-n} & -\frac{2nf}{f-n} \\ 0 & 0 & -1 & 0 \end{array}\right) \left(\begin{array}{cccc} 1 & 0 & \frac{r + l}{2n} & 0 \\ 0 & 1 & \frac{t + b}{2n} & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right) \\\\ = &\left(\begin{array}{cccc} \frac{\cot \left( \frac{\text{fovy}}{2} \right)}{\text{asp}} & 0 & \frac{\cot \left( \frac{\text{fovy}}{2} \right) (r + l)}{2n \; \text{asp}} & 0 \\ 0 & \cot \left( \frac{\text{fovy}}{2} \right) & \frac{\cot \left( \frac{\text{fovy}}{2} \right) (t + b)}{2n} & 0 \\ 0 & 0 & -\frac{f+n}{f-n} & -\frac{2nf}{f-n} \\ 0 & 0 & -1 & 0 \end{array}\right) \end{aligned}$
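The shear step can be verified on its own. The sketch below, with hypothetical near-plane bounds chosen so that the arithmetic is exact in floating point, applies the shear with $p = (r + l) / (2n)$ and $q = (t + b) / (2n)$ to the center of the near plane; `shear_xy` and `apply` are illustrative helper names.

```python
def shear_xy(p, q):
    """Shear parallel to the xy-plane: x += p*z, y += q*z, z unchanged."""
    return [
        [1.0, 0.0, p, 0.0],
        [0.0, 1.0, q, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(m, v):
    """Multiply a 4x4 row-major matrix by a homogeneous column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

# Hypothetical asymmetric near-plane bounds and near distance.
l, r, b, t, n = -1.0, 3.0, -0.5, 1.5, 2.0
s = shear_xy((r + l) / (2.0 * n), (t + b) / (2.0 * n))

# Center of the near plane before shearing.
center = [(r + l) / 2.0, (t + b) / 2.0, -n, 1.0]
print(apply(s, center))  # [0.0, 0.0, -2.0, 1.0]
```

After the shear, the center sits on the z-axis at depth $-n$, so the symmetric perspective matrix $M$ can be applied to the result.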

In this case, $\text{asp}$ is $(r - l) / (t - b)$ because shearing does not change the aspect ratio. Also, $\tan (\text{fovy} / 2)$ is $(t - b) / (2n)$ because $t - b$, the height of the near plane, is the same before and after shearing. Accordingly, $\begin{aligned} \cot \left( \frac{\text{fovy}}{2} \right) = \frac{2n}{t - b}, \qquad \frac{\cot \left( \frac{\text{fovy}}{2} \right)}{\text{asp}} = \frac{2n}{t - b} \frac{t - b}{r - l} = \frac{2n}{r - l} \end{aligned}$

Substituting these results simplifies the frustum matrix as follows. $\begin{aligned} \left(\begin{array}{cccc} \frac{2n}{r - l} & 0 & \frac{r + l}{r - l} & 0 \\ 0 & \frac{2n}{t - b} & \frac{t + b}{t - b} & 0 \\ 0 & 0 & -\frac{f+n}{f-n} & -\frac{2nf}{f-n} \\ 0 & 0 & -1 & 0 \end{array}\right) \end{aligned}$
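To see the final matrix in action, the following sketch builds it from $l$, $r$, $b$, $t$, $n$, $f$ and projects two corners of an asymmetric near plane. The helper names and the numeric bounds are assumptions for illustration only.

```python
def frustum(l, r, b, t, n, f):
    """Build the simplified frustum matrix above (row-major lists)."""
    return [
        [2.0 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2.0 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2.0 * n * f / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def project(m, p_e):
    """Multiply by a homogeneous point, then divide by w = -z_e."""
    x, y, z, w = (sum(m[i][j] * p_e[j] for j in range(4)) for i in range(4))
    return (x / w, y / w, z / w)

# Hypothetical asymmetric bounds on the near plane.
l, r, b, t, n, f = -1.0, 3.0, -0.5, 1.5, 2.0, 10.0
m = frustum(l, r, b, t, n, f)

print(project(m, [l, b, -n, 1.0]))  # near bottom-left, close to (-1, -1, -1)
print(project(m, [r, t, -n, 1.0]))  # near top-right, close to (1, 1, -1)

# The matching far-plane corner is the near corner scaled by f / n.
k = f / n
print(project(m, [k * r, k * t, -f, 1.0]))  # close to (1, 1, 1)
```

With $l = -r$ and $b = -t$ the third-column entries $(r + l) / (r - l)$ and $(t + b) / (t - b)$ vanish, and the matrix reduces to the symmetric perspective case.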