← Home

A Very Short Story on Coordinate Systems

October 21, 2024 · 茨月

A short story about left-handed/right-handed coordinate systems... and more.

Come, let us go down and confuse their language so they will not understand each other - Genesis 11:7

The Cause

This semester, I am a TA for Introduction to Computer Graphics. Today, I was working on the code framework for assignment 3, "Rasterizing Your First Triangle". For some reasons 1, we do not use any graphics API but instead implement software rasterization ourselves.

The process of implementing software rasterization includes Model, View, and Projection transformations, along with their corresponding matrices. In the slides, we provided the orthographic projection transformation matrix, which includes a translate and a scale:

$$ M_{\text{ortho}} = \begin{pmatrix} \frac{2}{r-l} & 0 & 0 & 0 \\ 0 & \frac{2}{t-b} & 0 & 0 \\ 0 & 0 & \frac{2}{n-f} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & -\frac{r+l}{2} \\ 0 & 1 & 0 & -\frac{t+b}{2} \\ 0 & 0 & 1 & -\frac{n+f}{2} \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \frac{2}{r-l} & 0 & 0 & -\frac{r+l}{r-l} \\ 0 & \frac{2}{t-b} & 0 & -\frac{t+b}{t-b} \\ 0 & 0 & \frac{2}{n-f} & -\frac{n+f}{n-f} \\ 0 & 0 & 0 & 1 \end{pmatrix} $$

where \(n\) and \(f\) are the distances to the near and far planes, and \(l\), \(r\), \(b\), and \(t\) represent the left, right, bottom, and top dimensions of the cuboid. It is important to note that, because the common camera model faces the \(-Z\) direction, \(n > f\) and both are negative values—this fact does not align with our intuition that "the smaller the depth, the closer it is," leading to a series of complications.

We use a right-handed coordinate system, shown below:

Right Hand Coordinate System

The Problem

Prof mentioned in the Z-Buffer section that "for simplicity, we assume that \(z\) values are all positive, and the smaller the value, the closer it is to the camera." In OpenGL, DirectX, and Vulkan APIs, the depth values between \([0,1]\) also indicate that the smaller the value, the closer it is to the camera—so the question arises:

I and another TA—who is also my good senior—debated for thirty minutes, discussing whether in our framework \(z\) being smaller means closer or \(z\) being larger means closer, and how to transform \(z\) to be all positive. My view is that \(z\) being larger means closer to the camera is correct because I obtained the correct triangle occlusion results by performing Z-Buffering this way; while Zheng's view is that \(z\) being smaller means closer to the camera is correct because all major Graphics APIs are like this, and I might have made a mistake somewhere.

The Investigation

First, we studied the OpenGL coordinate system—being the most familiar and closest to us, OpenGL has always used a right-handed system, consistent with our framework... but is that really the case?

The Answer

The `Right-handed system` card in this section of LearnOpenGL mentions that OpenGL uses a right-handed system in world coordinates, but uses a left-handed system in NDC space.

Secondly, we referred to Real Time Rendering 4th Edition and found that the Projection Matrix defined there is slightly different from ours...

{{ collapsible( summary="What's the difference?" content="$$ M_{\text{ortho}} = \begin{pmatrix} \frac{2}{r-l} & 0 & 0 & -\frac{r+l}{r-l} \ 0 & \frac{2}{t-b} & 0 & -\frac{t+b}{t-b} \ 0 & 0 & \frac{2}{f-n} & -\frac{f+n}{f-n} \ 0 & 0 & 0 & 1 \end{pmatrix} $$ where the far and near planes on the (z) component are inverted, equivalent to a mirroring about the (xOy) plane!

Finally, after sketching once more, we reached the following conclusions:

The Solution

To ensure consistency with the lecture and avoid creating additional trouble for the students, we use

float final_z = 1.f - interpolated_z;

To transform the \(z\) values to the range \([0,2]\) and ensure that nearer values are smaller and farther values are larger.

The Experience

The god of rendering punished the humans building the graphical Tower of Babel by making them choose different coordinate spaces in different APIs, leaving everyone frustrated and apprehensive when dealing with others' code.

Even highly experienced graphics veterans find coordinate systems tricky, let alone students. Wish me luck with the section Q&A.

Some interesting sidenotes

Although not directly related to the content and discussion of this article, reverse Z is an interesting example of using the properties of IEEE 754 to optimize algorithms:

https://tomhultonharrop.com/mathematics/graphics/2023/08/06/reverse-z.html

1

Specifically, it is about creating a unified and distraction-free programming framework that allows students to start implementing algorithms with minimal effort.

JS
Arrow Up