### Sorting isometric sprites on the GPU

I've been reworking much of the MUTA client for the first two months of 2019. One of the reworks has been to rendering, and amongst other changes to it, rendering now uses hardware sorting for sprites to draw 2D tiles from back to front.

Anyone who's put a little thought into isometric 2D games knows the draw order of tiles and objects is a little bit tricky. I love the isometric look and have worked on a bunch of isometric 2D game prototypes, but in all of them before this, I've done sprite sorting on the CPU. Not so this time.

Starting out, as any modern developer would, I searched the internet for a formula to correctly calculate each tile's depth based on its position in the visible part of the map. Surprisingly, I didn't find one. Guess isometric 2D games aren't so hot anymore. Anyway, that's kind of the reason I'm making this post.

Here's a descriptive picture of the hypothetical area we are rendering.

I first attempted to calculate the depth based on the tile's distance from the camera.

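    /* area_width is assumed to be defined elsewhere. */
    int area_height = 2;

    for (int i = 0; i < area_width; ++i)
        for (int j = 0; j < area_width; ++j)
            for (int k = 0; k < area_height; ++k) {
                /* Normalize the tile's summed coordinates into [0, 1). */
                float z = (float)(i + j + k) / (float)(area_width + area_width + area_height);
            }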

This approach works fine if all objects are of the same height, that is, the
height of a single tile. But we run into problems when that isn't the
case.

Above, the character is being clipped by tiles that are higher than him on the
map when the tiles should actually be rendering behind him.
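
To make the clipping concrete, here is a minimal sketch of the distance-based formula as a standalone function. The name `depth_by_distance` and the 4-wide, 4-high area in the note below are made up for illustration, and the sketch assumes that a larger `z` means a sprite is drawn later, and that a tall sprite takes the depth of the tile it stands on.

```c
/* Hypothetical helper: the distance-based depth from the loop above.
 * Assumes a larger z means the sprite is drawn later (in front). */
float depth_by_distance(int i, int j, int k, int area_width, int area_height)
{
    return (float)(i + j + k) /
           (float)(area_width + area_width + area_height);
}
```

With those numbers, a character standing at (2, 2, 0) gets 4/12, while a tile at (2, 1, 2), one row further back but two levels up, gets 5/12. The tile sorts in front of the character even though its pillar is further from the camera, so if the character's sprite is tall enough to overlap it on screen, the character gets clipped.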

A different formula does the job better.

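    /* area_width is assumed to be defined elsewhere. */
    int area_height = 2;

    for (int i = 0; i < area_width; ++i)
        for (int j = 0; j < area_width; ++j)
            for (int k = 0; k < area_height; ++k) {
                /* Upper bound of the depth range for the pillar at (i, j). */
                float row_max = (float)(i + j + 2) / (float)(area_width + area_width);
                /* Depth of the pillar's bottom tile. */
                float row_base = (float)(i + j) / (float)(area_width + area_width);
                float row_diff = row_max - row_base;
                /* Step from the base toward the maximum as k increases. */
                float z = row_base + row_diff * ((float)k / (float)area_height);
            }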

And the result looks as follows.

The idea is that since the z value must be between 0 and 1, we need a maximum value. The render order we need is that the closer to the camera a "pillar" of tiles is (a pillar being a stack of area_height tiles at an x-y coordinate), the later it gets rendered. However, we also need depth values for the individual tiles that make up a pillar, so that the higher a tile is, the later it gets rendered. So we find the maximum depth value for the pillar and gradually increase each tile's depth value, going from bottom to top, toward that maximum.
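
The same computation can be wrapped in a function, which makes it easy to check a few values by hand. This is only a sketch: `pillar_tile_depth` is a name made up for illustration, and the body is the loop from earlier with `row_diff` left as written (it works out to `1 / area_width`).

```c
/* Hypothetical helper wrapping the pillar-based formula from this post. */
float pillar_tile_depth(int i, int j, int k, int area_width, int area_height)
{
    /* Depth of the pillar's bottom tile, from its diagonal position. */
    float row_base = (float)(i + j) / (float)(area_width + area_width);
    /* Upper bound of the pillar's depth range. */
    float row_max = (float)(i + j + 2) / (float)(area_width + area_width);
    float row_diff = row_max - row_base;
    /* Step from the base toward the maximum as k increases. */
    return row_base + row_diff * ((float)k / (float)area_height);
}
```

For example, with `area_width = 4` and `area_height = 2`, the pillar at (1, 1) gets 0.25 for its bottom tile and 0.375 for its top tile, and the last pillar at (3, 3) tops out at 0.875, so every value stays below 1.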

Anyway, thought I'd throw that out there.