Monday, December 7, 2015

The future of GPU texture compression

Google engineers were the first to realize the value of crunch (original site here), my advanced lossy texture compression library and command line toolset for DXTc textures that was recently integrated into Unity 5.x. Here's Brandon Jones at Google describing how crunch can be used in WebGL apps in his article "Saving Bandwidth and Memory with WebGL and Crunch", from the book "HTML5 Game Development Insights".

While I was writing crunch I was only thinking "this is something useful for console and PC game engines". I had no idea it could be useful for web or WebGL apps. Thinking back, I should have sat down and asked myself "what other software technology, or what other parts of the stack, will need to deal with compressed GPU textures?" (One lesson learned: Learn JavaScript!)

Anyhow, crunch is an example of the new class of compression solutions opened up by collapsing parts of the traditional game data processing tool chain into single, unified solutions.

So let's go forward and break down some artificial barriers and combine knowledge across a few different problem domains:

- Game development art asset authoring methods
- Game engine build pipelines/data preprocessing
- GPU texture compression
- Data compression
- Rate distortion optimization

The following examples are for DXTc, but they apply to other formats like PVRTC/ETC/etc. too. (Of course, many companies have different takes on the pipelines described here. These are just vanilla, general examples. Id Software's megatexturing and Allegorithmic's tech use very different approaches.)

The old way

The previous way of creating DXTc GPU textures was (this example is the Halo Wars model):

1. Artists save texture or image as an uncompressed file (like .TGA or .PNG) from Photoshop etc.

2. We call a command line tool which grinds away to compress the image to the highest quality achievable at a fixed bitrate (the lossy GPU texture compression step). Alternately, we can use the GPU to accelerate the process to near-real time. (Both solve the same problem.)

This is basically a fixed rate lossy texture compressor with a highly constrained standardized output format compatible with GPU texture hardware.

Now we have a .DDS file, stored somewhere in the game's repo.

3. To ship a build, the game's asset bundle or archive system losslessly packs the texture, using LZ4, LZO, Deflate, LZX, LZMA, etc. - this data gets shipped to end users
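To make the "fixed rate" part of step 2 concrete, here is a minimal Python sketch of the size arithmetic. The per-block byte counts come from the DXTc format definitions (8 bytes per 4x4 block for DXT1, 16 for DXT3/DXT5); the function names are made up for illustration and are not from crunch or any other tool.

```python
# Bytes per 4x4 block, fixed by the DXTc format definitions.
BYTES_PER_BLOCK = {"DXT1": 8, "DXT3": 16, "DXT5": 16}

def dxt_level_size(width, height, fmt):
    """Size in bytes of one mip level; dimensions round up to whole 4x4 blocks."""
    blocks_x = max(1, (width + 3) // 4)
    blocks_y = max(1, (height + 3) // 4)
    return blocks_x * blocks_y * BYTES_PER_BLOCK[fmt]

def dxt_mip_chain_size(width, height, fmt):
    """Total size in bytes of a full mip chain, down to and including 1x1."""
    total = 0
    while True:
        total += dxt_level_size(width, height, fmt)
        if width == 1 and height == 1:
            break
        width, height = max(1, width // 2), max(1, height // 2)
    return total

if __name__ == "__main__":
    # The result depends only on dimensions and format, never on image content.
    print(dxt_level_size(1024, 1024, "DXT1"))
    print(dxt_level_size(1024, 1024, "DXT5"))
```

Because the .DDS size is the same for a flat gray image and a noisy photo, the only place image content affects the shipped size in this pipeline is the lossless pack in step 3.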

The Current New Way

The "current" new way is a little less complex (at the high level) because we delete the lossless compression step in stage 3. Step 2 now borrows a "new" concept from the video compression world, Rate Distortion Optimization (RDO), and applies it to GPU texture compression:

1. The artist selects a JPEG-style quality level and saves the texture or image as an uncompressed file (like .TGA or .PNG) from Photoshop etc.

2. We call a command line tool called "crunch" that combines lossy clusterized DXTc compression with VQ, followed by a custom optimizing lossless backend coder. Now we have a .CRN file at some quality level, stored somewhere in the game's repo.

3. To ship a build, the game's asset bundle or archive system stores the .CRN file uncompressed (because it was already compressed in step 2) - this data gets shipped to end users

The most advanced game engines, such as Unity and some other AAA in-house game engines, do it more or less this way now.
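For readers who haven't met rate distortion optimization before, the usual textbook framing is to pick, among candidate encodings, the one that minimizes a cost J = D + lambda * R, where D is distortion, R is the size in bytes (or bits), and lambda sets how much quality you are willing to trade for size. The Python sketch below shows only that selection rule. The candidate labels and numbers are invented for illustration, and this is not a description of how crunch makes its decisions internally.

```python
def rdo_pick(candidates, lam):
    """Return the candidate minimizing J = D + lam * R.

    candidates: iterable of (label, distortion, rate_bytes) tuples.
    lam: Lagrange multiplier; 0 means "ignore size", larger means "favor small files".
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])

# Hypothetical encodings of one texture: (label, mean squared error, compressed bytes).
candidates = [
    ("high quality", 1.0, 400_000),
    ("medium quality", 4.0, 200_000),
    ("low quality", 12.0, 120_000),
]

if __name__ == "__main__":
    for lam in (0.0, 5e-5, 1e-3):
        label, distortion, rate = rdo_pick(candidates, lam)
        print(f"lambda={lam}: {label} (D={distortion}, R={rate})")
```

A JPEG-style quality slider can be thought of as a friendlier front end for lambda: the artist picks a point on the quality/size curve, and the encoder does the search.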

The Other New Way (that nobody knows about)

1. The artist hits a "Save" button, and a preview window pops up. The artist can tune various compression options in real-time to find the best balance between lossy compression artifacts and file size. (Why not? It's their art. This is also the standard web way, but with JPEG.) The "OK" button saves a .CRN and a .PNG file simultaneously.

2. To ship a build, the game's asset bundle or archive system stores the .CRN file uncompressed (because it's already been compressed) - this data gets shipped to end users

But step #1 seems impossible, right? crunch's compression engine is notoriously slow, even on 20-core Xeon machines. Most teams build .CRN data in the cloud using hundreds to thousands of machines. I jobified the hell out of crunch's compressor, but it's still very slow.

Internally, the crunch library has a whole "secret" set of methods and classes that enable this way forward. (Interested? Start looking in the repo in this file here.)

The Demo of real-time crunch compression

Here's a Windows demo showing crunch-like compression done in real-time. It's approximately 50-100x faster than the command line tool's compression speed. (I still have the source to this demo somewhere. Let me know if you would like it released.)

This demo utilizes the internal classes in crnlib to do all the heavy lifting. All the real code is already public. These classes don't output a .CRN file, though. They just output plain .DDS files, which are then assumed to be losslessly compressed later. But there's no reason why a fast and simple (non-optimizing) .CRN backend couldn't be tacked on. The core concepts are all the same.

One of the key techniques used to speed up the compression process in the QDXT code demonstrated in this demo is jobified Tree Structured VQ (TSVQ), described here.
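As a rough illustration of the general TSVQ idea (not the crnlib/QDXT implementation, whose details are in the linked description), the Python sketch below builds a binary tree by repeatedly splitting a set of vectors in two with a few 2-means iterations, then quantizes a vector by walking down the tree. All function names, the dict-based node layout, and the sample points are assumptions made for this example.

```python
def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(vectors)
    dims = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / n for i in range(dims))

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split_two(vectors, iters=8):
    """Partition vectors into two clusters using a few 2-means (Lloyd) iterations."""
    c0 = vectors[0]
    c1 = max(vectors, key=lambda v: dist2(v, c0))  # seed with the farthest point
    left, right = vectors, []
    for _ in range(iters):
        left = [v for v in vectors if dist2(v, c0) <= dist2(v, c1)]
        right = [v for v in vectors if dist2(v, c0) > dist2(v, c1)]
        if not left or not right:
            break  # degenerate split (e.g. all vectors identical)
        c0, c1 = mean(left), mean(right)
    return left, right

def build_tsvq(vectors, depth):
    """Build a binary tree of centroids; each level halves the remaining work."""
    node = {"centroid": mean(vectors), "children": None}
    if depth > 0 and len(vectors) > 1:
        left, right = split_two(vectors)
        if left and right:
            # The two subtrees share no data, so they could be built independently.
            node["children"] = (build_tsvq(left, depth - 1),
                                build_tsvq(right, depth - 1))
    return node

def quantize(node, v):
    """Walk down the tree, taking the nearer child each time, and return the leaf centroid."""
    while node["children"]:
        a, b = node["children"]
        node = a if dist2(v, a["centroid"]) <= dist2(v, b["centroid"]) else b
    return node["centroid"]

def count_leaves(node):
    """Number of codebook entries (leaf centroids) in the tree."""
    if not node["children"]:
        return 1
    return sum(count_leaves(child) for child in node["children"])

if __name__ == "__main__":
    points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
    tree = build_tsvq(points, depth=1)
    print(count_leaves(tree), quantize(tree, (9, 9)))
```

The appeal for speed is twofold: a lookup costs two distance computations per tree level instead of one per codebook entry, and, in this sketch at least, the subtrees are independent units of work, which is the kind of structure that lends itself to being split into parallel jobs.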

GPU texture compression tools: What we really want

The engineers working on GPU texture compression don't always have a full mental model of how texture assets are actually utilized by game makers. Their codecs are typically optimized for either highest possible quality (without taking eons to compress), or they optimize for fastest compression time with minimal to no quality loss (relative to offline compression). These tools ignore the key distribution problems that their customers face completely, and they don't allow artists to control the tradeoff between quality vs. filesize like 25 year old standard formats such as JPEG do.

Good examples of this class of tools:

Intel: Fast ISPC Texture Compressor

NVidia: GPU Accelerated Texture Compression


These are awesome, high quality GPU texture compression tools/libs, with lots of customers. Unfortunately they solve the wrong problem.

What we really want are libraries and tools that give us additional options that help solve the distribution problem, like rate distortion optimization. (As an extra bonus, we want new GPU texture formats compatible with specialized techniques like VQ/clusterization/etc. But now I'm probably asking for too much.)

The GPU vendors are the best ones to bridge the artificial divides described earlier. This is some very specialized technology, and the GPU format engineers just need to learn more about compression, machine learning, entropy coding, etc. Make sure, when you are designing a new GPU texture format, that you release something like crunch for that format, or it'll be a 2nd class format to your customers.

Now, the distant future

Won Chun (then at Google, now at Rad) came up with a great idea a few years back. What the web and game engine worlds could really use is a "Universal Crunch" format. A GPU agnostic "download anywhere" format, that can be quickly transcoded into any other major format, like DXTc, or PVRTC, or ASTC, etc. Such a texture codec would be quite an undertaking, but I've been thinking about it for years and I think it's possible. Some quality tradeoffs would have to be made, of course, but if you like working on GPU texture compression problems, or want to commercialize something in this space, perhaps go in this direction.


  1. Pretty sure I can see some of the polygons on the Batmobile -- fairly shoddily rendered :P

  2. wouldn't transcoding between dxt/pvrtc/astc cause more artifacts

    also how hard is getting crunch to support astc?
    also isn't astc clearly better than the other two?

  3. :)