Thursday, August 17, 2017

Why crunch likes uncompressed texture data

We've recently gotten some interest in creating an RDO compressor specifically for already compressed textures, which is why I'm writing this.

crunch works best with (and is designed for) uncompressed RGBA texture data. You can feed crunch already compressed data (by compressing to DXT, unpacking the blocks, and throwing the unpacked pixels into the compressor), but it won't perform as well. Why, you ask?

crunch uses top-down clusterization on the block endpoints. It tries to create groups of blocks that share similar endpoints. Once it finds a group of blocks that seem similar enough, it then uses its DXT endpoint optimizers on these block clusters to create a near-optimal set of endpoints for that cluster. These clusters can be very big, which is why crunch/Basis can't use off-the-shelf DXT/ETC compressors, which assume 4x4 blocks.
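
To make the idea of top-down clusterization concrete, here is a rough Python sketch. It is not crunch's actual code: the function names, the 6-component endpoint vectors (low RGB followed by high RGB), and the "split the worst cluster at its mean along the widest axis" rule are all assumptions chosen to keep the example short.

```python
def mean(vecs):
    # Component-wise mean of a list of equal-length vectors.
    n = len(vecs)
    return tuple(sum(v[i] for v in vecs) / n for i in range(len(vecs[0])))

def sse(vecs):
    # Total squared distance of every vector from the cluster mean.
    m = mean(vecs)
    return sum(sum((x - c) ** 2 for x, c in zip(v, m)) for v in vecs)

def split(vecs):
    # Split at the mean along the axis with the largest spread (an assumed rule).
    m = mean(vecs)
    axis = max(range(len(m)), key=lambda i: sum((v[i] - m[i]) ** 2 for v in vecs))
    left = [v for v in vecs if v[axis] <= m[axis]]
    right = [v for v in vecs if v[axis] > m[axis]]
    return left, right

def top_down_clusters(vecs, max_clusters):
    # Start with one cluster holding everything, then keep splitting
    # whichever cluster currently has the highest error.
    clusters = [list(vecs)]
    while len(clusters) < max_clusters:
        worst = max(clusters, key=sse)
        if sse(worst) == 0:
            break  # every remaining cluster is already uniform
        clusters.remove(worst)
        clusters.extend(split(worst))
    return clusters

# Hypothetical per-block endpoint vectors: (lowR, lowG, lowB, highR, highG, highB).
endpoints = [
    (0, 0, 0, 8, 8, 8),
    (1, 0, 0, 8, 8, 9),
    (200, 200, 200, 255, 255, 255),
    (201, 200, 200, 255, 255, 250),
]
clusters = top_down_clusters(endpoints, 2)
```

Each resulting cluster would then get a single optimized endpoint pair, which is the step where a normal per-block encoder no longer applies.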

DXT/ETC are lossy formats, so there is no single "correct" encoding for each input (ignoring trivial inputs like solid-color blocks). There are many possible valid encodings that will look very similar. Because of this, creating a good DXT/ETC block encoder that also performs fast is harder than it looks, and adding additional constraints or requirements on top of this (such as rate distortion optimization on both the endpoints and the selectors) just adds to the fun.

Anyhow, imagine the data has already been compressed, and the encoder creates a cluster containing just a single block. Because the data has already been compressed, the encoder now has the job of determining exactly which endpoints were used originally to pack that block. crunch tries to do this for DXT1 blocks, but it doesn't always succeed. There are many DXT compressors out there, each using different algorithms. (crunch could be modified to also accept the precompressed DXT data itself, which would allow it to shortcut this problem.)

What if the original compressor decided to use fewer than 4 colors spaced along the colorspace line? Also, the exact method used to interpolate the endpoint colors is only loosely defined for DXT1. It's a totally solvable problem, but it's not something I had the time to work on while writing crunch.
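
For reference, here is a minimal Python sketch of how a DXT1 block's palette is derived from its two 565 endpoints. The function names are made up for this post. The 565-to-888 expansion by bit replication and the floor-division rounding are assumptions: they are one common choice, and real decoders are allowed to differ slightly, which is exactly the ambiguity described above.

```python
def unpack_565(c):
    # Expand a packed 16-bit 565 color to 8 bits per channel by bit replication
    # (an assumed convention; other decoders may scale differently).
    r = (c >> 11) & 31
    g = (c >> 5) & 63
    b = c & 31
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))

def dxt1_palette(c0, c1):
    # Returns the 4 palette entries of a DXT1 block, indexed by selector value.
    e0, e1 = unpack_565(c0), unpack_565(c1)
    if c0 > c1:
        # 4-color mode: two interpolated colors at 1/3 and 2/3 along the line.
        p2 = tuple((2 * a + b) // 3 for a, b in zip(e0, e1))
        p3 = tuple((a + 2 * b) // 3 for a, b in zip(e0, e1))
        return [e0, e1, p2, p3]
    # 3-color mode: one midpoint color, and selector 3 means transparent black.
    p2 = tuple((a + b) // 2 for a, b in zip(e0, e1))
    return [e0, e1, p2, None]  # None stands in for the transparent entry

four_color = dxt1_palette(0xFFFF, 0x0000)
three_color = dxt1_palette(0x0000, 0xFFFF)
```

An encoder trying to recover the original endpoints from unpacked pixels has to guess which mode was used, which rounding the original compressor assumed, and whether every palette entry was actually referenced by a selector.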

Things get worse if the endpoint clusterization step assigns 2+ blocks with different endpoints to the same cluster. The compressor now has to find a single set of endpoints to represent both blocks. Because the input pixels have already been compressed, we're now forcing the input pixels to lie along a quantized colorspace line (using 555/565 endpoints!) two times in a row. Quality takes a nosedive.
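
A toy Python example shows the effect. Everything here is a simplification made for illustration: a single color channel instead of RGB, endpoints restricted to multiples of 8 as a stand-in for 5-bit quantization, a brute-force endpoint search, and made-up function names.

```python
def palette(lo, hi):
    # 4 levels between two endpoints (single channel, floor rounding assumed).
    return [lo, hi, (2 * lo + hi) // 3, (lo + 2 * hi) // 3]

def reencode_error(pixels, lo, hi):
    # Squared error after snapping each pixel to its nearest palette level.
    pal = palette(lo, hi)
    total = 0
    for x in pixels:
        nearest = min(pal, key=lambda p: abs(p - x))
        total += (nearest - x) ** 2
    return total

def best_shared_endpoints(pixels, step=8):
    # Brute-force the best endpoint pair on a coarse grid (toy search only).
    best = None
    for lo in range(0, 256, step):
        for hi in range(lo, 256, step):
            err = reencode_error(pixels, lo, hi)
            if best is None or err < best[0]:
                best = (err, lo, hi)
    return best

# Two "already compressed" blocks: their pixels sit exactly on their own palettes.
block_a = palette(16, 120)
block_b = palette(40, 200)

err_a = reencode_error(block_a, 16, 120)   # own endpoints: lossless
err_b = reencode_error(block_b, 40, 200)   # own endpoints: lossless
shared_err, shared_lo, shared_hi = best_shared_endpoints(block_a + block_b)
print(err_a, err_b, shared_err, (shared_lo, shared_hi))
```

Each block re-encodes with zero error using its own endpoints, but the two blocks together contain eight distinct values while a shared palette has only four levels, so the shared encoding must add error on top of the loss from the first compression pass.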

Basis improves this situation, although I still favor working with uncompressed texture data because that's what the majority of our customers work with.

Another option is to use bottom-up clusterization (which crunch doesn't use). You first compress the input data to DXT/ETC/etc., then merge similar blocks together so they share the same endpoints and/or selectors. This approach seems to be a natural fit for already compressed data. Quantizing just the selector data is the easiest thing to do first.
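
As a rough illustration of selector-only merging, here is a small greedy sketch in Python. It is not code from crunch or Basis. The function names, the distance metric, and the threshold are assumptions, and the selectors are assumed to have been remapped so that values 0..3 run in order along the colorspace line (raw DXT1 selector indices are not stored in that order).

```python
def selector_distance(a, b):
    # Sum of squared differences between two selector vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def merge_selectors(blocks, max_dist):
    # Greedy bottom-up merge: each block reuses the closest existing codebook
    # entry if it is within max_dist, otherwise it becomes a new entry.
    codebook = []
    assignment = []
    for sel in blocks:
        best = None
        for i, rep in enumerate(codebook):
            d = selector_distance(sel, rep)
            if d <= max_dist and (best is None or d < best[0]):
                best = (d, i)
        if best is None:
            codebook.append(tuple(sel))
            assignment.append(len(codebook) - 1)
        else:
            assignment.append(best[1])
    return codebook, assignment

# Three hypothetical blocks, 16 selectors each.
a = (0,) * 8 + (3,) * 8
b = (0,) * 7 + (1,) + (3,) * 8   # differs from a by one step in one pixel
c = (3,) * 16
codebook, assignment = merge_selectors([a, b, c], max_dist=1)
```

Here blocks a and b end up sharing one selector pattern while c keeps its own. A real encoder would weigh each merge by the pixel error it introduces rather than by a fixed selector-space threshold.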