Wednesday, February 17, 2021

Average rate-distortion curves for bc7enc_rdo

bc7enc_rdo is now a library that's used by the command line tool, which is now far simpler. This makes it trivial to call the encoder repeatedly to generate large .CSV files.

If you can only choose one set of settings for bc7enc_rdo, choose "-zn -U -u6". (I've set the default BC7 encoding level to 6, not sure that's checked in yet.) I'll be making bc7e.ispc the new default on my next checkin - it's clearly better.

All other settings were the tool's defaults (linear metrics, window size=128 bytes).

My intuition was to limit the BC7 modes, bias the modes/weights/p-bits/etc. That works and is super fast to encode (if you can't afford any RDO post-processing at all), but the end result is lower quality across most of the usable range of bitrates. Just use bc7e.ispc.

128 vs. 1024 window size:

One block window size, one match per block:

Saturday, February 13, 2021

First RDO LDR ASTC 6x6 encodings

This is 6x6 block size, using the ERT in bc7enc_rdo:

Left=Non-RDO, 37.3 dB, 2.933 bits/texel (Deflate) 
Right=RDO lambda=.5, 36.557 dB, 2.399 bpt

Using more aggressive ERT settings, but the same lambda: 

First-ever RDO ASTC encodings

Here are my first-ever RDO LDR ASTC 4x4 encodings. Perhaps they are the first ever for the ASTC texture format: 

5.951 bits/texel, 45.1 dB, 7.5773 PSNR/bpt 

4.286 bpt, 38.9 dB, 9.0752 PSNR/bpt

Biased difference:

I used astcenc to generate a .astc file, loaded it into memory, then used the code in ert.cpp/.h with a custom callback that decodes ASTC blocks. All the magic is in the ERT. Here's a match injection histogram - this works: 1477,466,284,382,265,398,199,109,110,87,82,105,193,3843

Another encode at lambda=.5:

These RDO ASTC encodes do not have any ultra-smooth block handling, because it's just something I put together in 15 minutes. If you look at the planet, you can see artifacts that are worse than they should be.

Next are larger blocks.

Friday, February 12, 2021

bc7e.ispc integrated into bc7enc_rdo

bc7e.ispc is a very powerful/fast 8-mode encoder. It supports the entire BC7 format, unlike bc7enc's default encoder. It's 2-3x faster than ispc_texcomp at the same average quality. Now that it's been combined with bc7enc_rdo, you can optionally do RDO BC7 encoding with this encoder, assuming you can tolerate the slower encode times.

This is now checked into the bc7enc_rdo repo.

Command: bc7enc xmen_1024.png -u6 -U -z1.0 -zc4096 -zm

4.53 bits/texel (Deflate), 37.696 dB PSNR

BC7 mode histogram:
0: 3753
1: 15475
2: 1029
3: 6803
4: 985
5: 2173
6: 35318
7: 0
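
For anyone who wants to reproduce a histogram like this from raw BC7 data, here is a minimal Python sketch. It relies on one property of the BC7 format: the mode is stored as a unary prefix in the low bits of each block's first byte, so the mode number is the index of the lowest set bit. The function names are my own for illustration and are not part of bc7enc_rdo.

```python
from collections import Counter

def bc7_block_mode(block: bytes) -> int:
    """Return the BC7 mode (0-7) of one 16-byte block, or -1 if reserved."""
    first = block[0]
    if first == 0:
        return -1  # no mode bit set in the first byte: reserved/invalid block
    # The mode is the index of the lowest set bit of the first byte.
    return (first & -first).bit_length() - 1

def bc7_mode_histogram(data: bytes) -> Counter:
    """Count modes over a buffer of tightly packed 16-byte BC7 blocks."""
    if len(data) % 16 != 0:
        raise ValueError("BC7 data must be a multiple of 16 bytes")
    return Counter(bc7_block_mode(data[i:i + 16])
                   for i in range(0, len(data), 16))
```

As a sanity check, the counts above sum to 65536, which is the number of 4x4 blocks in a 1024x1024 texture (assuming that is the size of xmen_1024.png).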

Updated bc7enc_rdo with improved smooth block handling

The command line tool now detects extremely smooth blocks and encodes them with a significantly higher MSE scale factor. It computes a per-block mask image, filters it, then supplies an array of per-block MSE scale factors to the ERT. -zu disables this. 

The end result is far less noticeable artifacts in regions containing very smooth blocks (think gradients). This does hurt rate-distortion performance.
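
The mask/filter/scale pipeline described above can be sketched roughly as follows. This is not the tool's actual code: the smoothness test (per-block standard deviation), the 3x3 box filter, and the `smooth_std`/`smooth_scale` values are all assumptions chosen for illustration, and the input is a plain 2D list of grayscale values.

```python
from statistics import pstdev

def block_scale_factors(img, block=4, smooth_std=1.0, smooth_scale=20.0):
    """img: 2D list of grayscale values; dimensions must be multiples of `block`.
    Returns a 2D list holding one MSE scale factor per block."""
    bh, bw = len(img) // block, len(img[0]) // block

    # Step 1: per-block mask, 1.0 where the block is extremely smooth.
    # (Hypothetical test: population std dev below smooth_std.)
    mask = [[0.0] * bw for _ in range(bh)]
    for by in range(bh):
        for bx in range(bw):
            texels = [img[by * block + y][bx * block + x]
                      for y in range(block) for x in range(block)]
            mask[by][bx] = 1.0 if pstdev(texels) < smooth_std else 0.0

    # Step 2: filter the mask (3x3 box average, clipped at the image edges)
    # and map it to a scale factor between 1.0 and smooth_scale.
    factors = [[1.0] * bw for _ in range(bh)]
    for by in range(bh):
        for bx in range(bw):
            nbrs = [mask[y][x]
                    for y in range(max(0, by - 1), min(bh, by + 2))
                    for x in range(max(0, bx - 1), min(bw, bx + 2))]
            factors[by][bx] = 1.0 + (smooth_scale - 1.0) * sum(nbrs) / len(nbrs)
    return factors
```

Filtering the mask means blocks next to a smooth region also get a raised scale factor, so the protection fades out gradually instead of stopping at a hard block edge.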

(The second image was resampled to 1/4th res for blogger.)

Thursday, February 11, 2021

Dirac video codec authors on Rate-Distortion Optimization

"This description makes RDO sound like a science: in fact it isn't and the reader will be pleased to learn that there is plenty of scope for engineering ad-hoc-ery of all kinds. This is because there are some practical problems in applying the procedure:"

"Perceptual fudge factors are therefore necessary in RDO in all types of coders." "There may be no common measure of distortion. For example: quantising a high-frequency subband is less visually objectionable than quantising a low-frequency subband, in general. So there is no direct comparison with the significance of the distortion produced in one subband with that produced in another. This can be overcome by perceptual weighting.."

In other words: yeah, it's a bit of a hack.

This is what I've found with RDO texture encoding. If you use the commonly cited formula (J = D + lambda*R, minimizing J) with stock MSE as D, it's totally unusable: you'll get horrible distortion on flat/smooth blocks, which is like 80% of the blocks on some images/textures. Stock MSE doesn't work. You need something else.

Adding a linear scale to MSE kinda works (that's what bc7enc_rdo does) but you need ridiculous scales for some textures, which ruins R-D performance.
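
To make the problem concrete, here is a small Python sketch of the Lagrangian decision with an optional linear MSE scale. The two candidates and their MSE/bit numbers are made up for illustration (they are not measurements), and the function names are mine, not bc7enc_rdo's.

```python
def rd_cost(mse, bits, lam, mse_scale=1.0):
    """Lagrangian cost J = (mse_scale * D) + lambda * R."""
    return mse_scale * mse + lam * bits

def pick_candidate(candidates, lam, mse_scale=1.0):
    """candidates: list of (name, mse, bits). Returns the name with the lowest J."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam, mse_scale))[0]

# Hypothetical candidates for one smooth block:
# an exact encoding, and a cheaper one that reuses earlier bytes.
smooth_block = [("exact", 0.0, 128), ("reuse", 4.0, 40)]

stock_choice = pick_candidate(smooth_block, lam=0.5)                  # J: 64 vs 24
scaled_choice = pick_candidate(smooth_block, lam=0.5, mse_scale=20.0)  # J: 64 vs 100
```

With stock MSE the cheaper candidate wins even though a small error on a flat gradient can be clearly visible; with a scale of 20 the exact encoding wins instead. The catch is the one described above: a scale large enough to protect the worst textures also gives up bitrate savings everywhere it's applied.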

So if you take two RDO texture encoders, benchmark them, and look at just their PSNRs, you are possibly fooling yourself and others. The encoder with higher PSNR (and better R-D performance) may visually look worse than the other. It's part art, not all science.

With bc7enc_rdo, I wanted to open source *something* usable for most textures with out of the box settings, even though I knew that its smooth block handling needed work. Textures with skies like kodim03 are challenging to compress without manually increasing the smooth block factor. kodim23 is less challenging because its background has some noise.

Releasing something open source with decent performance that works OK on most textures is more important than perfection. 

Wednesday, February 10, 2021

Weighted/biased BC7 encoding for reduced output data entropy (with no slowdowns)

Previous BC7 encoders optimize for maximum quality and entirely ignore (more like externalize) the compressibility of the encoded data they output. Their encoded output usually looks like incompressible noise to LZ coders. 

It's easy to modify existing encoders to favor specific BC7 modes, p-bits, or partition patterns. You can also set some modes to always use specific p-bits, or disable the index flag/component rotation features, and/or quantize mode 6's endpoints more coarsely during encoding. 

These changes result in less entropy in the output data, which indirectly increases LZ matches and boosts the effectiveness of entropy coding. More details here. You can't expect much from this method (I've seen 5-10% reductions in compressed output using Deflate), but it's basically "free", meaning it doesn't slow down encoding at all. It may even speed it up. 
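
The "bits/texel (Deflate)" figures in these posts are the Deflate-compressed size of the encoded texture divided by its texel count. A minimal Python sketch of that measurement, assuming zlib's raw Deflate at level 9 as a stand-in (the tool's own Deflate implementation and settings may differ, so exact numbers won't match):

```python
import zlib

def deflate_bits_per_texel(data: bytes, width: int, height: int,
                           level: int = 9) -> float:
    """Compress `data` with raw Deflate and return compressed bits per texel."""
    # wbits=-15 selects a raw Deflate stream with no zlib header or checksum.
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    compressed = comp.compress(data) + comp.flush()
    return len(compressed) * 8.0 / (width * height)
```

Uncompressed BC7 is 16 bytes per 4x4 block, i.e. 8 bits/texel, so anything below 8 is savings from the LZ/entropy coding stage.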

Quick test using the bc7enc_rdo tool:

Mode 1+6: 45.295 dB, 7.41 bits/texel (Deflate), .109 secs

Command: "bc7enc kodim23.png"

BC7 mode histogram:
1: 8736
6: 15840

Mode 1+6 reduced entropy mode: 43.479 RGB PSNR, 6.77 bits/texel (Deflate), .107 secs

Command: "bc7enc kodim23.png -e"

BC7 mode histogram:
1: 1970
6: 22606

Difference image (biased by 128,128,128) and grayscale histogram:
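
For reference, a biased difference image like the ones mentioned in these posts can be computed per component as clamp(a - b + 128, 0, 255), so identical texels come out mid-gray. A minimal Python sketch, assuming both images are flat lists of (r, g, b) tuples (the function name is mine):

```python
def biased_difference(a, b, bias=128):
    """a, b: equal-length sequences of (r, g, b) tuples with 0-255 components.
    Returns per-texel clamp(a - b + bias, 0, 255)."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of texels")
    return [tuple(max(0, min(255, ca - cb + bias)) for ca, cb in zip(pa, pb))
            for pa, pb in zip(a, b)]
```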