Sunday, November 28, 2021

Lena is Retired

As an open source author, I will not assist with or spend time implementing support for any new image/video/GPU texture file format that has not been fuzzed, or that uses the "test" image "lena" (or "lenna") for development, testing, statistical analysis, or optimization purposes. All test images must be legal, with clear copyright attribution. This is one of my freedoms as an open source author.

The model herself has requested the public to lose (i.e. delete, remove, or stop using) the image:

Technically, this archaic image is also useless for testing purposes. There is no one image, 5 images, or even 100 images that are useful for testing new image/texture/video codecs. We use thousands of textures and images for testing and optimization purposes. There is no longer any need to focus intensely on a single image during research or while building new software. Unlike the 70's/80's, we all now have easy access to millions of far better images available on the internet, many of them acquired using modern digital photography equipment. On a modern machine using OpenCL we can compress a directory of over six thousand .PNG images/textures in a little over 6 minutes.

Charles Poynton, PhD (HDTV/UHDTV) posted this publicly on Twitter after I announced this stance. My announcement came after a reader posted that this nearly 50-year-old drum scan of a 70's halftoned porn mag picture was being used in a project's "test corpus":

And this was posted on Twitter in public by Chris Green ("Half-Life 2" rendering lead):

PS - It's indicative of how warped, certifiable, or completely out of touch many men working in software and the video game industry are that I have received threats (including death threats) over this stance.

Wednesday, February 17, 2021

Average rate-distortion curves for bc7enc_rdo

bc7enc_rdo is now a library that's utilized by the command line tool, which is far simpler now. This makes it trivial to call multiple times to generate large .CSV files.
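As a rough sketch of what "average rate-distortion curves" can mean once you have those .CSV files: group the rows by lambda and average the bitrate and PSNR across all images. The column names (`image`, `lambda`, `bpt`, `psnr`) and the plain arithmetic mean are assumptions for illustration; the tool's actual CSV layout isn't described here.

```python
import csv
import io
from collections import defaultdict

def average_rd_curve(csv_text):
    """Average bits/texel and PSNR per lambda across all images.

    Assumed (hypothetical) CSV columns: image, lambda, bpt, psnr.
    Returns a list of (lambda, mean_bpt, mean_psnr), sorted by lambda.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # lambda -> [bpt sum, psnr sum, count]
    for row in csv.DictReader(io.StringIO(csv_text)):
        acc = sums[float(row["lambda"])]
        acc[0] += float(row["bpt"])
        acc[1] += float(row["psnr"])
        acc[2] += 1
    return [(lam, b / n, p / n) for lam, (b, p, n) in sorted(sums.items())]

# Made-up numbers, only to show the shape of the data.
sample = """image,lambda,bpt,psnr
a.png,0.5,4.0,40.0
b.png,0.5,5.0,42.0
a.png,1.0,3.0,38.0
b.png,1.0,4.0,39.0
"""
curve = average_rd_curve(sample)
```

Each tuple in `curve` is one point on the averaged curve: plot mean bits/texel on X and mean PSNR on Y.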

If you can only choose one set of settings for bc7enc_rdo, choose "-zn -U -u6". (I've set the default BC7 encoding level to 6, not sure that's checked in yet.) I'll be making bc7e.ispc the new default on my next checkin - it's clearly better.

All other settings were the tool's defaults (linear metrics, window size=128 bytes).

My intuition was to limit the BC7 modes, bias the modes/weights/p-bits/etc. That works and is super fast to encode (if you can't afford any RDO post-processing at all), but the end result is lower quality across most of the usable range of bitrates. Just use bc7e.ispc.

128 vs. 1024 window size:

One block window size, one match per block:

Saturday, February 13, 2021

First RDO LDR ASTC 6x6 encodings

This is 6x6 block size, using the ERT in bc7enc_rdo:

Left=Non-RDO, 37.3 dB, 2.933 bits/texel (Deflate) 
Right=RDO lambda=.5, 36.557 dB, 2.399 bpt

Using more aggressive ERT settings, but the same lambda: 

First-ever RDO ASTC encodings

Here are my first-ever RDO LDR ASTC 4x4 encodings. Perhaps they are the first ever for the ASTC texture format: 

5.951 bits/texel, 45.1 dB, 7.5773 PSNR/bpt

4.286 bpt, 38.9 dB, 9.0752 PSNR/bpt

Biased difference:

I used astcenc to generate a .astc file, loaded it into memory, then used the code in ert.cpp/.h with a custom callback that decodes ASTC blocks. All the magic is in the ERT. Here's a match injection histogram - this works: 1477,466,284,382,265,398,199,109,110,87,82,105,193,3843

Another encode at lambda=.5:

These RDO ASTC encodes do not have any ultra-smooth block handling, because it's just something I put together in 15 minutes. If you look at the planet you can see the artifacts that are worse than they should be.

Next are larger blocks.

Friday, February 12, 2021

bc7e.ispc integrated into bc7enc_rdo

bc7e.ispc is a very powerful/fast 8 mode encoder. It supports the entire BC7 format, unlike bc7enc's default encoder. It's 2-3x faster than ispc_texcomp at the same average quality. Now that it's been combined with bc7enc_rdo you can do optional RDO BC7 encoding using this encoder, assuming you can tolerate the slower encode times.

This is now checked into the bc7enc_rdo repo.

Command: bc7enc xmen_1024.png -u6 -U -z1.0 -zc4096 -zm

4.53 bits/texel (Deflate), 37.696 dB PSNR

BC7 mode histogram:
0: 3753
1: 15475
2: 1029
3: 6803
4: 985
5: 2173
6: 35318
7: 0
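A histogram like this is easy to reproduce from raw BC7 data. Each block is 16 bytes, and the mode is the index of the lowest set bit in the first byte (mode 0 is bit 0, mode 7 is bit 7). The sketch below assumes `data` is a bare array of BC7 blocks with any file header already stripped; the function names are my own for illustration.

```python
def bc7_block_mode(block):
    """Return the BC7 mode (0-7) of a 16-byte block, or -1 if the first byte is 0 (reserved)."""
    first = block[0]
    if first == 0:
        return -1
    # Isolate the lowest set bit, then convert it to a bit index.
    return (first & -first).bit_length() - 1

def bc7_mode_histogram(data):
    """Count how many 16-byte blocks in data use each of the 8 BC7 modes."""
    hist = [0] * 8
    for off in range(0, len(data) - 15, 16):
        mode = bc7_block_mode(data[off:off + 16])
        if mode >= 0:
            hist[mode] += 1
    return hist

# Three hand-made blocks: one mode 0 (first byte 0x01), two mode 6 (first byte 0x40).
demo = bytes([0x01] + [0] * 15) + bytes([0x40] + [0] * 15) * 2
hist = bc7_mode_histogram(demo)
```

As a sanity check, the counts above sum to 65536, which is exactly (1024/4)^2 blocks, consistent with a 1024x1024 texture.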

Updated bc7enc_rdo with improved smooth block handling

The command line tool now detects extremely smooth blocks and encodes them with a significantly higher MSE scale factor. It computes a per-block mask image, filters it, then supplies an array of per-block MSE scale factors to the ERT. -zu disables this. 

The end result is much less noticeable artifacts in regions containing very smooth blocks (think gradients). This does hurt rate-distortion performance.
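Here's a rough sketch of the mask, filter, and scale pipeline described above. It is not the tool's code: the smoothness test (per-block value range), the 3x3 box filter, and the `smooth_range`/`smooth_scale` values are all assumptions picked to keep the example short.

```python
def block_mse_scales(pixels, width, height, smooth_range=2, smooth_scale=20.0):
    """Return a 2D list of per-4x4-block MSE scale factors.

    pixels: row-major grayscale values (0-255); width and height are multiples of 4.
    smooth_range and smooth_scale are hypothetical tuning values.
    """
    bw, bh = width // 4, height // 4

    # 1. Per-block mask: 1.0 where the block's value range is tiny (very smooth).
    mask = [[0.0] * bw for _ in range(bh)]
    for by in range(bh):
        for bx in range(bw):
            vals = [pixels[(by * 4 + y) * width + bx * 4 + x]
                    for y in range(4) for x in range(4)]
            if max(vals) - min(vals) <= smooth_range:
                mask[by][bx] = 1.0

    # 2. Filter the mask (3x3 box blur) so the scale ramps off around smooth regions.
    # 3. Map the filtered mask to a scale factor in [1, smooth_scale].
    scales = []
    for by in range(bh):
        row = []
        for bx in range(bw):
            total, count = 0.0, 0
            for ny in range(max(0, by - 1), min(bh, by + 2)):
                for nx in range(max(0, bx - 1), min(bw, bx + 2)):
                    total += mask[ny][nx]
                    count += 1
            row.append(1.0 + (smooth_scale - 1.0) * (total / count))
        scales.append(row)
    return scales

flat = [128] * 64                                             # 8x8 flat image
busy = [(x * 37 + y * 91) % 256 for y in range(8) for x in range(8)]
flat_scales = block_mse_scales(flat, 8, 8)
busy_scales = block_mse_scales(busy, 8, 8)
```

Flat blocks end up with the full scale factor, so distortion on them costs far more in the rate-distortion tradeoff, while busy blocks stay at 1.0.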

(The second image was resampled to 1/4th res for blogger.)

Thursday, February 11, 2021

Dirac video codec authors on Rate-Distortion Optimization

"This description makes RDO sound like a science: in fact it isn't and the reader will be pleased to learn that there is plenty of scope for engineering ad-hoc-ery of all kinds. This is because there are some practical problems in applying the procedure:"

"Perceptual fudge factors are therefore necessary in RDO in all types of coders." "There may be no common measure of distortion. For example: quantising a high-frequency subband is less visually objectionable than quantising a low-frequency subband, in general. So there is no direct comparison with the significance of the distortion produced in one subband with that produced in another. This can be overcome by perceptual weighting.."

In other words: yeah, it's a bit of a hack.

This is what I've found with RDO texture encoding. If you use the commonly talked about formula (j = D + lambda*R, where D is distortion and R is rate; optimize for min j) it's totally unusable (you'll get horrible distortion on flat/smooth blocks, which is like 80% of the blocks on some images/textures). Stock MSE doesn't work. You need something else.

Adding a linear scale to MSE kinda works (that's what bc7enc_rdo does) but you need ridiculous scales for some textures, which ruins R-D performance.
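To make that concrete, here's a toy version of the decision being made per block: pick the candidate encoding that minimizes distortion plus lambda times rate, with an optional linear scale on the MSE. The candidate names and numbers are made up for illustration and the bit counts are not real Deflate costs.

```python
def pick_candidate(candidates, lam, mse_scale=1.0):
    """Return the candidate minimizing j = mse_scale * mse + lam * bits.

    candidates: list of (name, mse, bits) tuples. mse_scale=1.0 is stock MSE.
    """
    return min(candidates, key=lambda c: mse_scale * c[1] + lam * c[2])

# A smooth block: both encodings have low MSE in absolute terms, but the
# cheaper one is the kind of error that shows up as visible banding.
candidates = [
    ("exact", 0.5, 128.0),          # low error, expensive to code
    ("reuse_neighbor", 4.0, 40.0),  # higher error, cheap to code
]

stock = pick_candidate(candidates, lam=0.1)                   # 13.3 vs 8.0
scaled = pick_candidate(candidates, lam=0.1, mse_scale=10.0)  # 17.8 vs 44.0
```

With stock MSE the cheap, visibly worse encoding wins on the smooth block. With the MSE scaled by 10 the exact encoding wins, but the same scale applied everywhere also blocks cheap encodings on blocks where nobody would have noticed, which is where the R-D performance goes.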

So if you take two RDO texture encoders, benchmark them, and look at just their PSNRs, you are possibly fooling yourself and others. One encoder with higher PSNR (and better R-D performance) may visually look worse than the other. It's part art, not all science.

With bc7enc_rdo, I wanted to open source *something* usable for most textures with out of the box settings, even though I knew that its smooth block handling needed work. Textures with skies like kodim03 are challenging to compress without manually increasing the smooth block factor. kodim23 is less challenging because its background has some noise.

Releasing something open source with decent performance that works OK on most textures is more important than perfection.