So far, it looks possible to unify a very strong subset of BC7 and ASTC 4x4. Such an encoder would be very useful, even if it didn't support rate distortion optimization. I've been looking at this problem on and off for months, and I'm convinced that there's something really interesting here.
First, it turns out there are 30 2-subset partition patterns in common between ASTC 4x4 and BC7:
This is a collection of very strong patterns. Considering ASTC's partition pattern generator is basically a fancy rand() function, this is surprisingly good! We've got almost half of BC7's 64 patterns in there!
Secondly, the way ASTC and BC7 convert indices (weights) to 6-bit scales, and the way they both interpolate endpoints is extremely similar. So similar that I believe ASTC indices could be converted directly to BC7's with little to no expensive per-pixel CPU work required.
Finally, I wrote a little app that examines hundreds of valid ASTC configurations, trying to find which configurations resemble the strongest and most useful BC7 modes. Here they are:
So basically, all the important things between BC7 and ASTC 4x4 are basically the same or similar enough. ASTC's endpoint ranges are all over the map, but that's fine because in most cases BC7's endpoint precision is actually higher than ASTC's. A unified encoder could just optimize for lowest error across *both* formats simultaneously, and output both ASTC and BC7 endpoints with the same selectors. Or, we could only output ASTC's endpoints and just scale them to BC7's.
The next step is to write an ASTC encoder that is limited to these 12 configs and see how strong it is. After this, I need to see if this ASTC texture data can be directly and quickly converted to BC7 texture data with little loss. (Without any recompression to BC7, i.e. we just convert and copy over the endpoints/partition table index/per-pixel selector indices and that's it.) So far, I think this is all possible.
If all this actually works, it'll give us the foundation we need to build the next version of .basis. We could quantize the unified ASTC/BC7 selector/endpoint data and store it in a ".basis2" file. We then can convert this high-quality texture data to the other formats using fast block encoders (like we do already with PVRTC1 RGB/RGBA and PVRTC2 RGBA).
We could even store hints in the .basis2 file that accelerate conversion to some formats. For example we could store optimized BC1 endpoints in the endpoint codebook. Or we could store the optimal ETC1 base color/table indices, etc. Determining the per-pixel selectors for these formats is cheap once you have this info.
I think that with a strong ASTC 4x4 12-mode encoder that supports perceptual colorspace metrics, we could actually beat (or get really close) to ispc_texcomp BC7's encoder (which only supports linear RGB metrics). I think this encoder would get within a few dB of max achievable BC7.
If the system's quality isn't high enough, we could always tack on more ASTC modes, as long as they can be easily transcoded to one of the BC7 modes without expensive operations.
It's too bad that BC7 isn't well supported in WebGL yet. The extensions are there, but the browser support isn't yet. I have no idea why as the format is basically ubiquitous on desktop GPU's now, and it's the highest quality LDR texture format available. For WebGL we still need very strong BC1-5 support for desktops until this situation changes.