Monday, January 27, 2020

UASTC1 block format encoding, revision 8

Important: This document is now outdated. The latest version is in the Basis Universal wiki, here.

UASTC1 (subsequently referred to as "UASTC" in this doc) is a 19-mode, 4x4 pixel, LDR-only subset of the ASTC specification with a simpler 128-bit block format. It can be quickly and losslessly transcoded to the standard ASTC block format, quickly transcoded to BC7 with very low quality loss (0.75 dB RGB PSNR on average), or re-encoded to high quality ETC1, ETC2 EAC A8, or BC1-5 with a small amount of per-pixel work. There are 9 opaque modes, 1 solid color mode, and 9 alpha modes. These UASTC modes each map to one of 6 BC7 modes (all except 0 and 4). Here's a good high-level overview of ASTC.

UASTC is the first high quality universal (or virtual) block-based texture format that supports block partitioning along with the ability to be efficiently transcoded to multiple GPU texture formats. Transcoding can be done either on the CPU or with a GPU compute shader. Transcoding UASTC to ASTC and BC7 does not involve any pixel-level operations. UASTC's fields directly correspond to ASTC's fields whenever possible.

See the previous post for a high-level description of each UASTC mode and encoder performance/features.

The UASTC block format has hint bits to accelerate transcoding to various texture formats. There are up to two BC1 hint bits per block which direct the UASTC->BC1 transcoder to reuse the UASTC endpoint and/or weight indices (appropriately scaled) for faster real-time compression. On average ~60% of UASTC blocks don't need PCA, and ~30% don't need real-time BC1 encoding at all. There are also hint bits to accelerate transcoding to high quality ETC1 and ETC2 EAC A8.

This post shows how each mode is laid out in a 128-bit UASTC block at the bit level. Bits are written starting from the beginning of the block (at the first byte's LSB) working "down" towards bit 128. The mode field is always first and is stored at bit 0 in the block (bit 0 of byte 0).
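As a minimal sketch of this bit ordering, the helper below reads an n-bit field starting at an arbitrary bit offset, LSB first. The name read_bits and its signature are hypothetical and only here for illustration; they aren't taken from any UASTC reference code.

```cpp
#include <cassert>
#include <cstdint>

// Reads num_bits (1..32) from a 16-byte UASTC block, starting at bit_offset.
// Bit 0 is the LSB of byte 0, and the field's first bit becomes the result's LSB.
// Hypothetical helper, for illustration only.
uint32_t read_bits(const uint8_t* block, uint32_t bit_offset, uint32_t num_bits)
{
    assert(num_bits >= 1 && num_bits <= 32 && bit_offset + num_bits <= 128);
    uint32_t result = 0;
    for (uint32_t i = 0; i < num_bits; i++)
    {
        const uint32_t bit = bit_offset + i;
        const uint32_t b = (block[bit >> 3] >> (bit & 7)) & 1u;
        result |= b << i;
    }
    return result;
}
```

A field listed as "field: bit_offset num_bits" in the mode layouts would then be fetched with read_bits(block, bit_offset, num_bits).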

This is a snapshot of the current encoding. This may change somewhat over the next few weeks.

Quality-wise, our UASTC encoder is stronger than Intel's ispc_texcomp ASTC encoder:


This is avg. RGBA (not RGB) PSNR across 34 RGB/RGBA test textures. Settings: UASTC encoder "slower" balanced profile, ispc_texcomp "astc_alpha_slow" profile, and for near-opt. BC7 we are using our commercial SIMD BC7 compressor (BC7E) in its "slow" profile (very similar to ispc_texcomp's BC7 "slower" profile).


Potential Future Additions


We may add a 4bpp BC1-like mode to give the encoder more rate distortion options on opaque blocks. This would go against the grain of the current design, which requires all modes to be valid ASTC configurations.

License


The UASTC specification and block format is explicitly not copyrighted by any entity, and to our knowledge is patent free. It may be used for any purpose whatsoever, including commercial purposes. The author of this work hereby waives all claim of copyright (economic and moral) in this work and immediately places it in the public domain; it may be used, distorted or destroyed in any manner whatsoever without further attribution or notice to the creator.

Visualizations of the Common ASTC/BC7 Partition Pattern Tables

UASTC supports the 2/3-subset partition patterns that BC7 and ASTC have in common. UASTC supports 60 of these common patterns, in three categories (2-subset, 3-subset, and 2-subset ASTC mapped to 3-subset BC7 with two endpoints set to equal vectors).  Here's a visualization of each category:

2-subset (30 patterns):



3-subset (11 patterns):



2 subset ASTC, 3 subset BC7 (19 patterns):




Another visualization of the 2/3-subset patterns, from a BC7 perspective (original pics from here). The patterns supported by UASTC are circled in purple. Note this doesn't include the mixed 2-subset ASTC/3-subset BC7 patterns:




Field Definitions:


Mode: Huffman coded mode index (2-7 bits). The mode index range is [0,19]. One mode index (19) is saved for future expansion. The first bit of the Huffman code is the LSB, which is stored in bit 0 byte 0 of the UASTC block.

The Huffman codes and code lengths for each mode index are:

{ 0x1, 4 }, { 0x35, 6 }, { 0x1D, 5 }, { 0x3, 5 },
{ 0x13, 5 }, { 0xB, 5 }, { 0x1B, 5 }, { 0x7, 5 },
{ 0x17, 5 }, { 0xF, 5 }, { 0x2, 3 }, { 0x0, 2 },
{ 0x6, 3 }, { 0x1F, 5 }, { 0xD, 5 }, { 0x5, 7 },
{ 0x15, 6 }, { 0x25, 6 }, { 0x9, 4 }, { 0x45, 7 } 

The following Huffman decoding acceleration table (created from the list of Huffman codes), when accessed with the lowest 7 bits of the first byte in the UASTC block, contains the corresponding UASTC mode index:

static const uint8_t g_uastc_huff_modes[128] =
{ 
11,0,10,3,11,15,12,7,11,18,10,5,11,14,12,9,11,0,10,4,11,16,12,8,11,18,10,6,11,2,12,13,11,0,10,3,11,17,12,7,11,18,10,5,11,14,12,9,11,0,10,4,11,1,12,8,11,18,10,6,11,2,12,13,11,0,10,3,11,19,12,7,11,18,10,5,11,14,12,9,11,0,10,4,11,16,12,8,11,18,10,6,11,2,12,13,11,0,10,3,11,17,12,7,11,18,10,5,11,14,12,9,11,0,10,4,11,1,12,8,11,18,10,6,11,2,12,13
};
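To show where the acceleration table comes from, here is a small sketch that decodes a mode index straight from the code list by matching the low bits of the first byte against each code. The struct and function names are made up for this example. Because the codes form a prefix code, exactly one entry matches any 7-bit input.

```cpp
#include <cassert>
#include <cstdint>

struct huff_code { uint8_t code; uint8_t len; };

// The Huffman codes and code lengths listed above, indexed by mode.
static const huff_code g_mode_codes[20] = {
    { 0x1, 4 }, { 0x35, 6 }, { 0x1D, 5 }, { 0x3, 5 },
    { 0x13, 5 }, { 0xB, 5 }, { 0x1B, 5 }, { 0x7, 5 },
    { 0x17, 5 }, { 0xF, 5 }, { 0x2, 3 }, { 0x0, 2 },
    { 0x6, 3 }, { 0x1F, 5 }, { 0xD, 5 }, { 0x5, 7 },
    { 0x15, 6 }, { 0x25, 6 }, { 0x9, 4 }, { 0x45, 7 }
};

// Returns the mode whose code equals the low `len` bits of low7, or -1 if none.
// Hypothetical helper; a real decoder would just index the 128-entry table.
int decode_mode(uint32_t low7)
{
    for (int m = 0; m < 20; m++)
    {
        const uint32_t mask = (1u << g_mode_codes[m].len) - 1;
        if ((low7 & mask) == g_mode_codes[m].code)
            return m;
    }
    return -1;
}
```

Calling decode_mode() for every value in [0,127] should reproduce g_uastc_huff_modes.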

BC1H0, BC1H1: BC1 transcoding acceleration hints:

BC1H0: If set the transcoder can scale the first subset's UASTC endpoints to BC1 (5,6,5) endpoints, and then scale (or copy) the UASTC weight indices to BC1 2-bit weights. This skips the expensive PCA and least squares steps involved in real-time BC1 encoding.

BC1H1: If set the transcoder scales (or copies) the UASTC weight indices to BC1 2-bit weights. Least squares (1 or 2 iterations) can then immediately be used to compute the BC1 endpoints. This skips the expensive PCA step.

All modes (except for solid color) have BC1H0, and most modes have BC1H1.

ETC1F, ETC1D, ETC1I0, ETC1I1: 8 bits of ETC1 transcode hints (flipped subblocks flag bit, differential encoding flag bit, intensity table index 0, intensity table index 1).

These hints are used by the transcoder to quickly create ETC1 blocks from the unpacked UASTC texels. To use them, the transcoder computes each 4x2 or 2x4 subblock's average color, quantizes them to 555:333 or 444:444 bits, then computes the selectors in luma space. No other work is necessary (because all the hard work was done in the UASTC encoder).

ETC1BIAS: A 5-bit field indicating how to bias each ETC1 subblock's computed block color. The encoder chooses the bias value which results in the lowest overall ETC1 error. See the ETC1 bias helper function near the end of this document.

ETC2TM: 8-bits of ETC2 EAC A8 transcode hints (4-bit table, 4-bit multiplier)

This is similar to how ETC1 blocks are packed, except these hints are for the alpha portion of ETC2 EAC A8 blocks. These bits are only present in modes 9-17 (the alpha modes).

ETQ: Packed endpoint trits/quints values. A simplified form of BISE is used in UASTC, see:
https://www.khronos.org/registry/DataFormat/specs/1.1/dataformat.1.1.html#astc-integer-sequence-encoding

See the "UASTC BISE Endpoint Ranges table" below for the # of trits or quints for each endpoint range. Some of the ranges don't have trits or quints, so there will be no ETQ fields.

We store the trits/quints first, followed by each value's bits. The bit interleaving and trit/quint rearranging and preprocessing in section 18.2 aren't used. Instead the encoded trits/quints are stored in UASTC as-is.

For quints, each encoded value is up to 7 bits: quint2*25+quint1*5+quint0. Trits are packed similarly, except each encoded value holds up to 5 trits in up to 8 bits. When the number of endpoint values isn't a multiple of 5 (trits) or 3 (quints), the final code uses the minimum # of bits necessary to represent the encoded value (to save bits).
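Here's a rough sketch of the packing described above: one function gives the size of an ETQ code for a (possibly partial) group, and another splits a code back into its digits. The function names are hypothetical, and the trit digit order (trit0 least significant) is an assumption that mirrors the quint formula.

```cpp
#include <cassert>
#include <cstdint>

// Minimum number of bits needed to store `count` packed trits (base 3) or
// quints (base 5), i.e. the smallest n with 2^n >= base^count.
uint32_t etq_code_bits(uint32_t base, uint32_t count)
{
    assert((base == 3 && count >= 1 && count <= 5) ||
           (base == 5 && count >= 1 && count <= 3));
    uint32_t combos = 1;
    for (uint32_t i = 0; i < count; i++)
        combos *= base;
    uint32_t bits = 0;
    while ((1u << bits) < combos)
        bits++;
    return bits;
}

// Splits a packed code into its digits, least significant digit first
// (digits[0] is quint0, or trit0 under the assumption stated above).
void etq_unpack(uint32_t code, uint32_t base, uint32_t count, uint8_t* digits)
{
    for (uint32_t i = 0; i < count; i++)
    {
        digits[i] = (uint8_t)(code % base);
        code /= base;
    }
}
```

This matches the ETQ sizes in the mode layouts below: for example, mode 3 has 18 trits, stored as three full 8-bit codes plus one 5-bit code holding the last 3 trits.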

EBITS: Endpoint bits (one set of bits per ASTC endpoint value). See the "UASTC BISE Endpoint Ranges table" below for the # of bits for each endpoint range. Endpoint order is the same as ASTC's: RL, RH, GL, GH, BL, BH, etc. Max of 18 values (RGB 3-subsets: 3*2*3).

To retrieve the endpoint values, you extract the trits/quints from the encoded ETQ values, shift each one left the appropriate number of bits (depending on the UASTC mode's endpoint range) and logically OR in the EBITS values.
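The combination step can be sketched in a couple of lines. The function name is invented for this example; for ranges without trits or quints, the digit is simply 0.

```cpp
#include <cassert>
#include <cstdint>

// Combines one trit/quint digit with its EBITS field:
// value = (digit << ebits_count) | ebits.
// Hypothetical helper; pass digit = 0 for ranges that have no trits/quints.
uint32_t make_endpoint_value(uint32_t digit, uint32_t ebits, uint32_t ebits_count)
{
    assert(ebits_count <= 8 && ebits < (1u << ebits_count));
    return (digit << ebits_count) | ebits;
}
```

For endpoint range 12 (3 bits plus a quint), the largest quint (4) with all EBITS set gives 39, the top of the 0-39 range.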

Endpoint values are a sequence of integers that must be dequantized to [0,255] by following the ASTC spec in section 18.13, see:
https://www.khronos.org/registry/DataFormat/specs/1.1/dataformat.1.1.html#astc-endpoint-unquantization
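For the UASTC endpoint ranges that use plain bits only (ranges 8, 11 and 20), ASTC unquantization is bit replication, which can be sketched as below. This deliberately leaves out the trit/quint ranges, which need the constants from the ASTC spec. The function name is an assumption for this example.

```cpp
#include <cassert>
#include <cstdint>

// Expands a `bits`-wide value to 8 bits by replicating its bit pattern.
// Only valid for endpoint ranges without trits or quints.
uint32_t unquant_bits_only(uint32_t v, uint32_t bits)
{
    assert(bits >= 1 && bits <= 8 && v < (1u << bits));
    uint32_t result = 0;
    // Place copies of v from the top of the byte downwards; the last copy
    // may be partially shifted out the bottom.
    for (int shift = 8 - (int)bits; shift > -(int)bits; shift -= (int)bits)
        result |= (shift >= 0) ? (v << shift) : (v >> -shift);
    return result & 0xFF;
}
```

For 4-bit values this reduces to (v << 4) | v, and for 5-bit values to (v << 3) | (v >> 2).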

WEIGHTS: Encoded weight indices. Just like BC7, the weight index at each subset's "anchor" texel always has an MSB of 0, so these weights can be encoded with one less bit than the others. (UASTC doesn't use Blue Contraction, so the endpoints can be freely swapped to make this trick work.)

Weights are always encoded as plain bits (no BISE necessary). Weight ordering is the same as ASTC's (raster order, left to right/top to bottom scanline). In dual plane mode, the ordering is also ASTC's: p0 p1, p0 p1, p0 p1, etc. (two weight indices per texel).
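A minimal sketch of how an encoder might enforce the anchor rule in the simplest case (one subset, one plane), assuming endpoints are stored as low/high pairs in ASTC order. The function name and parameter layout are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Single-subset, single-plane case: weights[0] is the anchor.
// If its MSB is set, swap each low/high endpoint pair and invert every
// weight (w -> max - w), which selects the same interpolated colors.
void fix_anchor_msb(uint8_t* weights, uint32_t num_weights, uint32_t weight_bits,
                    uint8_t* endpoints, uint32_t num_comps)
{
    assert(weight_bits >= 1 && weight_bits <= 5);
    const uint32_t max_w = (1u << weight_bits) - 1;
    if ((weights[0] >> (weight_bits - 1)) == 0)
        return; // anchor MSB is already 0
    for (uint32_t c = 0; c < num_comps; c++)
        std::swap(endpoints[c * 2], endpoints[c * 2 + 1]); // RL<->RH, GL<->GH, ...
    for (uint32_t i = 0; i < num_weights; i++)
        weights[i] = (uint8_t)(max_w - weights[i]);
}
```

After this step the anchor weight fits in weight_bits - 1 bits, so its MSB doesn't need to be stored.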

The weights are dequantized to 6-bit interpolation values in the same way as ASTC's:
https://www.khronos.org/registry/DataFormat/specs/1.1/dataformat.1.1.html#_weight_unquantization

And the endpoints are interpolated in the same way as ASTC's:
https://www.khronos.org/registry/DataFormat/specs/1.1/dataformat.1.1.html#astc_weight_application
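As a sketch of the two interpolation styles, assuming LDR non-sRGB decoding where the ASTC result is reduced to 8 bits by keeping the top byte (that final step is an assumption about the output path, not something this post specifies):

```cpp
#include <cassert>
#include <cstdint>

// ASTC-style: expand the 8-bit endpoints to 16 bits by replication,
// interpolate with a [0,64] weight and rounding, then keep the top 8 bits.
uint32_t astc_interpolate(uint32_t l, uint32_t h, uint32_t w)
{
    assert(l <= 255 && h <= 255 && w <= 64);
    l = (l << 8) | l;
    h = (h << 8) | h;
    const uint32_t k = (l * (64 - w) + h * w + 32) >> 6;
    return k >> 8;
}

// BC7-style: the same formula applied directly to the 8-bit endpoints.
uint32_t bc7_interpolate(uint32_t l, uint32_t h, uint32_t w)
{
    assert(l <= 255 && h <= 255 && w <= 64);
    return (l * (64 - w) + h * w + 32) >> 6;
}
```

Both function names are made up for this example. The two results usually agree or differ by one, which is part of why the ASTC to BC7 mapping loses so little.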

PAT: Index into the common BC7/ASTC partition pattern table. This table contains BC7 pattern indices, ASTC pattern seeds, and permutation/flip flags which indicate how to map ASTC pattern subset indices to BC7's. There are three tables and 60 total partition patterns.

A UASTC decoder can either use ASTC's partition pattern generator or BC7's partition tables. To map ASTC's partition patterns to BC7's, the pattern subset indices are either used as-is, inverted, permuted, and/or combined to get BC7 partition pattern subset indices (see the tables/example code at the very bottom). These simple transformations correspond to changing the order of the encoded BC7 endpoints, or setting 2 endpoints in a 3-subset BC7 block to the same color/alpha values. Every ASTC pattern included in the below common tables maps to a BC7 pattern without loss (i.e. there is no subset "crosstalk" when mapping a UASTC to a BC7 pattern).

COMPSEL: ASTC's Color Component Selector (CCS) field. Only present in Dual Plane modes.
This maps to BC7 mode 5's 2-bit component rotation field. The CCS value must be remapped and the endpoint RGBA components reordered when transcoding to BC7.

ASTC and BC7 handle dual plane mode component rotations slightly differently. In ASTC, if the CCS field is 0 for red, the red component still appears in its usual position in the endpoint values, and the decoder uses the 2nd plane to separately interpolate the red (or green, or blue, etc.) channel. In BC7, if the component rotation field is 1 (red), the red component is swapped with alpha at encode time, the alpha component is always interpolated with the 2nd plane's weight indices by the decoder, and the components are then swapped after endpoint interpolation. To losslessly transcode ASTC dual plane modes to BC7, you have to swap the appropriate ASTC endpoint channel with the alpha channel.
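The remapping can be sketched as follows for the RGBA case, assuming ASTC's CCS numbering (0=R, 1=G, 2=B, 3=A) and BC7's rotation numbering (0=none, 1=R/A swap, 2=G/A swap, 3=B/A swap). Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Maps the ASTC CCS value to BC7 mode 5's rotation field.
uint32_t ccs_to_bc7_rotation(uint32_t ccs)
{
    assert(ccs < 4);
    return (ccs + 1) & 3;
}

// Swaps the CCS-selected channel's endpoints with alpha's.
// Endpoints are in ASTC order: RL RH GL GH BL BH AL AH.
void swap_ccs_with_alpha(uint8_t endpoints[8], uint32_t ccs)
{
    assert(ccs < 4);
    if (ccs == 3)
        return; // alpha already uses the 2nd plane, nothing to swap
    std::swap(endpoints[ccs * 2 + 0], endpoints[3 * 2 + 0]);
    std::swap(endpoints[ccs * 2 + 1], endpoints[3 * 2 + 1]);
}
```

With the endpoints swapped this way, the second plane's weights line up with the channel that BC7 stores in its alpha slot.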

Other notes:


- The number of color components is 2 for modes [15-17], 3 for modes [0,7] and 18, or 4 for modes [8,14].
- The number of subsets is [1,3].
- The total number of endpoint values is num_comps * 2 * num_subsets.
- The number of planes is 1 or 2.
- The total number of weight values is either 16 (non-dual plane modes) or 32 (dual plane modes).
- Dual plane modes always have 1 subset in UASTC.
- For compatibility with BC7, BISE is not used at all for weight indices, only endpoints. Weight indices are always 1-5-bits.
- Various endpoint value ordering examples (UASTC and ASTC use the same endpoint orderings):
1 subset LA: LL0 LH0 AL0 AH0
1 subset RGB: RL0 RH0 GL0 GH0 BL0 BH0
1 subset RGBA: RL0 RH0 GL0 GH0 BL0 BH0 AL0 AH0

2 subset LA: LL0 LH0 AL0 AH0 LL1 LH1 AL1 AH1
2 subset RGB: RL0 RH0 GL0 GH0 BL0 BH0 RL1 RH1 GL1 GH1 BL1 BH1
2 subset RGBA: RL0 RH0 GL0 GH0 BL0 BH0 AL0 AH0 RL1 RH1 GL1 GH1 BL1 BH1 AL1 AH1

In dual plane mode, the UASTC components are NOT reordered like they would be in BC7. The COMPSEL field corresponds to the ASTC CCS field, which indicates which color component to separately interpolate with the 2nd plane weight indices:

Dual plane RGB: RL0 RH0 GL0 GH0 BL0 BH0
Dual plane RGBA: RL0 RH0 GL0 GH0 BL0 BH0 AL0 AH0

- Transcoding UASTC->ASTC is always a 100% lossless operation. The endpoints may need to be swapped (and the corresponding weight indices inverted) to disable blue contraction, but this is always a lossless transformation.
- The primary source of loss when transcoding UASTC->BC7 is mapping UASTC endpoints to BC7 endpoints. This is done using a simple scale with optional optimal p-bit computation. The UASTC weight indices are either copied as-is, or converted to the closest corresponding BC7 weight indices using a lookup table. The partition patterns are lossless, the weight tables are the same for 2/3-bits and very similar for 4-bits, and the endpoint interpolation method is nearly the same (16-bits in UASTC/ASTC, 8-bits with BC7, and both formats use [0,64] weights with rounding in the linear interpolation).
- Unlike ASTC, the weights are not stored in reverse bit order starting from the end of the block. Instead they are stored immediately following the endpoint bits in regular (LSB first) bit order.
- The ASTC CEM field(s) are either 4 (LA Direct) for modes 15-17, 8 (RGB Direct) for modes 0-7 and 18, or 12 (RGBA Direct) for modes 9-14. Blue Contraction isn't supported (i.e. the UASTC endpoints can be in arbitrary order, which we exploit to free up index bits like BC7 does). Mode 8 is void-extent.
- Weight index packing is similar to BC7's: Each subset's endpoints are swapped as necessary so the first weight index (that uses that subset) MSB is 0. If the mode uses a single subset, the first weight index MSB must be 0. For multiple subset modes, the first weight index written for each subset in the partition pattern must have an MSB of 0. The "anchor" texel indices for each pattern can be precomputed and stored in a table, like with BC7. This saves 1, 2, or 3 bits in the packed block which can be repurposed for other uses. In dual plane modes (which are always one subset), the first two weight indices must have an MSB of 0.
- We have not attempted to optimize the block format for efficient hardware RTL implementation.
- The LA Direct modes (15-17) must be transcoded to alpha BC7 modes using a "LLLA" swizzle.
- In the dual plane LA Direct mode (17), the CCS field is always 3.
- UASTC modes 4 and 7 are both 2 subset modes using 2-bit weights and endpoint range 12 (40 levels), so they have the same block encoding. Mode 4 uses the 2 subset partition pattern tables, and mode 7 uses the 2-subset ASTC/3-subset BC7 tables.
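The counting rules in these notes can be checked with a small sketch. The function names are made up for this example; the anchor count assumes one anchor per subset per plane, which matches the notes above since dual plane modes always have one subset.

```cpp
#include <cassert>
#include <cstdint>

// Total number of endpoint values: num_comps * 2 * num_subsets.
uint32_t total_endpoint_values(uint32_t num_comps, uint32_t num_subsets)
{
    return num_comps * 2 * num_subsets;
}

// Size of the packed WEIGHTS field: 16 weights per plane, minus one bit
// for each anchor weight (whose MSB is known to be 0).
uint32_t packed_weight_bits(uint32_t weight_bits, uint32_t num_subsets, uint32_t num_planes)
{
    assert(num_planes == 1 || (num_planes == 2 && num_subsets == 1));
    const uint32_t num_weights = 16 * num_planes;
    const uint32_t num_anchors = num_subsets * num_planes;
    return num_weights * weight_bits - num_anchors;
}
```

These reproduce the "weight bits" totals in the mode layouts, e.g. 63 for mode 0 (4-bit weights, 1 subset) and 62 for mode 6 (2-bit weights, dual plane).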

Modes:

The format of "WeightRange" and "EndpointRange" is Range: Index (# of Quant Levels).
Format is "field: bit_offset num_bits"

**** Mode: 0 (CEM 8 - RGB Direct)
DualPlane: 0, WeightRange: 8 (16), Subsets: 1, EndpointRange: 19 (192) - BC7 MODE 6 RGB

Mode: 0 4
BC1H0: 4 1
BC1H1: 5 1
ETC1F: 6 1
ETC1D: 7 1
ETC1I0: 8 3
ETC1I1: 11 3
ETC1BIAS: 14 5
ETQ: 19 8
ETQ: 27 2
EBITS: 29 6
EBITS: 35 6
EBITS: 41 6
EBITS: 47 6
EBITS: 53 6
EBITS: 59 6
WEIGHTS: 65 63
Total bits: 128, endpoint bits: 46, weight bits: 63

**** Mode: 1 (CEM 8 - RGB Direct)
DualPlane: 0, WeightRange: 2 (4), Subsets: 1, EndpointRange: 20 (256) - BC7 MODE 3

Mode: 0 6
BC1H0: 6 1
BC1H1: 7 1
ETC1F: 8 1
ETC1D: 9 1
ETC1I0: 10 3
ETC1I1: 13 3
ETC1BIAS: 16 5
EBITS: 21 8
EBITS: 29 8
EBITS: 37 8
EBITS: 45 8
EBITS: 53 8
EBITS: 61 8
WEIGHTS: 69 31
Total bits: 100, endpoint bits: 48, weight bits: 31

**** Mode: 2 (CEM 8 - RGB Direct)
DualPlane: 0, WeightRange: 5 (8), Subsets: 2, EndpointRange: 8 (16) - BC7 MODE 1

Mode: 0 5
BC1H0: 5 1
BC1H1: 6 1
ETC1F: 7 1
ETC1D: 8 1
ETC1I0: 9 3
ETC1I1: 12 3
ETC1BIAS: 15 5
PAT: 20 5
EBITS: 25 4
EBITS: 29 4
EBITS: 33 4
EBITS: 37 4
EBITS: 41 4
EBITS: 45 4
EBITS: 49 4
EBITS: 53 4
EBITS: 57 4
EBITS: 61 4
EBITS: 65 4
EBITS: 69 4
WEIGHTS: 73 46
Total bits: 119, endpoint bits: 48, weight bits: 46

**** Mode: 3 (CEM 8 - RGB Direct)
DualPlane: 0, WeightRange: 2 (4), Subsets: 3, EndpointRange: 7 (12) - BC7 MODE 2

Mode: 0 5
BC1H0: 5 1
BC1H1: 6 1
ETC1F: 7 1
ETC1D: 8 1
ETC1I0: 9 3
ETC1I1: 12 3
ETC1BIAS: 15 5
PAT: 20 4
ETQ: 24 8
ETQ: 32 8
ETQ: 40 8
ETQ: 48 5
EBITS: 53 2
EBITS: 55 2
EBITS: 57 2
EBITS: 59 2
EBITS: 61 2
EBITS: 63 2
EBITS: 65 2
EBITS: 67 2
EBITS: 69 2
EBITS: 71 2
EBITS: 73 2
EBITS: 75 2
EBITS: 77 2
EBITS: 79 2
EBITS: 81 2
EBITS: 83 2
EBITS: 85 2
EBITS: 87 2
WEIGHTS: 89 29
Total bits: 118, endpoint bits: 65, weight bits: 29

**** Mode: 4 (CEM 8 - RGB Direct)
DualPlane: 0, WeightRange: 2 (4), Subsets: 2, EndpointRange: 12 (40) - BC7 MODE 3

Mode: 0 5
BC1H0: 5 1
BC1H1: 6 1
ETC1F: 7 1
ETC1D: 8 1
ETC1I0: 9 3
ETC1I1: 12 3
ETC1BIAS: 15 5
PAT: 20 5
ETQ: 25 7
ETQ: 32 7
ETQ: 39 7
ETQ: 46 7
EBITS: 53 3
EBITS: 56 3
EBITS: 59 3
EBITS: 62 3
EBITS: 65 3
EBITS: 68 3
EBITS: 71 3
EBITS: 74 3
EBITS: 77 3
EBITS: 80 3
EBITS: 83 3
EBITS: 86 3
WEIGHTS: 89 30
Total bits: 119, endpoint bits: 64, weight bits: 30

**** Mode: 5 (CEM 8 - RGB Direct)
DualPlane: 0, WeightRange: 5 (8), Subsets: 1, EndpointRange: 20 (256) - BC7 MODE 6 RGB

Mode: 0 5
BC1H0: 5 1
BC1H1: 6 1
ETC1F: 7 1
ETC1D: 8 1
ETC1I0: 9 3
ETC1I1: 12 3
ETC1BIAS: 15 5
EBITS: 20 8
EBITS: 28 8
EBITS: 36 8
EBITS: 44 8
EBITS: 52 8
EBITS: 60 8
WEIGHTS: 68 47
Total bits: 115, endpoint bits: 48, weight bits: 47

**** Mode: 6 (CEM 8 - RGB Direct)
DualPlane: 1, WeightRange: 2 (4), Subsets: 1, EndpointRange: 18 (160) - BC7 MODE 5 RGB

Mode: 0 5
BC1H0: 5 1
BC1H1: 6 1
ETC1F: 7 1
ETC1D: 8 1
ETC1I0: 9 3
ETC1I1: 12 3
ETC1BIAS: 15 5
COMPSEL: 20 2
ETQ: 22 7
ETQ: 29 7
EBITS: 36 5
EBITS: 41 5
EBITS: 46 5
EBITS: 51 5
EBITS: 56 5
EBITS: 61 5
WEIGHTS: 66 62
Total bits: 128, endpoint bits: 44, weight bits: 62

**** Mode: 7 (CEM 8 - RGB Direct)
DualPlane: 0, WeightRange: 2 (4), Subsets: 2, EndpointRange: 12 (40) - BC7 MODE 2

Mode: 0 5
BC1H0: 5 1
BC1H1: 6 1
ETC1F: 7 1
ETC1D: 8 1
ETC1I0: 9 3
ETC1I1: 12 3
ETC1BIAS: 15 5
PAT: 20 5
ETQ: 25 7
ETQ: 32 7
ETQ: 39 7
ETQ: 46 7
EBITS: 53 3
EBITS: 56 3
EBITS: 59 3
EBITS: 62 3
EBITS: 65 3
EBITS: 68 3
EBITS: 71 3
EBITS: 74 3
EBITS: 77 3
EBITS: 80 3
EBITS: 83 3
EBITS: 86 3
WEIGHTS: 89 30
Total bits: 119, endpoint bits: 64, weight bits: 30

**** Mode: 8 (Void-Extent)
Void-Extent: Solid Color RGBA (BC7 MODE 5 or MODE 6)

Mode: 0 5
R: 5 8
G: 13 8
B: 21 8
A: 29 8
ETC1D: 37 1
ETC1I: 38 3
ETC1S: 41 2
ETC1R: 43 5
ETC1G: 48 5
ETC1B: 53 5

**** Mode: 9 (CEM 12 - RGBA Direct)
DualPlane: 0, WeightRange: 2 (4), Subsets: 2, EndpointRange: 8 (16) - BC7 MODE 7

Mode: 0 5
BC1H0: 5 1
BC1H1: 6 1
ETC1F: 7 1
ETC1D: 8 1
ETC1I0: 9 3
ETC1I1: 12 3
ETC1BIAS: 15 5
ETC2TM: 20 8
PAT: 28 5
EBITS: 33 4
EBITS: 37 4
EBITS: 41 4
EBITS: 45 4
EBITS: 49 4
EBITS: 53 4
EBITS: 57 4
EBITS: 61 4
EBITS: 65 4
EBITS: 69 4
EBITS: 73 4
EBITS: 77 4
EBITS: 81 4
EBITS: 85 4
EBITS: 89 4
EBITS: 93 4
WEIGHTS: 97 30
Total bits: 127, endpoint bits: 64, weight bits: 30

**** Mode: 10 (CEM 12 - RGBA Direct)
DualPlane: 0, WeightRange: 8 (16), Subsets: 1, EndpointRange: 13 (48) - BC7 MODE 6

Mode: 0 3
BC1H0: 3 1
ETC1F: 4 1
ETC1D: 5 1
ETC1I0: 6 3
ETC1I1: 9 3
ETC2TM: 12 8
ETQ: 20 8
ETQ: 28 5
EBITS: 33 4
EBITS: 37 4
EBITS: 41 4
EBITS: 45 4
EBITS: 49 4
EBITS: 53 4
EBITS: 57 4
EBITS: 61 4
WEIGHTS: 65 63
Total bits: 128, endpoint bits: 45, weight bits: 63

**** Mode: 11 (CEM 12 - RGBA Direct)
DualPlane: 1, WeightRange: 2 (4), Subsets: 1, EndpointRange: 13 (48) - BC7 MODE 5

Mode: 0 2
BC1H0: 2 1
ETC1F: 3 1
ETC1D: 4 1
ETC1I0: 5 3
ETC1I1: 8 3
ETC2TM: 11 8
COMPSEL: 19 2
ETQ: 21 8
ETQ: 29 5
EBITS: 34 4
EBITS: 38 4
EBITS: 42 4
EBITS: 46 4
EBITS: 50 4
EBITS: 54 4
EBITS: 58 4
EBITS: 62 4
WEIGHTS: 66 62
Total bits: 128, endpoint bits: 45, weight bits: 62

**** Mode: 12 (CEM 12 - RGBA Direct)
DualPlane: 0, WeightRange: 5 (8), Subsets: 1, EndpointRange: 19 (192) - BC7 MODE 6

Mode: 0 3
BC1H0: 3 1
ETC1F: 4 1
ETC1D: 5 1
ETC1I0: 6 3
ETC1I1: 9 3
ETC2TM: 12 8
ETQ: 20 8
ETQ: 28 5
EBITS: 33 6
EBITS: 39 6
EBITS: 45 6
EBITS: 51 6
EBITS: 57 6
EBITS: 63 6
EBITS: 69 6
EBITS: 75 6
WEIGHTS: 81 47
Total bits: 128, endpoint bits: 61, weight bits: 47

**** Mode: 13 (CEM 12 - RGBA Direct)
DualPlane: 1, WeightRange: 0 (2), Subsets: 1, EndpointRange: 20 (256) - BC7 MODE 5

Mode: 0 5
BC1H0: 5 1
BC1H1: 6 1
ETC1F: 7 1
ETC1D: 8 1
ETC1I0: 9 3
ETC1I1: 12 3
ETC1BIAS: 15 5
ETC2TM: 20 8
COMPSEL: 28 2
EBITS: 30 8
EBITS: 38 8
EBITS: 46 8
EBITS: 54 8
EBITS: 62 8
EBITS: 70 8
EBITS: 78 8
EBITS: 86 8
WEIGHTS: 94 30
Total bits: 124, endpoint bits: 64, weight bits: 30

**** Mode: 14 (CEM 12 - RGBA Direct)
DualPlane: 0, WeightRange: 2 (4), Subsets: 1, EndpointRange: 20 (256) - BC7 MODE 6

Mode: 0 5
BC1H0: 5 1
BC1H1: 6 1
ETC1F: 7 1
ETC1D: 8 1
ETC1I0: 9 3
ETC1I1: 12 3
ETC1BIAS: 15 5
ETC2TM: 20 8
EBITS: 28 8
EBITS: 36 8
EBITS: 44 8
EBITS: 52 8
EBITS: 60 8
EBITS: 68 8
EBITS: 76 8
EBITS: 84 8
WEIGHTS: 92 31
Total bits: 123, endpoint bits: 64, weight bits: 31

**** Mode: 15 (CEM 4 - LA Direct)
DualPlane: 0, WeightRange: 8 (16), Subsets: 1, EndpointRange: 20 (256) - BC7 MODE 6

Mode: 0 7
BC1H0: 7 1
BC1H1: 8 1
ETC1F: 9 1
ETC1D: 10 1
ETC1I0: 11 3
ETC1I1: 14 3
ETC1BIAS: 17 5
ETC2TM: 22 8
EBITS: 30 8
EBITS: 38 8
EBITS: 46 8
EBITS: 54 8
WEIGHTS: 62 63
Total bits: 125, endpoint bits: 32, weight bits: 63

**** Mode: 16 (CEM 4 - LA Direct)
DualPlane: 0, WeightRange: 2 (4), Subsets: 2, EndpointRange: 20 (256) - BC7 MODE 7

Mode: 0 6
BC1H0: 6 1
BC1H1: 7 1
ETC1F: 8 1
ETC1D: 9 1
ETC1I0: 10 3
ETC1I1: 13 3
ETC1BIAS: 16 5
ETC2TM: 21 8
PAT: 29 5
EBITS: 34 8
EBITS: 42 8
EBITS: 50 8
EBITS: 58 8
EBITS: 66 8
EBITS: 74 8
EBITS: 82 8
EBITS: 90 8
WEIGHTS: 98 30
Total bits: 128, endpoint bits: 64, weight bits: 30

**** Mode: 17 (CEM 4 - LA Direct)
DualPlane: 1, WeightRange: 2 (4), Subsets: 1, EndpointRange: 20 (256) - BC7 MODE 5

Mode: 0 6
BC1H0: 6 1
BC1H1: 7 1
ETC1F: 8 1
ETC1D: 9 1
ETC1I0: 10 3
ETC1I1: 13 3
ETC1BIAS: 16 5
ETC2TM: 21 8
EBITS: 29 8
EBITS: 37 8
EBITS: 45 8
EBITS: 53 8
WEIGHTS: 61 62
Total bits: 123, endpoint bits: 32, weight bits: 62

**** Mode: 18 (CEM 8 - RGB Direct)
DualPlane: 0, WeightRange: 11 (32), Subsets: 1, EndpointRange: 11 (32) - BC7 MODE 6

Mode: 0 4
BC1H0: 4 1
BC1H1: 5 1
ETC1F: 6 1
ETC1D: 7 1
ETC1I0: 8 3
ETC1I1: 11 3
ETC1BIAS: 14 5
EBITS: 19 5
EBITS: 24 5
EBITS: 29 5
EBITS: 34 5
EBITS: 39 5
EBITS: 44 5
WEIGHTS: 49 79
Total bits: 128, endpoint bits: 30, weight bits: 79



UASTC/ASTC BISE Ranges Table


Range Bits Trits Quints Levels
0     1    0     0      0-1
1     0    1     0      0-2
2     2    0     0      0-3
                        
3     0    0     1      0-4
4     1    1     0      0-5
5     3    0     0      0-7
                        
6     1    0     1      0-9
7     2    1     0      0-11
8     4    0     0      0-15
                        
9     2    0     1      0-19
10    3    1     0      0-23
11    5    0     0      0-31
                        
12    3    0     1      0-39
13    4    1     0      0-47
14    6    0     0      0-63
                        
15    4    0     1      0-79
16    5    1     0      0-95
17    7    0     0      0-127
                        
18    5    0     1      0-159
19    6    1     0      0-191

20    8    0     0      0-255
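The number of levels in each row follows directly from the Bits/Trits/Quints columns, as this small sketch shows (the function name is made up for this example):

```cpp
#include <cassert>
#include <cstdint>

// Number of quantization levels for a BISE range: 2^bits, times 3 if the
// range has a trit, or times 5 if it has a quint.
uint32_t bise_levels(uint32_t bits, uint32_t trits, uint32_t quints)
{
    assert(bits <= 8 && trits + quints <= 1);
    const uint32_t base = trits ? 3u : (quints ? 5u : 1u);
    return base << bits;
}
```

So range 19 (6 bits plus a trit) has 192 levels, i.e. values 0-191.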


UASTC BISE Weight Index Ranges table:

Range    Bits Trits Quints       UASTC Modes                  Levels
0        1                       13                           2
2        2                       1 3 4 6 7 9 11 14 16 17      4
5        3                       2 5 12                       8
8        4                       0 10 15                      16
11       5                       18                           32

(UASTC weights do not use trits or quints.)


UASTC BISE Endpoint Ranges table:

Range    Bits Trits Quints       UASTC Modes           Quant. Levels
7        2    1                  3                     12
8        4                       2 9                   16
11       5                       18                    32
12       3          1            4 7                   40
13       4    1                  10 11                 48
18       5          1            6                     160
19       6    1                  0 12                  192
20       8                       1 5 13 14 15 16 17    256


UASTC/BC7 2-subset partition pattern table:


const uint32_t TOTAL_ASTC_BC7_COMMON_PARTITIONS2 = 30;

struct
{
  int m_bc7_pattern;
  int m_astc_seed;
// if true, invert the BC7 pattern's subset index to match ASTC's subset index
  bool m_invert;
} g_uastc_bc7_common_partitions2[TOTAL_ASTC_BC7_COMMON_PARTITIONS2] =

{
  { 0, 28, false  }, { 1, 20, false }, { 2, 16, true }, { 3, 29, false },
  { 4, 91, true }, { 5, 9, false }, { 6, 107, true }, { 7, 72, true },
  { 8, 149, false }, { 9, 204, true }, { 10, 50, false }, { 11, 114, true },
  { 12, 496, true }, { 13, 17, true }, { 14, 78, false }, { 15, 39, true }, 
  { 17, 252, true }, { 18, 828, true }, { 19, 43, false }, { 20, 156, false }, 
  { 21, 116, false }, { 22, 210, true }, { 23, 476, true }, { 24, 273, false },
  { 25, 684, true }, { 26, 359, false }, { 29, 246, true }, { 32, 195, true },
  { 33, 694, true }, { 52, 524, true }
};

// UASTC pattern table for the 2-subset modes
const uint8_t g_uastc_patterns2[TOTAL_ASTC_BC7_COMMON_PARTITIONS2][16] =
{
   { 0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1 }, { 0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1 }, 
   { 1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0 }, { 0,0,0,1,0,0,1,1,0,0,1,1,0,1,1,1 },
   { 1,1,1,1,1,1,1,0,1,1,1,0,1,1,0,0 }, { 0,0,1,1,0,1,1,1,0,1,1,1,1,1,1,1 }, 
   { 1,1,1,0,1,1,0,0,1,0,0,0,0,0,0,0 }, { 1,1,1,1,1,1,1,0,1,1,0,0,1,0,0,0 },
   { 0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1 }, { 1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0 }, 
   { 0,0,0,0,0,0,0,1,0,1,1,1,1,1,1,1 }, { 1,1,1,1,1,1,1,1,1,1,1,0,1,0,0,0 },
   { 1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0 }, { 1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0 }, 
   { 0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1 }, { 1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0 },
   { 1,0,0,0,1,1,1,0,1,1,1,1,1,1,1,1 }, { 1,1,1,1,1,1,1,1,0,1,1,1,0,0,0,1 }, 
   { 0,1,1,1,0,0,1,1,0,0,0,1,0,0,0,0 }, { 0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0 },
   { 0,0,0,0,1,0,0,0,1,1,0,0,1,1,1,0 }, { 1,1,1,1,1,1,1,1,0,1,1,1,0,0,1,1 }, 
   { 1,0,0,0,1,1,0,0,1,1,0,0,1,1,1,0 }, { 0,0,1,1,0,0,0,1,0,0,0,1,0,0,0,0 },
   { 1,1,1,1,0,1,1,1,0,1,1,1,0,0,1,1 }, { 0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0 }, 
   { 1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1 }, { 1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0 },
   { 1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0 }, { 1,0,0,1,0,0,1,1,0,1,1,0,1,1,0,0 }

};
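A sketch of how the m_invert flag might be applied when going from a UASTC 2-subset pattern to BC7 subset indices. The direction is inferred from the struct comment and from the fact that BC7 patterns always put texel 0 in subset 0; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Converts a UASTC 2-subset pattern to BC7 subset indices by flipping
// every index when the table entry's m_invert flag is set.
void uastc2_to_bc7_subsets(const uint8_t uastc_pattern[16], bool invert,
                           uint8_t bc7_pattern[16])
{
    for (int i = 0; i < 16; i++)
    {
        assert(uastc_pattern[i] <= 1);
        bc7_pattern[i] = (uint8_t)(uastc_pattern[i] ^ (invert ? 1 : 0));
    }
}
```

For example, the third entry above ({ 2, 16, true }) has a UASTC pattern starting 1,0,0,0, which becomes 0,1,1,1 on the BC7 side. Inverting the subset indices corresponds to swapping the order of the two subsets' endpoints in the BC7 block.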


UASTC/BC7 3-subset partition pattern table:


const uint32_t TOTAL_ASTC_BC7_COMMON_PARTITIONS3 = 11;

struct
{
  uint8_t m_bc7;
  uint16_t m_astc;

// maps ASTC to BC7 subset indices using g_astc_bc7_subset_index_perm_tables[][]
  uint8_t m_astc_to_bc7_perm;
} g_uastc_bc7_common_partitions3[TOTAL_ASTC_BC7_COMMON_PARTITIONS3] =
{
  { 4, 260, 0 },  { 8, 74, 5 },  { 9, 32, 5 },  { 10, 156, 2 },
  { 11, 183, 2 },  { 12, 15, 0 },  { 13, 745, 4 },  { 20, 0, 1 },
  { 35, 335, 1 },  { 36, 902, 5 },  { 57, 254, 0 }
};


const uint8_t g_astc_bc7_subset_index_perm_tables[6][3] = 
{
{ 0, 1, 2 }, { 1, 2, 0 }, { 2, 0, 1 }, { 2, 1, 0 }, { 0, 2, 1 }, { 1, 0, 2 }
};

// UASTC pattern table for the 3-subset modes
const uint8_t g_uastc_patterns3[TOTAL_ASTC_BC7_COMMON_PARTITIONS3][16] =
{
   { 0,0,0,0,0,0,0,0,1,1,2,2,1,1,2,2 }, { 1,1,1,1,1,1,1,1,0,0,0,0,2,2,2,2 }, 
   { 1,1,1,1,0,0,0,0,0,0,0,0,2,2,2,2 }, { 1,1,1,1,2,2,2,2,0,0,0,0,0,0,0,0 },
   { 1,1,2,0,1,1,2,0,1,1,2,0,1,1,2,0 }, { 0,1,1,2,0,1,1,2,0,1,1,2,0,1,1,2 }, 
   { 0,2,1,1,0,2,1,1,0,2,1,1,0,2,1,1 }, { 2,0,0,0,2,0,0,0,2,1,1,1,2,1,1,1 },
   { 2,0,1,2,2,0,1,2,2,0,1,2,2,0,1,2 }, { 1,1,1,1,0,0,0,0,2,2,2,2,1,1,1,1 }, 
   { 0,0,2,2,0,0,1,1,0,0,1,1,0,0,2,2 }

};
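For the 3-subset case, the permutation table can be applied as sketched below. The lookup direction (BC7 subset = table[perm][UASTC subset]) is our reading of the struct comment; it is consistent with the table entries, since it always maps texel 0 to BC7 subset 0. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Copy of g_astc_bc7_subset_index_perm_tables from above.
static const uint8_t g_perm_tables[6][3] =
{
    { 0, 1, 2 }, { 1, 2, 0 }, { 2, 0, 1 }, { 2, 1, 0 }, { 0, 2, 1 }, { 1, 0, 2 }
};

// Maps each UASTC subset index through the selected permutation.
void uastc3_to_bc7_subsets(const uint8_t uastc_pattern[16], uint32_t perm,
                           uint8_t bc7_pattern[16])
{
    assert(perm < 6);
    for (int i = 0; i < 16; i++)
    {
        assert(uastc_pattern[i] <= 2);
        bc7_pattern[i] = g_perm_tables[perm][uastc_pattern[i]];
    }
}
```

For example, the fourth entry ({ 10, 156, 2 }) has the UASTC pattern 1,1,1,1,2,2,2,2,0,0,0,0,0,0,0,0; with permutation 2 this becomes 0,0,0,0,1,1,1,1,2,2,2,2,2,2,2,2.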

UASTC/BC7 2-subset partition pattern table (mapped to the BC7 3-subset patterns, used only in UASTC mode 7):


const uint32_t TOTAL_BC7_3_ASTC2_COMMON_PARTITIONS = 19;

struct
{
  uint8_t m_bc73;
  uint16_t m_astc2;
  // [0,5] - how to modify the BC7 3-subset pattern to match the ASTC pattern (LSB=invert). See convert_subset_index_3_to_2().
  uint8_t k;
} g_bc73_uastc2_common_partitions[TOTAL_BC7_3_ASTC2_COMMON_PARTITIONS] =
{
{ 10, 36, 4 }, { 11, 48, 4 }, { 0, 61, 3 }, { 2, 137, 4 },
{ 8, 161, 5 }, { 13, 183, 4 }, { 1, 226, 2 }, { 33, 281, 2 },
{ 40, 302, 3 }, { 20, 307, 4 }, { 21, 479, 0 }, { 58, 495, 3 },
{ 3, 593, 0 }, { 32, 594, 2 }, { 59, 605, 1 }, { 34, 799, 3 },
{ 20, 812, 1 }, { 14, 988, 4 }, { 31, 993, 3 }
};

uint32_t convert_subset_index_3_to_2(uint32_t p, uint32_t k)
{
    assert(k < 6);
    switch (k >> 1)
    {
    case 0:
        if (p <= 1)
            p = 0;
        else 
            p = 1;
        break;
    case 1:
        if (p == 0)
            p = 0;
        else 
            p = 1;
        break;
    case 2:
        if ((p == 0) || (p == 2))
            p = 0;
        else 
            p = 1;
        break;
    }
    if (k & 1)
        p = 1 - p;
    return p;
}


// UASTC pattern table for UASTC mode 7 (2 subset UASTC, 3-subset BC7)
const uint8_t g_bc7_3_uastc2_patterns2[TOTAL_BC7_3_ASTC2_COMMON_PARTITIONS][16] =
{
   { 0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0 }, { 0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0 }, 
   { 1,1,0,0,1,1,0,0,1,0,0,0,0,0,0,0 }, { 0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,1 },
   { 1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1 }, { 0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0 }, 
   { 0,0,0,1,0,0,1,1,1,1,1,1,1,1,1,1 }, { 0,1,1,1,0,0,1,1,0,0,1,1,0,0,1,1 },
   { 1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0 }, { 0,1,1,1,0,1,1,1,0,0,0,0,0,0,0,0 }, 
   { 0,0,0,0,0,0,0,0,1,1,1,0,1,1,1,0 }, { 1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0 },
   { 0,1,1,1,0,0,1,1,0,0,0,0,0,0,0,0 }, { 0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1 }, 
   { 1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0 }, { 1,1,0,0,1,1,0,0,1,1,0,0,1,0,0,0 },
   { 1,1,1,1,1,1,1,1,1,0,0,0,1,0,0,0 }, { 0,0,1,1,0,1,1,0,1,1,0,0,1,0,0,0 }, 
   { 1,1,1,1,0,1,1,1,0,0,0,0,0,0,0,0 }

};
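To see how k ties the tables together, the sketch below applies a compact restatement of convert_subset_index_3_to_2() to a BC7 3-subset pattern. The BC7 partition 10 pattern used here is quoted from memory of the BC7 tables, so treat it as an assumption; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Compact equivalent of convert_subset_index_3_to_2() from above.
uint32_t convert_3_to_2(uint32_t p, uint32_t k)
{
    assert(p < 3 && k < 6);
    uint32_t r;
    switch (k >> 1)
    {
    case 0:  r = (p <= 1) ? 0 : 1; break;            // BC7 subsets 0 and 1 share endpoints
    case 1:  r = (p == 0) ? 0 : 1; break;            // BC7 subsets 1 and 2 share endpoints
    default: r = (p == 0 || p == 2) ? 0 : 1; break;  // BC7 subsets 0 and 2 share endpoints
    }
    return (k & 1) ? (1 - r) : r;
}

// BC7 3-subset partition 10 (assumed, see the note above).
static const uint8_t g_bc7_3subset_pattern10[16] = { 0,0,0,0, 1,1,1,1, 2,2,2,2, 2,2,2,2 };

// Derives the 2-subset UASTC pattern for the first table entry { 10, 36, 4 }.
void derive_first_mode7_pattern(uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (uint8_t)convert_3_to_2(g_bc7_3subset_pattern10[i], 4);
}
```

The result, 0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0, is the first row of g_bc7_3_uastc2_patterns2.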


UASTC Weight Tables


These 6-bit weight tables are used for endpoint interpolation in UASTC. They are the same as ASTC's.

const uint32_t g_astc_bc7_weights1[2] = { 0, 64 };
const uint32_t g_astc_bc7_weights2[4] = { 0, 21, 43, 64 };
const uint32_t g_astc_bc7_weights3[8] = { 0, 9, 18, 27, 37, 46, 55, 64 };
const uint32_t g_astc_weights4[16] = { 0, 4, 8, 12, 17, 21, 25, 29, 35, 39, 43, 47, 52, 56, 60, 64 };
const uint32_t g_astc_weights5[32] = { 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64 };

Note BC7 and ASTC use the same 2 and 3 bit weight tables, while the 4-bit tables are slightly different:

const uint32_t g_bc7_weights4[16] = { 0, 4, 9, 13, 17, 21, 26, 30, 34, 38, 43, 47, 51, 55, 60, 64 };

A UASTC encoder can work around this difference by evaluating the combined ASTC and transcoded BC7 error and choosing the mode/partition pattern/compsel/etc. configuration that minimizes the overall error.
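One simple way to build a weight conversion table is a brute-force nearest match against the BC7 4-bit table, sketched below. This is only an illustration under that assumption, not necessarily how the actual transcoder's lookup tables were generated; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

static const uint32_t g_astc_w4[16] = { 0, 4, 8, 12, 17, 21, 25, 29, 35, 39, 43, 47, 52, 56, 60, 64 };
static const uint32_t g_astc_w5[32] = { 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30,
                                        34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64 };
static const uint32_t g_bc7_w4[16]  = { 0, 4, 9, 13, 17, 21, 26, 30, 34, 38, 43, 47, 51, 55, 60, 64 };

// Index of the BC7 4-bit weight closest to a 6-bit weight value w.
// Ties go to the lower index.
uint32_t nearest_bc7_weight4(uint32_t w)
{
    uint32_t best = 0, best_err = UINT32_MAX;
    for (uint32_t i = 0; i < 16; i++)
    {
        const uint32_t err = (uint32_t)std::abs((int)g_bc7_w4[i] - (int)w);
        if (err < best_err) { best_err = err; best = i; }
    }
    return best;
}

// ASTC 4-bit and 5-bit weight indices remapped to BC7 4-bit indices.
uint32_t remap_astc4_to_bc7(uint32_t index) { assert(index < 16); return nearest_bc7_weight4(g_astc_w4[index]); }
uint32_t remap_astc5_to_bc7(uint32_t index) { assert(index < 32); return nearest_bc7_weight4(g_astc_w5[index]); }
```

With this approach the 4-bit ASTC indices map to the same BC7 indices, since the two tables never differ by more than 1.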


UASTC Partition Pattern Anchor Index Tables


The texel indices in these tables indicate, for each common UASTC/BC7 pattern, which weight indices are stored with one less bit than normal. The texel indices are computed as x+y*4. 

The MSBs of weight indices stored with one less bit must be 0. If they aren't, the endpoints corresponding to that subset are swapped and that subset's weights are inverted before packing the UASTC block. BC7 uses a similar concept.

Anchor weight indices are applied to weight indices in both planes in dual plane UASTC modes.

The first texel is always an anchor.

const uint8_t g_uastc_pattern2_anchors[TOTAL_ASTC_BC7_COMMON_PARTITIONS2][2] = 
{
   { 0, 2 }, { 0, 3 }, { 1, 0 }, { 0, 3 }, { 7, 0 }, { 0, 2 }, { 3, 0 }, 
   { 7, 0 }, { 0, 11 }, { 2, 0 }, { 0, 7 }, { 11, 0 }, { 3, 0 }, { 8, 0 }, 
   { 0, 4 }, { 12, 0 }, { 1, 0 }, { 8, 0 }, { 0, 1 }, { 0, 2 }, { 0, 4 }, 
   { 8, 0 }, { 1, 0 }, { 0, 2 }, { 4, 0 }, { 0, 1 }, { 4, 0 }, { 1, 0 }, 
   { 4, 0 }, { 1, 0 }
};

const uint8_t g_uastc_pattern3_anchors[TOTAL_ASTC_BC7_COMMON_PARTITIONS3][3] =
{
   { 0, 8, 10 },  { 8, 0, 12 }, { 4, 0, 12 }, { 8, 0, 4 }, { 3, 0, 2 }, 
   { 0, 1, 3 }, { 0, 2, 1 }, { 1, 9, 0 }, { 1, 2, 0 }, { 4, 0, 8 }, { 0, 6, 2 }
};

const uint8_t g_bc7_3_uastc2_patterns2_anchors[TOTAL_BC7_3_ASTC2_COMMON_PARTITIONS][2] =
{
   { 0, 4 }, { 0, 2 }, { 2, 0 }, { 0, 7 }, { 8, 0 }, { 0, 1 }, { 0, 3 }, 
   { 0, 1 }, { 2, 0 }, { 0, 1 }, { 0, 8 }, { 2, 0 }, { 0, 1 }, { 0, 7 }, 
   { 12, 0 }, { 2, 0 }, { 9, 0 }, { 0, 2 }, { 4, 0 }
};
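
Here's a rough sketch of the anchor fix-up described above, for one subset and one weight plane. The function name fix_anchor() and its parameter layout are assumptions made for this example, not the encoder's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: makes sure the anchor texel's weight index has an MSB of 0.
// weights:     16 weight indices for one plane, texel index = x + y * 4
// subset_of:   subset index of each texel (from the partition pattern)
// subset:      the subset being fixed up
// anchor:      that subset's anchor texel index (from the anchor tables above)
// weight_bits: bits per weight index in this mode
// lo, hi:      the subset's endpoint colors
// Returns true if the endpoints were swapped and the weights inverted.
static bool fix_anchor(uint8_t weights[16], const uint8_t subset_of[16], uint32_t subset,
   uint32_t anchor, uint32_t weight_bits, uint8_t lo[4], uint8_t hi[4])
{
   assert(anchor < 16 && weight_bits >= 1 && weight_bits <= 5);

   const uint32_t max_index = (1u << weight_bits) - 1;
   const uint32_t msb = 1u << (weight_bits - 1);

   if ((weights[anchor] & msb) == 0)
      return false; // the anchor already fits in one less bit

   // Swap the endpoints...
   for (uint32_t c = 0; c < 4; c++)
      std::swap(lo[c], hi[c]);

   // ...and invert every weight in this subset, so each texel still selects
   // the same point on the line between the two colors.
   for (uint32_t i = 0; i < 16; i++)
      if (subset_of[i] == subset)
         weights[i] = (uint8_t)(max_index - weights[i]);

   return true;
}
```

In a dual plane mode the same inversion would also have to be applied to the second plane's weights, which this sketch doesn't show.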

UASTC Mode Description Tables


const uint32_t TOTAL_ASTC_MODES = 19;

const uint8_t g_uastc_mode_weight_bits[TOTAL_ASTC_MODES]          = { 4, 2, 3, 2, 2, 3, 2, 2,         0,  2, 4, 2, 3, 1, 2,         4, 2, 2,     5 };
const uint8_t g_uastc_mode_weight_ranges[TOTAL_ASTC_MODES]        = { 8, 2, 5, 2, 2, 5, 2, 2,         0,  2, 8, 2, 5, 0, 2,         8, 2, 2,     11 };
const uint8_t g_uastc_mode_endpoint_ranges[TOTAL_ASTC_MODES]      = { 19, 20, 8, 7, 12, 20, 18, 12,   0,  8, 13, 13, 19, 20, 20,    20, 20, 20,  11 };
const uint8_t g_uastc_mode_subsets[TOTAL_ASTC_MODES]              = { 1, 1, 2, 3, 2, 1, 1, 2,         0,  2, 1, 1, 1, 1, 1,         1, 2, 1,     1 };
const uint8_t g_uastc_mode_planes[TOTAL_ASTC_MODES]               = { 1, 1, 1, 1, 1, 1, 2, 1,         0,  1, 1, 2, 1, 2, 1,         1, 1, 2,     1 };
const uint8_t g_uastc_mode_comps[TOTAL_ASTC_MODES]                = { 3, 3, 3, 3, 3, 3, 3, 3,         4,  4, 4, 4, 4, 4, 4,         2, 2, 2,     3 };
const uint8_t g_uastc_mode_has_etc1_bias[TOTAL_ASTC_MODES]        = { 1, 1, 1, 1, 1, 1, 1, 1,         0,  1, 0, 0, 0, 1, 1,         1, 1, 1,     1 };
const uint8_t g_uastc_mode_has_bc1_hint0[TOTAL_ASTC_MODES]        = { 1, 1, 1, 1, 1, 1, 1, 1,         0,  1, 1, 1, 1, 1, 1,         1, 1, 1,     1 };
const uint8_t g_uastc_mode_has_bc1_hint1[TOTAL_ASTC_MODES]        = { 1, 1, 1, 1, 1, 1, 1, 1,         0,  1, 0, 0, 0, 1, 1,         1, 1, 1,     1 };
const uint8_t g_uastc_mode_cem[TOTAL_ASTC_MODES]                  = { 8, 8, 8, 8, 8, 8, 8, 8,         0,  12, 12, 12, 12, 12, 12,   4, 4, 4,     8 };
const uint8_t g_uastc_mode_has_alpha[TOTAL_ASTC_MODES]            = { 0, 0, 0, 0, 0, 0, 0, 0,         1,  1, 1, 1, 1, 1, 1,         1, 1, 1,     0 };
const uint8_t g_uastc_mode_is_la[TOTAL_ASTC_MODES]                = { 0, 0, 0, 0, 0, 0, 0, 0,         0,  0, 0, 0, 0, 0, 0,         1, 1, 1,     0 };
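
As a small example of how these tables can be used, the sketch below derives the number of BISE coded endpoint values in each mode. The array and function names are invented for the example; the values are copied from g_uastc_mode_subsets and g_uastc_mode_comps above.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from g_uastc_mode_subsets and g_uastc_mode_comps above (19 entries each).
static const uint8_t k_mode_subsets[19] = { 1, 1, 2, 3, 2, 1, 1, 2, 0, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1 };
static const uint8_t k_mode_comps[19] = { 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 2, 2, 2, 3 };

// Hypothetical helper: number of endpoint values a mode stores.
// Each subset stores a low and a high value per component.
// Mode 8 has a subset count of 0 in the table, so this returns 0 for it.
static uint32_t total_endpoint_values(uint32_t mode)
{
   assert(mode < 19);
   return (uint32_t)k_mode_subsets[mode] * k_mode_comps[mode] * 2;
}
```

For mode 0 this gives 6 values, which matches the BISE encoding example later in this post.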

UASTC to BC7 Mode Conversion Table


UASTC Mode   BC7 Mode    Implementation Notes
0            6
1            3           set both endpoints to same colors/pbits
2            1
3            2
4            3
5            6           convert weights from 3 to 4 bits
6            5           swap CCS component with alpha
7            2           two endpoints will be set to same colors
8            5 or 6      choose mode with lowest error
9            7
10           6
11           5           swap CCS component with alpha
12           6           convert weights from 3 to 4 bits
13           5           convert weights from 1 to 2 bits
14           6           convert weights from 2 to 4 bits
15           6           convert LA to RGBA (LLLA swizzle)
16           7           convert LA to RGBA (LLLA swizzle)
17           5           convert LA to RGBA (LLLA swizzle)
18           6           convert weights from 5 to 4 bits

UASTC to BC7 Weight Index Conversion


If the UASTC mode's number of weight index bits matches the BC7 mode's index bits, then nothing needs to be done to convert UASTC weight indices to BC7 indices. In the cases that differ, use these tables to translate UASTC weight indices to BC7 indices:

UASTC 1-bit to BC7 2-bit conversion table: { 0, 3 }
UASTC 2-bit to BC7 4-bit conversion table: { 0, 5, 10, 15 }
UASTC 3-bit to BC7 4-bit conversion table: { 0, 2, 4, 6, 9, 11, 13, 15 }
UASTC 5-bit to BC7 4-bit conversion table: { 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 6, 7, 8, 9, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15 }

BC7 requires its "anchor" indices to have MSBs of 0. Because the UASTC anchor texels aren't always the same as BC7's, the UASTC->BC7 transcoder needs to check each BC7 anchor's MSB and, if it's set, swap that subset's endpoints and invert its weight indices.
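
The index translation itself is just a table lookup. A minimal sketch, using the four conversion tables above (the function name convert_weight_index() and the k_conv_* array names are made up for this example):

```cpp
#include <cassert>
#include <cstdint>

// Conversion tables copied from the text above.
static const uint8_t k_conv_1_to_2[2] = { 0, 3 };
static const uint8_t k_conv_2_to_4[4] = { 0, 5, 10, 15 };
static const uint8_t k_conv_3_to_4[8] = { 0, 2, 4, 6, 9, 11, 13, 15 };
static const uint8_t k_conv_5_to_4[32] = { 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 6, 7,
   8, 9, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15 };

// Hypothetical helper: translates one UASTC weight index into a BC7 index.
// uastc_bits/bc7_bits are the weight index sizes of the source and target modes.
static uint8_t convert_weight_index(uint8_t w, uint32_t uastc_bits, uint32_t bc7_bits)
{
   assert(uastc_bits >= 1 && uastc_bits <= 5);
   assert(w < (1u << uastc_bits));

   if (uastc_bits == bc7_bits)
      return w; // same index size: nothing to do

   if (uastc_bits == 1 && bc7_bits == 2) return k_conv_1_to_2[w];
   if (uastc_bits == 2 && bc7_bits == 4) return k_conv_2_to_4[w];
   if (uastc_bits == 3 && bc7_bits == 4) return k_conv_3_to_4[w];
   if (uastc_bits == 5 && bc7_bits == 4) return k_conv_5_to_4[w];

   assert(false && "combination not listed in the conversion tables");
   return 0;
}
```

The anchor MSB check still has to run on the converted indices afterwards.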

UASTC to BC7 Endpoint Conversion


First dequantize the UASTC endpoints to [0,255] using the method outlined here:

The ASTC/UASTC endpoint ranges that use trits or quints must be decoded/dequantized using the parameters in Table 158. (Beware, in case you're not familiar with ASTC: The encoded values of ranges using trits/quints do not dequantize to [0,255] values in a monotonic order. This is why you need to apply the parameters in Table 158 to unquantize the encoded value. This may seem unintuitive, but this is supposed to simplify ASTC decoding hardware.)

Next, for modes with p-bits, divide the dequantized endpoints by 255.0 and compute the optimal quantized BC7 endpoint colors/p-bits using the helper functions included below. 

If the mode doesn't have p-bits, then scale the dequantized endpoint value to the desired number of BC7 endpoint bits, with rounding. To convert 8-bit endpoint components to 5-bits compute (value*31+127)/255, and to convert 8-bits to 7 compute (value*127+127)/255. 
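
The two formulas above are the same expression with a different maximum value. As a sketch (scale_8_to_n() is a name invented for this example):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: scales an 8-bit endpoint component to 'bits' bits with
// rounding. For bits = 5 this is (value*31+127)/255, for bits = 7 it's
// (value*127+127)/255, as described in the text above.
static uint32_t scale_8_to_n(uint32_t value, uint32_t bits)
{
   assert(value <= 255 && bits >= 1 && bits <= 8);
   const uint32_t max_val = (1u << bits) - 1;
   return (value * max_val + 127) / 255;
}
```

For example, 128 scales to 16 at 5 bits and to 64 at 7 bits.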

The UASTC encoder assumes this is how the BC7 endpoints will be computed during transcoding. If you do something else the encoder's error computation won't be accurate.

Computing Optimal BC7 P-Bits


To compute the optimal shared or unique BC7 p-bits, we use the following two helper functions. (This is tricky enough, and almost always messed up in the BC7 encoders that we've seen, that we're just including the exact code we use.) Our UASTC encoder assumes the p-bits are computed in this way.

Inputs:
total_comps: 3 or 4
comp_bits: bits in BC7 output color (7 for mode 6, etc.)
xl/xh: the normalized [0,1] input vectors (dequantize ASTC's endpoints and divide by 255.0)

Outputs: 
bestMinColor, bestMaxColor: quantized colors to pack into BC7 output
best_pbits[2]: p-bits to pack into BC7 output

// Determines the best shared pbits to use to encode xl/xh
static void determine_shared_pbits(
   uint32_t total_comps, uint32_t comp_bits, float xl[4], float xh[4],
   color_quad_u8& bestMinColor, color_quad_u8& bestMaxColor, uint32_t best_pbits[2])
{
   const uint32_t total_bits = comp_bits + 1;
   assert(total_bits >= 4 && total_bits <= 8);

   const int iscalep = (1 << total_bits) - 1;
   const float scalep = (float)iscalep;

   float best_err = 1e+9f;

   for (int p = 0; p < 2; p++)
   {
      color_quad_u8 xMinColor, xMaxColor;
      for (uint32_t c = 0; c < 4; c++)
      {
         xMinColor.m_c[c] = (uint8_t)(clampi(((int)((xl[c] * scalep - p) / 2.0f + .5f)) * 2 + p, p, iscalep - 1 + p));
         xMaxColor.m_c[c] = (uint8_t)(clampi(((int)((xh[c] * scalep - p) / 2.0f + .5f)) * 2 + p, p, iscalep - 1 + p));
      }

      color_quad_u8 scaledLow, scaledHigh;

      for (uint32_t i = 0; i < 4; i++)
      {
         scaledLow.m_c[i] = (xMinColor.m_c[i] << (8 - total_bits));
         scaledLow.m_c[i] |= (scaledLow.m_c[i] >> total_bits);
         assert(scaledLow.m_c[i] <= 255);

         scaledHigh.m_c[i] = (xMaxColor.m_c[i] << (8 - total_bits));
         scaledHigh.m_c[i] |= (scaledHigh.m_c[i] >> total_bits);
         assert(scaledHigh.m_c[i] <= 255);
      }

      float err = 0;
      for (uint32_t i = 0; i < total_comps; i++)
         err += squaref((scaledLow.m_c[i] / 255.0f) - xl[i]) + squaref((scaledHigh.m_c[i] / 255.0f) - xh[i]);

      if (err < best_err)
      {
         best_err = err;
         best_pbits[0] = p;
         best_pbits[1] = p;
         for (uint32_t j = 0; j < 4; j++)
         {
            bestMinColor.m_c[j] = xMinColor.m_c[j] >> 1;
            bestMaxColor.m_c[j] = xMaxColor.m_c[j] >> 1;
         }
      }
   }
}

// Determines the best unique pbits to use to encode xl/xh
static void determine_unique_pbits(
   uint32_t total_comps, uint32_t comp_bits, float xl[4], float xh[4], 
   color_quad_u8 &bestMinColor, color_quad_u8 &bestMaxColor, uint32_t best_pbits[2])
{
   const uint32_t total_bits = comp_bits + 1;
   const int iscalep = (1 << total_bits) - 1;
   const float scalep = (float)iscalep;

   float best_err0 = 1e+9f;
   float best_err1 = 1e+9f;

   for (int p = 0; p < 2; p++)
   {
      color_quad_u8 xMinColor, xMaxColor;

      for (uint32_t c = 0; c < 4; c++)
      {
         xMinColor.m_c[c] = (uint8_t)(clampi(((int)((xl[c] * scalep - p) / 2.0f + .5f)) * 2 + p, p, iscalep - 1 + p));
         xMaxColor.m_c[c] = (uint8_t)(clampi(((int)((xh[c] * scalep - p) / 2.0f + .5f)) * 2 + p, p, iscalep - 1 + p));
      }

      color_quad_u8 scaledLow, scaledHigh;
      for (uint32_t i = 0; i < 4; i++)
      {
         scaledLow.m_c[i] = (xMinColor.m_c[i] << (8 - total_bits));
         scaledLow.m_c[i] |= (scaledLow.m_c[i] >> total_bits);
         assert(scaledLow.m_c[i] <= 255);

         scaledHigh.m_c[i] = (xMaxColor.m_c[i] << (8 - total_bits));
         scaledHigh.m_c[i] |= (scaledHigh.m_c[i] >> total_bits);
         assert(scaledHigh.m_c[i] <= 255);
      }

      float err0 = 0, err1 = 0;
      for (uint32_t i = 0; i < total_comps; i++)
      {
         err0 += squaref(scaledLow.m_c[i] - xl[i] * 255.0f);
         err1 += squaref(scaledHigh.m_c[i] - xh[i] * 255.0f);
      }

      if (err0 < best_err0)
      {
         best_err0 = err0;
         best_pbits[0] = p;

         bestMinColor.m_c[0] = xMinColor.m_c[0] >> 1;
         bestMinColor.m_c[1] = xMinColor.m_c[1] >> 1;
         bestMinColor.m_c[2] = xMinColor.m_c[2] >> 1;
         bestMinColor.m_c[3] = xMinColor.m_c[3] >> 1;
      }

      if (err1 < best_err1)
      {
         best_err1 = err1;
         best_pbits[1] = p;

         bestMaxColor.m_c[0] = xMaxColor.m_c[0] >> 1;
         bestMaxColor.m_c[1] = xMaxColor.m_c[1] >> 1;
         bestMaxColor.m_c[2] = xMaxColor.m_c[2] >> 1;
         bestMaxColor.m_c[3] = xMaxColor.m_c[3] >> 1;
      }
   }
}
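
The two functions above rely on color_quad_u8, clampi() and squaref(), which aren't defined in this post. The definitions below are a guess at their minimal form, based only on how they're used above. quantize_with_pbit() is a made-up helper that isolates the per-component quantization expression so it's easier to experiment with.

```cpp
#include <cassert>
#include <cstdint>

// Assumed shape of the 4-component, 8-bit color type used above.
struct color_quad_u8
{
   uint8_t m_c[4];
};

// Assumed helper: clamps value to [low, high].
static inline int clampi(int value, int low, int high)
{
   if (value < low) return low;
   if (value > high) return high;
   return value;
}

// Assumed helper: returns x squared.
static inline float squaref(float x)
{
   return x * x;
}

// Hypothetical helper: quantizes one normalized [0,1] component to
// total_bits (= comp_bits + 1) bits with the LSB forced to p, using the
// same expression as the two functions above.
static inline int quantize_with_pbit(float x, uint32_t total_bits, int p)
{
   assert(total_bits >= 4 && total_bits <= 8);
   assert(p == 0 || p == 1);
   const int iscalep = (1 << total_bits) - 1;
   const float scalep = (float)iscalep;
   return clampi(((int)((x * scalep - p) / 2.0f + .5f)) * 2 + p, p, iscalep - 1 + p);
}
```

With 8 total bits, an input of 1.0 quantizes to 255 when p is 1 but only to 254 when p is 0, which is why the choice of p-bit affects the error at the ends of the range.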

ETC1 Transcoding


For solid color UASTC blocks: ETC1 transcode hints directly follow the 32-bit RGBA block color. These hints indicate which color mode (differential or individual), intensity table index, selector, and block color to use to encode the block with the lowest error.

For non-solid color blocks: The UASTC block must first be unpacked to pixels. After this, the only expensive work required to transcode to ETC1 is computing the subblock average colors and the texel indices. 

There are ETC1 hint fields which indicate how the ETC1 subblocks are flipped, which block color mode to use, each subblock's intensity table index, and how to bias the computed quantized subblock block colors. 

To compute each subblock's quantized block color: First compute each subblock's average texel color. Then quantize the average subblock colors to 4 or 5 bits/component (depending on the ETC1 block color mode), with rounding. Next, apply the ETC1 bias indicated by the 5-bit UASTC ETC1BIAS field (see the apply_etc1_bias() function below) to each quantized subblock color. Finally, encode the quantized subblock colors in the ETC1 block. In differential mode the second subblock's differential color may need to be clamped to fit into 3 bits/component.

The last step is computing the texel indices. This step can be accelerated by computing the errors in a luma space with RGB component weights of (1,1,1).


color_rgba apply_etc1_bias(color_rgba block_color, uint32_t bias, uint32_t limit, uint32_t subblock)
{
   for (uint32_t c = 0; c < 3; c++)
   {
      static const int s_divs[3] = { 1, 3, 9 };

      int delta = 0;

      switch (bias)
      {
      case 2: delta = subblock ? 0 : ((c == 0) ? -1 : 0); break;
      case 5: delta = subblock ? 0 : ((c == 1) ? -1 : 0); break;
      case 6: delta = subblock ? 0 : ((c == 2) ? -1 : 0); break;

      case 7: delta = subblock ? 0 : ((c == 0) ? 1 : 0); break;
      case 11: delta = subblock ? 0 : ((c == 1) ? 1 : 0); break;
      case 15: delta = subblock ? 0 : ((c == 2) ? 1 : 0); break;

      case 18: delta = subblock ? ((c == 0) ? -1 : 0) : 0; break;
      case 19: delta = subblock ? ((c == 1) ? -1 : 0) : 0; break;
      case 20: delta = subblock ? ((c == 2) ? -1 : 0) : 0; break;

      case 21: delta = subblock ? ((c == 0) ? 1 : 0) : 0; break;
      case 24: delta = subblock ? ((c == 1) ? 1 : 0) : 0; break;
      case 8: delta = subblock ? ((c == 2) ? 1 : 0) : 0; break;

      case 10: delta = -2; break;

      case 27: delta = subblock ? 0 : -1; break;
      case 28: delta = subblock ? -1 : 1; break;
      case 29: delta = subblock ? 1 : 0; break;
      case 30: delta = subblock ? -1 : 0; break;
      case 31: delta = subblock ? 0 : 1; break;

      default:
         delta = ((bias / s_divs[c]) % 3) - 1;
         break;
      }
      
      int v = block_color[c];
      if (v == 0)
      {
         if (delta == -2)
            v += 3;
         else
            v += delta + 1;
      }
      else if (v == (int)limit)
      {
         v += (delta - 1);
      }
      else
      {
         v += delta;
         if ((v < 0) || (v > (int)limit))
            v = (v - delta) - delta;
      }

      assert(v >= 0);
      assert(v <= (int)limit);

      block_color[c] = (uint8_t)v;
   }

   return block_color;
}
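
The value passed to apply_etc1_bias() as block_color is the quantized subblock average. Here's one way that step could look. This is only a sketch: the rgb8 type and the function name are made up, and the exact rounding used by the real transcoder isn't spelled out in this post, so the single rounding step here is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit RGB texel type for this example.
struct rgb8
{
   uint8_t c[3];
};

// Hypothetical helper: averages one ETC1 subblock and quantizes the result.
// pixels:    16 unpacked texels, texel index = x + y * 4
// flip:      false = left/right 2x4 subblocks, true = top/bottom 4x2 subblocks
// subblock:  0 or 1
// diff_mode: true = 5 bits/component (differential), false = 4 bits (individual)
static rgb8 quantized_subblock_color(const rgb8 pixels[16], bool flip, uint32_t subblock, bool diff_mode)
{
   assert(subblock < 2);

   uint32_t sum[3] = { 0, 0, 0 };
   for (uint32_t y = 0; y < 4; y++)
   {
      for (uint32_t x = 0; x < 4; x++)
      {
         const uint32_t s = flip ? (y >> 1) : (x >> 1);
         if (s != subblock)
            continue;
         for (uint32_t c = 0; c < 3; c++)
            sum[c] += pixels[x + y * 4].c[c];
      }
   }

   // Each subblock has 8 texels, so sum is in [0, 8*255]. Average and
   // quantize in a single rounded division (assumed rounding).
   const uint32_t limit = diff_mode ? 31 : 15;
   rgb8 result;
   for (uint32_t c = 0; c < 3; c++)
      result.c[c] = (uint8_t)((sum[c] * limit + (8 * 255) / 2) / (8 * 255));

   return result;
}
```

The limit value used here (31 or 15) is the same limit that apply_etc1_bias() expects.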

BISE Endpoint Encoding Quantization Tables


These tables are used by UASTC encoders to convert endpoint components into integers which are then BISE coded. For example, if the encoder wants to pack component value 204 using endpoint range 4, it would encode the integer 3. These tables are compatible with ASTC's endpoint quantization, and were computed by inverting the procedure outlined in section 23.13.

The format is "{x,y}", where x is the integer to be BISE coded, and y is the unquantized [0,255] value. Only ranges with trits/quints are included here, because binary ranges use simple bit replication from the most significant bit of the value.

Note that some of these ranges aren't currently used in UASTC.


// Range: 4, Levels: 6, Bits: 1, Trits: 1, Quints: 0
{{0,0},{2,51},{4,102},{5,153},{3,204},{1,255}}
// Range: 6, Levels: 10, Bits: 1, Trits: 0, Quints: 1
{{0,0},{2,28},{4,56},{6,84},{8,113},{9,142},{7,171},{5,199},{3,227},{1,255}}
// Range: 7, Levels: 12, Bits: 2, Trits: 1, Quints: 0
{{0,0},{4,23},{8,46},{2,69},{6,92},{10,116},{11,139},{7,163},{3,186},{9,209},{5,232},{1,255}}
// Range: 9, Levels: 20, Bits: 2, Trits: 0, Quints: 1
{{0,0},{4,13},{8,27},{12,40},{16,54},{2,67},{6,80},{10,94},{14,107},{18,121},{19,134},{15,148},{11,161},{7,175},{3,188},{17,201},{13,215},{9,228},{5,242},{1,255}}
// Range: 10, Levels: 24, Bits: 3, Trits: 1, Quints: 0
{{0,0},{8,11},{16,22},{2,33},{10,44},{18,55},{4,66},{12,77},{20,88},{6,99},{14,110},{22,121},{23,134},{15,145},{7,156},{21,167},{13,178},{5,189},{19,200},{11,211},{3,222},{17,233},{9,244},{1,255}}
// Range: 12, Levels: 40, Bits: 3, Trits: 0, Quints: 1
{{0,0},{8,6},{16,13},{24,19},{32,26},{2,32},{10,39},{18,45},{26,52},{34,58},{4,65},{12,71},{20,78},{28,84},{36,91},{6,97},{14,104},{22,110},{30,117},{38,123},{39,132},{31,138},{23,145},{15,151},{7,158},{37,164},{29,171},{21,177},{13,184},{5,190},{35,197},{27,203},{19,210},{11,216},{3,223},{33,229},{25,236},{17,242},{9,249},{1,255}}
// Range: 13, Levels: 48, Bits: 4, Trits: 1, Quints: 0
{{0,0},{16,5},{32,11},{2,16},{18,21},{34,27},{4,32},{20,38},{36,43},{6,48},{22,54},{38,59},{8,65},{24,70},{40,76},{10,81},{26,86},{42,92},{12,97},{28,103},{44,108},{14,113},{30,119},{46,124},{47,131},{31,136},{15,142},{45,147},{29,152},{13,158},{43,163},{27,169},{11,174},{41,179},{25,185},{9,190},{39,196},{23,201},{7,207},{37,212},{21,217},{5,223},{35,228},{19,234},{3,239},{33,244},{17,250},{1,255}}
// Range: 15, Levels: 80, Bits: 4, Trits: 0, Quints: 1
{{0,0},{16,3},{32,6},{48,9},{64,13},{2,16},{18,19},{34,22},{50,25},{66,29},{4,32},{20,35},{36,38},{52,42},{68,45},{6,48},{22,51},{38,54},{54,58},{70,61},{8,64},{24,67},{40,71},{56,74},{72,77},{10,80},{26,83},{42,87},{58,90},{74,93},{12,96},{28,100},{44,103},{60,106},{76,109},{14,112},{30,116},{46,119},{62,122},{78,125},{79,130},{63,133},{47,136},{31,139},{15,143},{77,146},{61,149},{45,152},{29,155},{13,159},{75,162},{59,165},{43,168},{27,172},{11,175},{73,178},{57,181},{41,184},{25,188},{9,191},{71,194},{55,197},{39,201},{23,204},{7,207},{69,210},{53,213},{37,217},{21,220},{5,223},{67,226},{51,230},{35,233},{19,236},{3,239},{65,242},{49,246},{33,249},{17,252},{1,255}}
// Range: 16, Levels: 96, Bits: 5, Trits: 1, Quints: 0
{{0,0},{32,2},{64,5},{2,8},{34,10},{66,13},{4,16},{36,18},{68,21},{6,24},{38,26},{70,29},{8,32},{40,35},{72,37},{10,40},{42,43},{74,45},{12,48},{44,51},{76,53},{14,56},{46,59},{78,61},{16,64},{48,67},{80,70},{18,72},{50,75},{82,78},{20,80},{52,83},{84,86},{22,88},{54,91},{86,94},{24,96},{56,99},{88,102},{26,104},{58,107},{90,110},{28,112},{60,115},{92,118},{30,120},{62,123},{94,126},{95,129},{63,132},{31,135},{93,137},{61,140},{29,143},{91,145},{59,148},{27,151},{89,153},{57,156},{25,159},{87,161},{55,164},{23,167},{85,169},{53,172},{21,175},{83,177},{51,180},{19,183},{81,185},{49,188},{17,191},{79,194},{47,196},{15,199},{77,202},{45,204},{13,207},{75,210},{43,212},{11,215},{73,218},{41,220},{9,223},{71,226},{39,229},{7,231},{69,234},{37,237},{5,239},{67,242},{35,245},{3,247},{65,250},{33,253},{1,255}}
// Range: 18, Levels: 160, Bits: 5, Trits: 0, Quints: 1
{{0,0},{32,1},{64,3},{96,4},{128,6},{2,8},{34,9},{66,11},{98,12},{130,14},{4,16},{36,17},{68,19},{100,20},{132,22},{6,24},{38,25},{70,27},{102,28},{134,30},{8,32},{40,33},{72,35},{104,36},{136,38},{10,40},{42,41},{74,43},{106,44},{138,46},{12,48},{44,49},{76,51},{108,52},{140,54},{14,56},{46,57},{78,59},{110,60},{142,62},{16,64},{48,65},{80,67},{112,68},{144,70},{18,72},{50,73},{82,75},{114,76},{146,78},{20,80},{52,81},{84,83},{116,84},{148,86},{22,88},{54,89},{86,91},{118,92},{150,94},{24,96},{56,97},{88,99},{120,100},{152,102},{26,104},{58,105},{90,107},{122,108},{154,110},{28,112},{60,113},{92,115},{124,116},{156,118},{30,120},{62,121},{94,123},{126,124},{158,126},{159,129},{127,131},{95,132},{63,134},{31,135},{157,137},{125,139},{93,140},{61,142},{29,143},{155,145},{123,147},{91,148},{59,150},{27,151},{153,153},{121,155},{89,156},{57,158},{25,159},{151,161},{119,163},{87,164},{55,166},{23,167},{149,169},{117,171},{85,172},{53,174},{21,175},{147,177},{115,179},{83,180},{51,182},{19,183},{145,185},{113,187},{81,188},{49,190},{17,191},{143,193},{111,195},{79,196},{47,198},{15,199},{141,201},{109,203},{77,204},{45,206},{13,207},{139,209},{107,211},{75,212},{43,214},{11,215},{137,217},{105,219},{73,220},{41,222},{9,223},{135,225},{103,227},{71,228},{39,230},{7,231},{133,233},{101,235},{69,236},{37,238},{5,239},{131,241},{99,243},{67,244},{35,246},{3,247},{129,249},{97,251},{65,252},{33,254},{1,255}}
// Range: 19, Levels: 192, Bits: 6, Trits: 1, Quints: 0
{{0,0},{64,1},{128,2},{2,4},{66,5},{130,6},{4,8},{68,9},{132,10},{6,12},{70,13},{134,14},{8,16},{72,17},{136,18},{10,20},{74,21},{138,22},{12,24},{76,25},{140,26},{14,28},{78,29},{142,30},{16,32},{80,33},{144,34},{18,36},{82,37},{146,38},{20,40},{84,41},{148,42},{22,44},{86,45},{150,46},{24,48},{88,49},{152,50},{26,52},{90,53},{154,54},{28,56},{92,57},{156,58},{30,60},{94,61},{158,62},{32,64},{96,65},{160,66},{34,68},{98,69},{162,70},{36,72},{100,73},{164,74},{38,76},{102,77},{166,78},{40,80},{104,81},{168,82},{42,84},{106,85},{170,86},{44,88},{108,89},{172,90},{46,92},{110,93},{174,94},{48,96},{112,97},{176,98},{50,100},{114,101},{178,102},{52,104},{116,105},{180,106},{54,108},{118,109},{182,110},{56,112},{120,113},{184,114},{58,116},{122,117},{186,118},{60,120},{124,121},{188,122},{62,124},{126,125},{190,126},{191,129},{127,130},{63,131},{189,133},{125,134},{61,135},{187,137},{123,138},{59,139},{185,141},{121,142},{57,143},{183,145},{119,146},{55,147},{181,149},{117,150},{53,151},{179,153},{115,154},{51,155},{177,157},{113,158},{49,159},{175,161},{111,162},{47,163},{173,165},{109,166},{45,167},{171,169},{107,170},{43,171},{169,173},{105,174},{41,175},{167,177},{103,178},{39,179},{165,181},{101,182},{37,183},{163,185},{99,186},{35,187},{161,189},{97,190},{33,191},{159,193},{95,194},{31,195},{157,197},{93,198},{29,199},{155,201},{91,202},{27,203},{153,205},{89,206},{25,207},{151,209},{87,210},{23,211},{149,213},{85,214},{21,215},{147,217},{83,218},{19,219},{145,221},{81,222},{17,223},{143,225},{79,226},{15,227},{141,229},{77,230},{13,231},{139,233},{75,234},{11,235},{137,237},{73,238},{9,239},{135,241},{71,242},{7,243},{133,245},{69,246},{5,247},{131,249},{67,250},{3,251},{129,253},{65,254},{1,255}}
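
To show how an encoder might use one of these tables, here's a sketch for range 4 that picks the table entry closest to an 8-bit component value and returns the integer to BISE code. The struct and function names are invented, and picking the nearest level per component is a simplification; a real encoder chooses endpoints to minimize the error over the whole block.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical entry type: {integer to BISE code, unquantized [0,255] value}.
struct bise_quant_entry
{
   uint8_t bise;
   uint8_t value;
};

// Range 4 encoding table copied from above.
static const bise_quant_entry k_range4_encode[6] = {
   { 0, 0 }, { 2, 51 }, { 4, 102 }, { 5, 153 }, { 3, 204 }, { 1, 255 }
};

// Hypothetical helper: returns the BISE integer whose unquantized value is
// closest to 'value' (ties go to the lower level).
static uint8_t encode_range4(uint8_t value)
{
   uint32_t best = 0;
   int best_err = 256;
   for (uint32_t i = 0; i < 6; i++)
   {
      const int err = std::abs((int)value - (int)k_range4_encode[i].value);
      if (err < best_err)
      {
         best_err = err;
         best = i;
      }
   }
   return k_range4_encode[best].bise;
}
```

As in the example in the text, a component value of 204 encodes to the integer 3.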


BISE Endpoint Decoding Dequantization Tables


These tables are the inverse of the previous encoding tables. They are used by decoders to convert BISE encoded integers back into endpoint values, and for dequantization to 8-bits. Alternately, you can use the decoding table and procedure outlined in section 23.13 in the ASTC specification.

Example: if the decoder receives the BISE encoded integer 2 in a UASTC mode which uses endpoint range 4, the resulting component value (which ranges from [0,5]) would be 1, which dequantizes to an 8-bit value of 51.

// Range: 4, Levels: 6, Bits: 1, Trits: 1, Quints: 0
{{0,0},{5,255},{1,51},{4,204},{2,102},{3,153}}
// Range: 6, Levels: 10, Bits: 1, Trits: 0, Quints: 1
{{0,0},{9,255},{1,28},{8,227},{2,56},{7,199},{3,84},{6,171},{4,113},{5,142}}
// Range: 7, Levels: 12, Bits: 2, Trits: 1, Quints: 0
{{0,0},{11,255},{3,69},{8,186},{1,23},{10,232},{4,92},{7,163},{2,46},{9,209},{5,116},{6,139}}
// Range: 9, Levels: 20, Bits: 2, Trits: 0, Quints: 1
{{0,0},{19,255},{5,67},{14,188},{1,13},{18,242},{6,80},{13,175},{2,27},{17,228},{7,94},{12,161},{3,40},{16,215},{8,107},{11,148},{4,54},{15,201},{9,121},{10,134}}
// Range: 10, Levels: 24, Bits: 3, Trits: 1, Quints: 0
{{0,0},{23,255},{3,33},{20,222},{6,66},{17,189},{9,99},{14,156},{1,11},{22,244},{4,44},{19,211},{7,77},{16,178},{10,110},{13,145},{2,22},{21,233},{5,55},{18,200},{8,88},{15,167},{11,121},{12,134}}
// Range: 12, Levels: 40, Bits: 3, Trits: 0, Quints: 1
{{0,0},{39,255},{5,32},{34,223},{10,65},{29,190},{15,97},{24,158},{1,6},{38,249},{6,39},{33,216},{11,71},{28,184},{16,104},{23,151},{2,13},{37,242},{7,45},{32,210},{12,78},{27,177},{17,110},{22,145},{3,19},{36,236},{8,52},{31,203},{13,84},{26,171},{18,117},{21,138},{4,26},{35,229},{9,58},{30,197},{14,91},{25,164},{19,123},{20,132}}
// Range: 13, Levels: 48, Bits: 4, Trits: 1, Quints: 0
{{0,0},{47,255},{3,16},{44,239},{6,32},{41,223},{9,48},{38,207},{12,65},{35,190},{15,81},{32,174},{18,97},{29,158},{21,113},{26,142},{1,5},{46,250},{4,21},{43,234},{7,38},{40,217},{10,54},{37,201},{13,70},{34,185},{16,86},{31,169},{19,103},{28,152},{22,119},{25,136},{2,11},{45,244},{5,27},{42,228},{8,43},{39,212},{11,59},{36,196},{14,76},{33,179},{17,92},{30,163},{20,108},{27,147},{23,124},{24,131}}
// Range: 15, Levels: 80, Bits: 4, Trits: 0, Quints: 1
{{0,0},{79,255},{5,16},{74,239},{10,32},{69,223},{15,48},{64,207},{20,64},{59,191},{25,80},{54,175},{30,96},{49,159},{35,112},{44,143},{1,3},{78,252},{6,19},{73,236},{11,35},{68,220},{16,51},{63,204},{21,67},{58,188},{26,83},{53,172},{31,100},{48,155},{36,116},{43,139},{2,6},{77,249},{7,22},{72,233},{12,38},{67,217},{17,54},{62,201},{22,71},{57,184},{27,87},{52,168},{32,103},{47,152},{37,119},{42,136},{3,9},{76,246},{8,25},{71,230},{13,42},{66,213},{18,58},{61,197},{23,74},{56,181},{28,90},{51,165},{33,106},{46,149},{38,122},{41,133},{4,13},{75,242},{9,29},{70,226},{14,45},{65,210},{19,61},{60,194},{24,77},{55,178},{29,93},{50,162},{34,109},{45,146},{39,125},{40,130}}
// Range: 16, Levels: 96, Bits: 5, Trits: 1, Quints: 0
{{0,0},{95,255},{3,8},{92,247},{6,16},{89,239},{9,24},{86,231},{12,32},{83,223},{15,40},{80,215},{18,48},{77,207},{21,56},{74,199},{24,64},{71,191},{27,72},{68,183},{30,80},{65,175},{33,88},{62,167},{36,96},{59,159},{39,104},{56,151},{42,112},{53,143},{45,120},{50,135},{1,2},{94,253},{4,10},{91,245},{7,18},{88,237},{10,26},{85,229},{13,35},{82,220},{16,43},{79,212},{19,51},{76,204},{22,59},{73,196},{25,67},{70,188},{28,75},{67,180},{31,83},{64,172},{34,91},{61,164},{37,99},{58,156},{40,107},{55,148},{43,115},{52,140},{46,123},{49,132},{2,5},{93,250},{5,13},{90,242},{8,21},{87,234},{11,29},{84,226},{14,37},{81,218},{17,45},{78,210},{20,53},{75,202},{23,61},{72,194},{26,70},{69,185},{29,78},{66,177},{32,86},{63,169},{35,94},{60,161},{38,102},{57,153},{41,110},{54,145},{44,118},{51,137},{47,126},{48,129}}
// Range: 18, Levels: 160, Bits: 5, Trits: 0, Quints: 1
{{0,0},{159,255},{5,8},{154,247},{10,16},{149,239},{15,24},{144,231},{20,32},{139,223},{25,40},{134,215},{30,48},{129,207},{35,56},{124,199},{40,64},{119,191},{45,72},{114,183},{50,80},{109,175},{55,88},{104,167},{60,96},{99,159},{65,104},{94,151},{70,112},{89,143},{75,120},{84,135},{1,1},{158,254},{6,9},{153,246},{11,17},{148,238},{16,25},{143,230},{21,33},{138,222},{26,41},{133,214},{31,49},{128,206},{36,57},{123,198},{41,65},{118,190},{46,73},{113,182},{51,81},{108,174},{56,89},{103,166},{61,97},{98,158},{66,105},{93,150},{71,113},{88,142},{76,121},{83,134},{2,3},{157,252},{7,11},{152,244},{12,19},{147,236},{17,27},{142,228},{22,35},{137,220},{27,43},{132,212},{32,51},{127,204},{37,59},{122,196},{42,67},{117,188},{47,75},{112,180},{52,83},{107,172},{57,91},{102,164},{62,99},{97,156},{67,107},{92,148},{72,115},{87,140},{77,123},{82,132},{3,4},{156,251},{8,12},{151,243},{13,20},{146,235},{18,28},{141,227},{23,36},{136,219},{28,44},{131,211},{33,52},{126,203},{38,60},{121,195},{43,68},{116,187},{48,76},{111,179},{53,84},{106,171},{58,92},{101,163},{63,100},{96,155},{68,108},{91,147},{73,116},{86,139},{78,124},{81,131},{4,6},{155,249},{9,14},{150,241},{14,22},{145,233},{19,30},{140,225},{24,38},{135,217},{29,46},{130,209},{34,54},{125,201},{39,62},{120,193},{44,70},{115,185},{49,78},{110,177},{54,86},{105,169},{59,94},{100,161},{64,102},{95,153},{69,110},{90,145},{74,118},{85,137},{79,126},{80,129}}
// Range: 19, Levels: 192, Bits: 6, Trits: 1, Quints: 0
{{0,0},{191,255},{3,4},{188,251},{6,8},{185,247},{9,12},{182,243},{12,16},{179,239},{15,20},{176,235},{18,24},{173,231},{21,28},{170,227},{24,32},{167,223},{27,36},{164,219},{30,40},{161,215},{33,44},{158,211},{36,48},{155,207},{39,52},{152,203},{42,56},{149,199},{45,60},{146,195},{48,64},{143,191},{51,68},{140,187},{54,72},{137,183},{57,76},{134,179},{60,80},{131,175},{63,84},{128,171},{66,88},{125,167},{69,92},{122,163},{72,96},{119,159},{75,100},{116,155},{78,104},{113,151},{81,108},{110,147},{84,112},{107,143},{87,116},{104,139},{90,120},{101,135},{93,124},{98,131},{1,1},{190,254},{4,5},{187,250},{7,9},{184,246},{10,13},{181,242},{13,17},{178,238},{16,21},{175,234},{19,25},{172,230},{22,29},{169,226},{25,33},{166,222},{28,37},{163,218},{31,41},{160,214},{34,45},{157,210},{37,49},{154,206},{40,53},{151,202},{43,57},{148,198},{46,61},{145,194},{49,65},{142,190},{52,69},{139,186},{55,73},{136,182},{58,77},{133,178},{61,81},{130,174},{64,85},{127,170},{67,89},{124,166},{70,93},{121,162},{73,97},{118,158},{76,101},{115,154},{79,105},{112,150},{82,109},{109,146},{85,113},{106,142},{88,117},{103,138},{91,121},{100,134},{94,125},{97,130},{2,2},{189,253},{5,6},{186,249},{8,10},{183,245},{11,14},{180,241},{14,18},{177,237},{17,22},{174,233},{20,26},{171,229},{23,30},{168,225},{26,34},{165,221},{29,38},{162,217},{32,42},{159,213},{35,46},{156,209},{38,50},{153,205},{41,54},{150,201},{44,58},{147,197},{47,62},{144,193},{50,66},{141,189},{53,70},{138,185},{56,74},{135,181},{59,78},{132,177},{62,82},{129,173},{65,86},{126,169},{68,90},{123,165},{71,94},{120,161},{74,98},{117,157},{77,102},{114,153},{80,106},{111,149},{83,110},{108,145},{86,114},{105,141},{89,118},{102,137},{92,122},{99,133},{95,126},{96,129}}
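
On the decoder side these tables are indexed directly by the BISE integer. A minimal sketch for range 4 (the array and function names are made up; the values are copied from the range 4 decoding table above):

```cpp
#include <cassert>
#include <cstdint>

// Range 4 decoding table from above, split into two arrays indexed by the
// BISE coded integer.
static const uint8_t k_range4_level[6] = { 0, 5, 1, 4, 2, 3 };             // component value in [0,5]
static const uint8_t k_range4_dequant[6] = { 0, 255, 51, 204, 102, 153 };  // 8-bit value

// Hypothetical helper: dequantizes a range 4 BISE integer to 8 bits.
static uint8_t dequant_range4(uint32_t bise)
{
   assert(bise < 6);
   return k_range4_dequant[bise];
}
```

For the example above, the BISE integer 2 maps to level 1 and dequantizes to 51.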


BISE Encoding Example

This section contains a low-level example of UASTC's BISE endpoint encoding method, which is similar to ASTC's.

In BISE coding, each integer value to be coded is divided up into bits and trits, or bits and quints. Trits range between [0,2] and quints [0,4]. 

Let's focus on UASTC mode 0. Mode 0 uses BISE range 19, which has 6 bits and 1 trit per endpoint value. Mode 0 is an opaque RGB mode with a single subset, so there are 6 total components (3 "low" and 3 "high" coordinates which define the single colorspace line's RGB endpoints). Each endpoint component has (2^6)*3 or 192 unique values.

In ASTC/UASTC, endpoint values in ranges using trits or quints must first be converted into quantized integers before BISE packing. Either use the encoding tables above, or see section 23.13 and table 158 in the ASTC specification. Importantly, for trit and quint ranges, these quantized integers are not in monotonic order relative to the dequantized endpoint component values they represent, which can be confusing to developers new to ASTC.

In UASTC, the encoder first sends all the trit or quint values using one code per group, and then it sends the endpoint bit values using one code per endpoint value. This differs from ASTC, which interleaves the trit/quint group values and bits into packets.

Mode 0's BISE encoded endpoint integer values are packed into the 128-bit block like this:

ETQ: 19 8
ETQ: 27 2
EBITS: 29 6
EBITS: 35 6
EBITS: 41 6
EBITS: 47 6
EBITS: 53 6
EBITS: 59 6

Here's an explanation for what this means: The first ETQ ("endpoint-trit-quint") group value was packed into 8-bits at block bit position 19, and the second was packed into 2-bits at block bit position 27. The EBITS ("endpoint bits") values were each coded using 6-bits starting at block bit position 29.

The 6 endpoint components are packed in this order (the same order as ASTC):
RL RH GL GH BL BH

Let's call these 6 quantized integers to be coded E0, E1, E2, E3, E4, and E5, where E0=RL, E1=RH, etc.

First, the bits and trit values for each endpoint integer are computed:

B0=E0 AND 63, T0=FLOOR(E0 / 64)
B1=E1 AND 63, T1=FLOOR(E1 / 64)
etc.

The trit values T0, T1, etc. will all range between [0,2].

Now we have:
0. B0 T0
1. B1 T1
2. B2 T2
3. B3 T3
4. B4 T4
5. B5 T5

The encoder now combines the trits/quints values into groups. For trits, the total number of packed group codes is FLOOR((total_values+4)/5). For quints the total number of packed codes is FLOOR((total_values+2)/3).

Trits are packed in groups of 5 integers at a time into codes of up to 8-bits (because 3*3*3*3*3=243). Quints are packed in groups of 3 integers at a time into codes of up to 7-bits (because 5*5*5=125). 

If the last group is incomplete (like it is in mode 0), it gets packed into the fewest bits necessary to represent the final group value. (Note this differs from how ASTC's BISE works.) See the tables at the end of this section.

Mode 0's trit group values are computed like this:
TRIT_GROUP_VAL0=T0+T1*3+T2*9+T3*27+T4*81
TRIT_GROUP_VAL1=T5

(For quints the multipliers are 5 and 25.)

In mode 0, we only have 6 integers to pack, which corresponds to two groups. The first group has 5 values, and the second incomplete group only has 1 value. The first full group code is sent using 8-bits. The second incomplete group only uses 2-bits because it can only range between [0,2]:

ETQ: 19 8 (TRIT_GROUP_VAL0)
ETQ: 27 2 (TRIT_GROUP_VAL1)

Next, the encoder packs the bit values B0-B5:

EBITS: 29 6 (B0)
EBITS: 35 6 (B1)
EBITS: 41 6 (B2)
EBITS: 47 6 (B3)
EBITS: 53 6 (B4)
EBITS: 59 6 (B5)

The decoder inverts this encoding procedure. It first reads all the group codes, unpacks them into individual trit/quint values, then it combines these trit/quint values with the bit values to compute the endpoint integers.
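
Putting the mode 0 steps together, here's a sketch of the packing and unpacking using the bit positions listed above. The helper names are made up for this example, and write_bits() assumes the target bits in the block are still zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit helpers. Bit 0 is the LSB of byte 0, as in the rest of this post.
static void write_bits(uint8_t block[16], uint32_t pos, uint32_t count, uint32_t value)
{
   for (uint32_t i = 0; i < count; i++)
   {
      const uint32_t bit = pos + i;
      assert(bit < 128);
      if ((value >> i) & 1)
         block[bit >> 3] |= (uint8_t)(1u << (bit & 7));
   }
}

static uint32_t read_bits(const uint8_t block[16], uint32_t pos, uint32_t count)
{
   uint32_t value = 0;
   for (uint32_t i = 0; i < count; i++)
   {
      const uint32_t bit = pos + i;
      assert(bit < 128);
      value |= (uint32_t)((block[bit >> 3] >> (bit & 7)) & 1) << i;
   }
   return value;
}

// Packs mode 0's six quantized endpoint integers E0-E5 (each in [0,191]).
static void pack_mode0_endpoints(uint8_t block[16], const uint32_t e[6])
{
   // TRIT_GROUP_VAL0 = T0 + T1*3 + T2*9 + T3*27 + T4*81
   uint32_t group0 = 0, mul = 1;
   for (uint32_t i = 0; i < 5; i++)
   {
      assert(e[i] < 192);
      group0 += (e[i] >> 6) * mul;
      mul *= 3;
   }
   assert(e[5] < 192);

   write_bits(block, 19, 8, group0);     // ETQ: 19 8
   write_bits(block, 27, 2, e[5] >> 6);  // ETQ: 27 2 (TRIT_GROUP_VAL1 = T5)

   for (uint32_t i = 0; i < 6; i++)
      write_bits(block, 29 + i * 6, 6, e[i] & 63); // EBITS: B0-B5
}

// Inverse of pack_mode0_endpoints().
static void unpack_mode0_endpoints(const uint8_t block[16], uint32_t e[6])
{
   uint32_t group0 = read_bits(block, 19, 8);
   for (uint32_t i = 0; i < 5; i++)
   {
      e[i] = (group0 % 3) << 6; // trit Ti becomes the top part of Ei
      group0 /= 3;
   }
   e[5] = read_bits(block, 27, 2) << 6;

   for (uint32_t i = 0; i < 6; i++)
      e[i] |= read_bits(block, 29 + i * 6, 6);
}
```

As a sanity check, reading 8 bits at position 19 from the mode 0 block in the "Encoded UASTC Block Examples" section gives 86, which splits into the trits 2, 1, 0, 0, 1 for T0-T4, assuming that block follows the layout described here.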

This table shows the number of bits used to code each possible trit group size:

Group Size       Number of Bits
1                2 
2                4 
3                5 
4                7
5                8

And for quint groups:

Group Size       Number of Bits
1                3
2                5
3                7
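
The group count formulas and the two tables above can be combined to compute how many bits a mode spends on trit or quint group codes. A small sketch (the function names are made up for this example):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of packed group codes for total_values integers.
static uint32_t total_groups(uint32_t total_values, bool quints)
{
   return quints ? (total_values + 2) / 3 : (total_values + 4) / 5;
}

// Hypothetical helper: total bits used by all group codes, with the last
// (possibly incomplete) group sized according to the tables above.
static uint32_t total_group_bits(uint32_t total_values, bool quints)
{
   static const uint8_t s_trit_bits[6] = { 0, 2, 4, 5, 7, 8 };  // indexed by group size
   static const uint8_t s_quint_bits[4] = { 0, 3, 5, 7 };       // indexed by group size

   const uint32_t group_size = quints ? 3 : 5;
   const uint32_t full_groups = total_values / group_size;
   const uint32_t remainder = total_values % group_size;

   if (quints)
      return full_groups * 7 + s_quint_bits[remainder];
   return full_groups * 8 + s_trit_bits[remainder];
}
```

For mode 0 (6 values using trits) this gives 2 groups and 8+2=10 bits, which matches the two ETQ fields shown earlier.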



Encoded UASTC Block Examples

1. Encoded UASTC block (mode 4):
{ 0xD3,0xFC,0xA6,0x0,0x0,0x0,0x20,0x9,0x82,0x20,0x48,0xFC,0x0,0x80,0x67,0x0 }

Decoded block (as a 4x4 .PNG):



2. Encoded UASTC block (mode 4):
{ 0x13,0xE8,0x6,0x28,0x68,0x0,0x0,0xC7,0xF1,0xA6,0x69,0x6C,0x66,0x66,0x66,0x0 }

Decoded block (as a 4x4 .PNG):


3. Encoded UASTC block (mode 0):
{ 0x61,0xA4,0xB6,0xA,0x29,0x9,0x1A,0xA0,0x0,0x40,0x30,0xE9,0xD8,0xFF,0xFF,0xFF }

Decoded block (as a 4x4 .PNG):