- If you've spent a lot of time working on lowest-distortion texture encoders, your instincts will probably lead you astray once you start working on rate-distortion encoders. Paradoxically, distortion can increase on a single test even when the overall rate-distortion behavior has improved.
- Always plot your results in 2D (rate vs. distortion) - don't focus so much on distortion.
As a quick check of compressor efficiency, compute and display PSNR/bits_per_texel * scale, or SSIM/bits_per_texel * scale (where scale is something like 10,000; it's just for readability). The higher this value, the more efficient the compressor.
Compute accurate bits_per_texel by actually compressing your output with a real LZ compressor using correct settings. Use the actual LZ compressor you're shipping the data with.
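Here's a minimal sketch of that quick efficiency check. It assumes Python's zlib (Deflate) at level 9 as a stand-in for whatever LZ compressor you actually ship with; the function names and the default scale are just illustrative, not from any particular encoder.

```python
import zlib


def bits_per_texel(encoded: bytes, width: int, height: int, level: int = 9) -> float:
    """Compressed size in bits divided by the number of texels.

    zlib is only a stand-in here: swap in the compressor and settings
    you actually ship the data with.
    """
    compressed = zlib.compress(encoded, level)
    return len(compressed) * 8.0 / (width * height)


def efficiency(quality: float, bpt: float, scale: float = 10000.0) -> float:
    """quality is PSNR (dB) or SSIM. Higher result means more quality per bit."""
    return quality / bpt * scale
```

For example, a 32x32 BC7 texture is 64 blocks of 16 bytes, so you would call `bits_per_texel(encoded_blocks, 32, 32)` and pass the result to `efficiency()` along with your measured PSNR or SSIM.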
- Make sure your PSNR, RMSE, MSE, SSIM, etc. calculations are correct and accurate. ALWAYS compare against an independent 3rd party implementation that is known to be correct/trusted. Write your input and output to .PNG/.TGA/.BMP or whatever and use an external 3rd party image comparison tool as a sanity check.
Otherwise you've possibly messed it up and are in the weeds.
One option is ImageMagick.
Here's how to calculate PSNR, and here's some sample code.
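For reference, a small sketch of the usual MSE/PSNR definitions for 8-bit data (peak value 255), so you have something simple to compare against a 3rd party tool. It assumes both images are passed as flat byte strings of the same length; the function names are mine.

```python
import math


def mse(a: bytes, b: bytes) -> float:
    """Mean squared error over all 8-bit component values."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a: bytes, b: bytes, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(a, b)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(peak * peak / err)
```

RMSE is just `math.sqrt(mse(a, b))`. If you go the ImageMagick route, its `compare` tool with `-metric PSNR` should report a comparable number, though check which channels it averages over before trusting a match.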
- RDO texture encoding+Deflate is basically all about increasing matches above all else. Even adding a single match to a block can be a huge win in a rate distortion sense.
- It's not necessary to worry about how blocks are packed, which modes are supported, or byte alignment. Just focus on byte matches and literals/match estimates.
- Avoid copying around bits. That increases the overall block entropy. Always copy full bytes.
- For more gains you can copy bytes from one offset in a block to another offset. This is way slower to encode but does compress better. I removed this option from bc7enc_rdo because it was so much slower.
- You don't need a huge window to get large gains. Even 64-512 byte windows are fine.
- You don't need an accurate LZ simulator to make a workable high quality encoder.
Although, I needed one to figure all this out.
- Use an already working RDO encoder as a baseline (even a shitty one). Plot its average R-D curve across a range of settings/images. Go from there.
- By default, a high quality texture encoding will consist of mostly literals. Just focus on inserting a single match into each block from one of the previously encoded blocks. Use the Lagrangian multiplier method (j=MSE*smooth_block_scale+bits*lambda) to pick the best one.
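A rough sketch of that selection step, using the j=MSE*smooth_block_scale+bits*lambda formula from above. The `Trial` type and function names are hypothetical; in a real encoder the MSE would come from decoding each trial block and the bits from your literal/match estimates.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Trial:
    name: str
    mse: float   # MSE of the decoded trial block vs. the source block
    bits: float  # estimated LZ bits needed to code the trial block


def lagrangian_cost(t: Trial, lam: float, smooth_block_scale: float = 1.0) -> float:
    """j = MSE * smooth_block_scale + bits * lambda."""
    return t.mse * smooth_block_scale + t.bits * lam


def pick_best(trials: Iterable[Trial], lam: float, smooth_block_scale: float = 1.0) -> Trial:
    """Return the trial with the lowest Lagrangian cost."""
    return min(trials, key=lambda t: lagrangian_cost(t, lam, smooth_block_scale))
```

With lambda at 0 this always picks the lowest-MSE trial (usually the all-literals block). Raising lambda makes cheaper-to-code trials win, and a larger smooth_block_scale pushes back toward low distortion.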
- Use Matt Mahoney's "fv" tool to visualize the entropy of your encoded output data:
http://www.mattmahoney.net/dc/fv.cpp
- You can copy a full block (which is like VQ) or partial byte sequences from one block to another. It's possible that a match can partially cross endpoints and selectors. Just decode the block, calculate MSE, estimate bits, and then evaluate the Lagrangian formula.
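To make the trial generation concrete, here's a sketch that enumerates candidate blocks by copying whole-byte runs from previously encoded blocks into the same byte offsets of the current block. This is my own illustration, not code from bc7enc_rdo: the function name is hypothetical, and the minimum run of 3 bytes is an assumption based on Deflate's minimum match length.

```python
from typing import Iterator, Sequence, Tuple


def same_offset_trials(
    prev_blocks: Sequence[bytes], block: bytes, min_len: int = 3
) -> Iterator[Tuple[int, int, int, bytes]]:
    """Yield (src_index, offset, length, trial_block) candidates.

    Each trial replaces bytes [offset, offset + length) of `block` with the
    bytes at the same offsets in a previously encoded block. Only full bytes
    are copied, never individual bits.
    """
    n = len(block)
    for src_index, src in enumerate(prev_blocks):
        for ofs in range(n):
            for length in range(min_len, n - ofs + 1):
                trial = block[:ofs] + src[ofs:ofs + length] + block[ofs + length:]
                if trial != block:  # skip copies that change nothing
                    yield src_index, ofs, length, trial
```

Each trial block would then be decoded, scored for MSE, given a bit estimate, and ranked with the Lagrangian formula. To keep the search window small, pass in only the last few blocks (for example, the previous 4 to 32 blocks for 16-byte blocks).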
- Plot rate distortion curves (PSNR or SSIM vs. bits/texel) for various lambdas and encoder settings. Focus on increasing the PSNR per bit (move the curve up and left).
- You must do something about smooth/flat blocks. Their MSEs are too low relative to the visual impact they have when they get distorted. One solution is to compute the max standard deviation of any component and use a linear function of that to scale block/trial MSE.
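One way the smooth block scaling could look, as a sketch only. The linear ramp, the `max_scale` and `std_dev_cutoff` parameters, and their default values are placeholders I picked for illustration; they are not taken from bc7enc_rdo and would need tuning against real images.

```python
import statistics
from typing import Sequence, Tuple


def max_component_std_dev(texels: Sequence[Tuple[int, ...]]) -> float:
    """Largest per-component population std dev over one block's texels."""
    return max(statistics.pstdev(channel) for channel in zip(*texels))


def smooth_block_scale(
    max_std_dev: float, max_scale: float = 10.0, std_dev_cutoff: float = 18.0
) -> float:
    """Linear ramp: max_scale for perfectly flat blocks, 1.0 at or above the cutoff."""
    t = min(max(max_std_dev / std_dev_cutoff, 0.0), 1.0)
    return max_scale + (1.0 - max_scale) * t
```

The result is the smooth_block_scale factor that multiplies a trial's MSE, so distortion in flat blocks costs more than the same distortion in busy blocks.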
- Before developing anything more complex than the technique used in bc7enc_rdo (the byte-wise ERT), get this technique working and tuned first. You'll be surprised how challenging it can be to actually improve it.
- Nobody will trust or listen to you when you claim your encoder is better in some way, even if you show them graphs. There are just too many ways to either mess up or bias a benchmark. You need a trusted 3rd party to independently benchmark and validate your encoder vs. other encoders.
The people at Unity have been filling this role recently. (Which makes sense because they integrate a lot of texture encoders into Unity.)