It was conducted on a 20 core Xeon workstation across 31 test images (first 24 are the kodim images). Both use AVX instructions. The quality metric is RGB average PSNR, perceptual mode disabled. The PSNR's below are averages across the entire set. The timings are the total amount of CPU time used only calling the encoder functions (across all threads). OpenMP was used for threading, and each encoder was called with 64 blocks per function call.
I'm currently focusing on ispc_texcomp's basic and slow profiles:
ispc_texcomp:
basic profile: 100.5 secs, 46.54 dB, .4631 dB/sec
slow profile: 355.29 secs, 46.77 dB, .1316 dB/sec
uber 0: 56.7 secs, 46.49 dB, .8199 dB/sec
uber 1: 86.4 secs, 46.72 dB, .5407 dB/sec
uber 2: 129.1 secs, 46.79 dB, .3624 dB/sec
uber 2 (2 refinement passes, 16 max 1,3 partitions): 161.9 secs, 46.84 dB, .2893 dB/sec
uber 3 (2 refinement passes, 32 max 1,3 partitions, pbit search): 215.2 secs, 46.91 dB, .2180 dB/sec
uber 4 (2 refinement passes, 64 max 1,3 partitions, pbit search): : 292.5 secs, 46.96 dB, .1605 dB/sec
The dB/sec. values are a simple measure of encoder efficiency. ispc_texcomp's slow profile at .1315 dB/sec. is working very hard for very little quality per unit time compared to its basic profile. The efficiency of both encoders decreases as the quality is improved, but ispc_texcomp falls off very rapidly above basic and mine falls off later. I believe a whole texture encoder like etc2comp's can more efficiently get through the quality barrier here.
What this boils down to: If you use ispc_texcomp, definitely avoid the slow profile (the tiny gain in quality isn't worth it). And it's definitely possible to compete against ispc_texcomp using plain RGB metrics.
Hi Rich, your result is amazing. Is there any chance you can share your code?
ReplyDeleteThanks Hai. I opened sourced a strong subset of my C prototype encoder:
ReplyDeletehttp://richg42.blogspot.com/2018/04/new-bc7-encoder-open-sourced.html