Richard Geldreich's Blog: Announcing CPNG, "Compatible Network Graphics", a backwards compatible fork of PNG

CPNG ("Compatible Network Graphics") is a 100% backwards compatible fork of the roughly 30-year-old PNG image format, which is still thoroughly, and probably forever, stuck in the 1990s. I've been quietly designing, prototyping, and shipping CPNG's key features on GitHub over the past couple of years.

Why continue messing with PNG at all? Because if you can figure out how to squeeze new features into PNG in a backwards compatible way, the result is instantly compatible with all browsers, OSes, engines, etc. This is valuable.

I realized that as long as my software is significantly faster at encoding/decoding (>10x encode, >2-3x decode, per thread) and is 100% backwards compatible, it can gather the momentum and library adoption needed to also add some badly needed new features: faster encoding/decoding by constraining and simplifying the Deflate stream to operate on pixels rather than bytes, SIMD encoding, multithreading so performance can scale up on modern systems, and real HDR pixels (FP16 IEEE half float and LOGLUV32) so the format scales to modern and future HDR monitors and tool chains.

Crucially, the CPNG data stream must be entirely backwards compatible with the existing PNG format, much like how color TV broadcasts still worked on B&W TVs. New features can be added, as long as existing decoders ignore them and return a viewable image:

1. Constrained Deflate for huge (10-25x) encoding speedups vs. most other libraries, and 2-3x decoding speedups: this has shipped already in fpng/fpnge.

I documented the Constrained Deflate feature utilized by fpng here:

https://github.com/richgel999/fpng/wiki/fdEC-PNG-chunk-definition

Other encoder/decoder authors can utilize this feature today, just by following this spec. CPNG will leverage this approach at first, before moving to a SIMD encoding approach (like Google's fpnge utilizes).
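As a rough illustration of why this stays backwards compatible, the sketch below walks the chunks of a PNG file and checks for an `fdEC` chunk. The chunk layout (length, type, data, CRC-32) and the rule that a lowercase first letter marks a chunk as ancillary come from the PNG specification; the helper names `iter_chunks`, `is_ancillary`, `has_chunk`, and `make_chunk` are made up for this example, and nothing here interprets the `fdEC` payload, which is defined by the wiki page above.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_chunk(ctype: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 over type+data."""
    crc = zlib.crc32(ctype + data)
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)

def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk, verifying each CRC-32."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(ctype + data) != crc:
            raise ValueError("bad CRC in chunk %r" % ctype)
        yield ctype, data
        pos += 12 + length
        if ctype == b"IEND":
            break

def is_ancillary(ctype: bytes) -> bool:
    """Bit 5 of the first type byte (lowercase letter) means 'safe to ignore'."""
    return bool(ctype[0] & 0x20)

def has_chunk(png: bytes, wanted: bytes) -> bool:
    return any(ctype == wanted for ctype, _ in iter_chunks(png))
```

A decoder that understands the constrained stream could call `has_chunk(data, b"fdEC")` to pick its fast path, while a legacy decoder simply skips the chunk because `is_ancillary(b"fdEC")` is true.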

2. Multithreaded encoding/decoding using multiple IDAT chunks: Apple does this already.

We're putting a seek (or offset) table into the CPNG ancillary chunk. The image will be optionally parallel encodable and decodable in strips.
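One way strips can be made independently decodable while legacy decoders still see a single ordinary zlib stream is to compress each strip as raw Deflate with its own encoder and end every strip except the last with a full flush, so each piece is byte aligned and never references earlier data. The sketch below only illustrates that idea; it is not the CPNG chunk layout, and `encode_strips`, `decode_strips`, and the `offsets` list (standing in for the seek table) are hypothetical names. It assumes at least one non-empty strip.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def _deflate_strip(job):
    strip, is_last = job
    enc = zlib.compressobj(6, zlib.DEFLATED, -15)  # -15 = raw Deflate, no header
    body = enc.compress(strip)
    # Z_FULL_FLUSH byte-aligns the output and leaves the stream open;
    # Z_FINISH on the last strip emits the final Deflate block.
    return body + enc.flush(zlib.Z_FINISH if is_last else zlib.Z_FULL_FLUSH)

def encode_strips(strips):
    """Return (zlib_stream, offsets); offsets[i] is where strip i starts."""
    jobs = [(s, i == len(strips) - 1) for i, s in enumerate(strips)]
    with ThreadPoolExecutor() as pool:
        pieces = list(pool.map(_deflate_strip, jobs))
    offsets, pos = [], 2  # the 2-byte zlib header comes first
    for piece in pieces:
        offsets.append(pos)
        pos += len(piece)
    adler = zlib.adler32(b"".join(strips))
    stream = b"\x78\x9c" + b"".join(pieces) + adler.to_bytes(4, "big")
    return stream, offsets

def _inflate_strip(piece):
    return zlib.decompressobj(-15).decompress(piece)

def decode_strips(stream, offsets):
    """Inflate every strip independently, using the offsets as a seek table."""
    ends = offsets[1:] + [len(stream) - 4]  # last 4 bytes are the Adler-32
    pieces = [stream[a:b] for a, b in zip(offsets, ends)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(_inflate_strip, pieces))
```

A legacy decoder can still pass the whole stream to `zlib.decompress`. Note that this covers only the Deflate layer: PNG's Up, Average, and Paeth filters reference the previous scanline, so a real design also has to decide how the first row of each strip is filtered.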

This also parallelizes the painful Adler-32 and CRC-32 steps. To remain backwards compatible, the CPNG format still has to (somewhat ridiculously) compute both checksums, putting it at a disadvantage vs. other modern image formats. (Who thought putting two checksums into an image format was a good idea, anyway?)
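The checksums can be parallelized because a checksum of concatenated data can be derived from per-strip checksums. As a minimal sketch for Adler-32: the value for `a + b` follows from the two partial values and the length of `b`. `adler32_combine` below is a hand-written helper that mirrors the arithmetic of the zlib C function of the same name, since Python's `zlib` module does not expose it. CRC-32 has an analogous, more involved combine step that is not shown here.

```python
import zlib

ADLER_BASE = 65521  # largest prime below 2**16, as used by Adler-32

def adler32_combine(adler1: int, adler2: int, len2: int) -> int:
    """Adler-32 of (block1 + block2) from each block's Adler-32 and len(block2)."""
    a1, b1 = adler1 & 0xFFFF, (adler1 >> 16) & 0xFFFF
    a2, b2 = adler2 & 0xFFFF, (adler2 >> 16) & 0xFFFF
    # The low half is 1 + sum of bytes, so the shared "+1" is removed once.
    a = (a1 + a2 - 1) % ADLER_BASE
    # The high half sums every running low value; block2's running values are
    # each offset by (a1 - 1) when block1 precedes it.
    b = (b1 + b2 + len2 * (a1 - 1)) % ADLER_BASE
    return (b << 16) | a

def adler32_of_strips(strips) -> int:
    """Fold per-strip checksums (computable in parallel) into one value."""
    total = zlib.adler32(b"")  # Adler-32 of empty input is 1
    for strip in strips:
        total = adler32_combine(total, zlib.adler32(strip), len(strip))
    return total
```

Each `zlib.adler32(strip)` call is independent of the others, so those calls can run on separate threads; only the cheap fold at the end is sequential.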

3. FP16 (IEEE 754-2008 half float) support, but with a lossless tone mapped fallback. One approach has already been demonstrated in the png16 repo. This feature exploits PNG's already built-in support for 16-bit pixels.

Here are some encoded 48-bit PNG example images. These images are losslessly tone mapped:

https://github.com/richgel999/png16/tree/main/bin

These PNG files are tone mapped 16-bit images, but internally they are also standard half float images. A small 256 entry lookup table is stored in an ancillary chunk, so a CPNG HDR decoder can retrieve the half float pixels from the 16-bit unsigned pixels. See the png16 repo source for more details on how this approach works.
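To make the idea of a 256 entry table concrete, here is one hypothetical way such a mapping could work. It is not taken from the png16 source, which remains the authority on the real scheme. The assumption is that the table permutes only the high byte of each half float bit pattern and leaves the low byte alone, so the stored 16-bit value is a stretched, viewable version of the data and the decoder needs only the inverse table. It also assumes non-negative, finite pixel values, for which half float bit patterns sort in the same order as the values they represent. All function names are made up for this example.

```python
import struct

def half_to_bits(x: float) -> int:
    """Bit pattern of x as an IEEE 754 half float (struct format 'e')."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def bits_to_half(bits: int) -> float:
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def build_tables(all_bits):
    """Spread the high bytes actually used across 0..255, keeping their order.

    Returns (forward, inverse): forward is used by the encoder, inverse is the
    256 entry table a decoder would read from the ancillary chunk.
    """
    used = sorted({b >> 8 for b in all_bits})
    n = len(used)
    forward, inverse = {}, [0] * 256
    for rank, hi in enumerate(used):
        # Step is 255/(n-1) >= 1, so distinct ranks never collide.
        out = (rank * 255) // (n - 1) if n > 1 else 255
        forward[hi] = out
        inverse[out] = hi
    return forward, inverse

def to_legacy_u16(bits: int, forward) -> int:
    """Value stored in the ordinary 16-bit PNG sample."""
    return (forward[bits >> 8] << 8) | (bits & 0xFF)

def from_legacy_u16(stored: int, inverse) -> int:
    """Recover the exact half float bit pattern from the stored sample."""
    return (inverse[stored >> 8] << 8) | (stored & 0xFF)
```

Because the mapping is a bijection on the bytes in use, decoding is a single table lookup per sample and recovers every half float bit for bit.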

HDR CPNG images must remain viewable, in some reasonable way, by all browsers and OS's that only support PNG, even if they are HDR images. This requires a lossless tonemap operator of some sort, so the legacy PNG image is viewable as an SDR image, but the HDR data can always be losslessly recovered using a simple and very fast procedure.

4. LOGLUV32 - This is Ward's very space efficient, perceptually lossless HDR pixel format:

"The LogLuv Encoding for Full Gamut, High Dynamic Range Images, Gregory Ward Larson, Silicon Graphics, Inc."

The L16 portion is stored in the legacy PNG image as a 16-bit grayscale image, so the basic image remains viewable as an SDR PNG file. We'll have to losslessly tonemap these log luminance pixels, using a technique similar to the approach used for FP16 pixels.

The U8V8 portion (color) is stored in ancillary chunk(s), compressed as another 8,8 image. For alpha support, the CPNG file can exploit the LA16 vs. L16 portion of the PNG file. These ancillary chunks will be ignored by old PNG readers.

The separate chunk for UV color data is needed, otherwise I don't see how the 16-bits of UV color can be placed into the PNG without older viewers displaying a totally unviewable image. Older PNG readers viewing/previewing just tone mapped luminance should be fine.
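For reference, the 32-bit LogLuv split described above can be sketched as follows, using the quantization formulas from Ward's paper: Le = floor(256 (log2 Y + 64)), ue = floor(410 u'), ve = floor(410 v'). The function names are hypothetical, the sign bit of L16 is ignored (non-negative luminance is assumed), and treating Le = 0 as black is a convention chosen for this sketch. Since the LOGLUV32 feature is still being designed, none of this reflects a final CPNG layout, and the lossless tonemap of the luminance plane is not shown.

```python
import math

def logluv32_encode(X: float, Y: float, Z: float):
    """Split one CIE XYZ pixel into (Le, ue, ve).

    Le is the 15-bit log luminance that would go in the grayscale PNG plane;
    (ue, ve) is the 8,8 chroma pair that would go in the ancillary chunk.
    """
    denom = X + 15.0 * Y + 3.0 * Z
    if Y <= 0.0 or denom <= 0.0:
        return 0, 0, 0  # black (assumed convention)
    Le = int(math.floor(256.0 * (math.log2(Y) + 64.0)))
    Le = min(max(Le, 0), 0x7FFF)
    u = 4.0 * X / denom  # CIE 1976 u'
    v = 9.0 * Y / denom  # CIE 1976 v'
    ue = min(max(int(410.0 * u), 0), 255)
    ve = min(max(int(410.0 * v), 0), 255)
    return Le, ue, ve

def logluv32_decode(Le: int, ue: int, ve: int):
    """Reconstruct XYZ from bin centers; perceptually lossless, not bit exact."""
    if Le == 0:
        return 0.0, 0.0, 0.0
    Y = 2.0 ** ((Le + 0.5) / 256.0 - 64.0)
    u = (ue + 0.5) / 410.0
    v = (ve + 0.5) / 410.0
    X = Y * 9.0 * u / (4.0 * v)
    Z = Y * (12.0 - 3.0 * u - 20.0 * v) / (4.0 * v)
    return X, Y, Z
```

A legacy viewer would see only the luminance plane, while a CPNG-aware decoder would rejoin it with the chroma pair via something like `logluv32_decode`.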

Half float CPNG is an interchangeable alternative to ILM's .EXR format. So why not just use .EXR? .EXR supports too many compression formats, so it's not interchangeable in practice. It also doesn't support a lossless tone mapped fallback, which we can do with CPNG, and browsers and most OSes can't view/preview .EXR files (but they can view/preview .PNG).

The full OpenEXR repo is many dozens of source files, and the limited "TinyEXR" codec is 10,000 source code lines (!), not including miniz/Deflate. TinyEXR can only read a fraction of the .EXR files used in the wild, because many .EXR images utilize compression methods it doesn't support.

LOGLUV32 gets CPNG to true HDR, but avoids the file size bloat. All the other claimed "HDR" solutions I've heard about for PNG require storing 16-bit pixels, even if only 10-12 bits are actually utilized. This is wasteful, and I think many won't bother with it. I'm still designing the LOGLUV32 feature.

I'm currently collaborating with another PNG library author to support half float PNGs. The png16 repo internally leverages lodepng's support for 16-bit images.

At the end of the day, as long as we remain backwards compatible, the library authors are the ones who actually control the format.


Tuesday, December 5, 2023
