LZHAM: Faster decompression by breaking out literals/delta literals/matches to separate entropy coded blocks, switch to new entropy decoder. Other ideas: multithreaded entropy decoding, combine multiple binary symbols into single non-binary symbols.
ZSTD: Refine current implementation: stronger compressor, profile and optimize all loops.
BROTLI: Brotli's place on the decompression frontier is currently too fuzzy on the ratio axis, so an easy prediction is the compressor will get tightened up. Its current entropy coding design may have trouble expanding to the right much further. (The same situation as LZHAM. The fast entropy coding space is moving rapidly right now.)
Note the circle is rough. I tried to roughly match Brotli's ratio in region 4, but it could go higher to be closer to LZMA/LZHAM.
Some ideas: blocked interleaved entropy coding with SIMD optimizations, entropy decoding in parallel with decompression, near-optimal parsing with best of X arrivals, cloud compression to search through hundreds of compression options, universal preprocessing, LZMA-like instruction set/state machine, rep matches with relative distances, partial matches with compressed fixup sideband.
Until very recently I thought codecs in this region were impossible. (Now, just incredibly challenging!)
Another interesting place to target is to the direct right of region 3 (or directly above region 2). Target this spot too and you've redefined the entire frontier.
Interesting Recent Developments
- Key paper showing several very promising paths forward in the entropy decoding space:
- Squash benchmark - Easily explore the various frontiers with a variety of test data and CPU's.
- Squash library - Universal compression library wrapping over 30 codecs behind a single API with streaming support.
- Zstd - Promising new codec showing interesting ways of breaking up the usual monolithic decoder loop into separate blocks (i.e. it decouples entropy decoding out from the main decoder loop)