I combined together 3 of our larger Unity asset bundles together into a single .TAR file and here are the current results on my iPhone 4 (800 MHz A4 CPU - 512MB RAM):
LZHAM Compressed from 15209984 to 4999552 bytes
LZHAM Comp time: 112.771710, BPS: 134874.110168
LZHAM Decomp time: 0.895846, BPS: 16978346.767638
Decompression is around 47 cycles per byte on these bundles files, which contain a variety of Unity asset data.
Now LZMA stats (level 9 16MB dictionary, default tuning options):
LZMA compressed from 15209984 to 4726211 bytes
LZMA Comp time: 41.805776, BPS: 363824.942238
LZMA Decomp time: 1.993880, BPS: 7628334.723455LZMA decompression was ~105 cycles/byte.
So LZHAM decompresses this data 2.2x faster. Its ratio is slightly lower, but this can be somewhat compensated for by enabling LZHAM's better parser and compressing offline (with a multicore desktop CPU). This helps a little: 4960935 bytes. By using more frequent Huff table updates (level 3 vs. the default 8) and extreme parsing, I get 4942383 compressed bytes, but decompression is ~18% slower. I'm going to graph all of this data next.
For reference, my iPhone 4's CPU is ~13.6x slower for compression and ~8.5x slower for decompression vs. my Core i7 3.3 GHz desktop CPU (comparing absolute wall time, no multithreading, same settings and file data, etc.).
Update: Here are the testing results after compressing & decompressing all of our uncompressed asset bundles on my iPhone 4. I limited LZHAM's compressor to a dictionary size of 8MB, less frequent table updating (table update speed of 12 vs the default 8), and normal parsing, which limited its ratio a bit vs. running it on desktop.
LZHAM is slower on a few files totaling ~.2% of the data (~320k out of 172MB), from there it rises to between 1.8x-4.8x faster. (Note I'm currently regenerating this graph so LZHAM's dictionary size matches LZMA's.)
1. Red=Speedup, Blue=LZMA compressed size, sorted by compressed size.
2. Red: Speedup, Blue: LZMA_comp_size/LZHAM_comp_size, sorted by speedup.
Would be very interesting to see results on a more modern/popular iOS CPU. The CPUs there have improved quite a bit since A4 :) A7 (iPhone4S) seems to be *the* most popular right now based on our stats http://stats.unity3d.com/mobile/device-ios.html
ReplyDeleteAgreed - I've just hooked up LZHAM v1.0 to our first product. I'll be working on releasing it to github this weekend, then I'll be testing it (and optimizing the inner decode loop) on more devices.
Delete