Richard Geldreich's Blog: First LZHAM iOS stats with Unity asset bundle data

Wednesday, January 21, 2015

Got everything (both the compressor and decompressor) working on iOS. It was surprisingly easy: I had one misaligned load to deal with in the compressor's match finder, caused by an #ifdef problem.

I combined 3 of our larger Unity asset bundles into a single .TAR file. Here are the current results on my iPhone 4 (800 MHz A4 CPU, 512MB RAM):

LZHAM Compressed from 15209984 to 4999552 bytes

LZHAM Comp time: 112.771710, BPS: 134874.110168

LZHAM Decomp time: 0.895846, BPS: 16978346.767638

For compression, I used a 16MB dictionary, highest compression (level 4) with normal parsing. Compression is slow, but LZHAM is designed for offline use so as long as it works at all I'm not sweating it for now.

Decompression is around 47 cycles per byte (800 MHz divided by ~17.0M bytes/sec) on these bundle files, which contain a variety of Unity asset data.
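For anyone who wants to redo the cycles/byte arithmetic, here is a minimal sketch. It assumes the A4 runs at its nominal 800 MHz for the whole run (no throttling), and the helper name `cycles_per_byte` is just something made up for this illustration.

```python
def cycles_per_byte(clock_hz: float, bytes_per_second: float) -> float:
    """Estimate CPU cycles spent per byte of uncompressed data."""
    return clock_hz / bytes_per_second


# Assumption: nominal 800 MHz A4 clock, as quoted for the iPhone 4 above.
A4_CLOCK_HZ = 800e6

# Taken from the "LZHAM Decomp time" line above.
LZHAM_DECOMP_BPS = 16978346.767638

lzham_cpb = cycles_per_byte(A4_CLOCK_HZ, LZHAM_DECOMP_BPS)
print(f"LZHAM decompression: ~{lzham_cpb:.0f} cycles/byte")
```

The same helper works for any other codec's BPS figure measured on the same device.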

Now the LZMA stats (level 9, 16MB dictionary, default tuning options):

LZMA compressed from 15209984 to 4726211 bytes

LZMA Comp time: 41.805776, BPS: 363824.942238

LZMA Decomp time: 1.993880, BPS: 7628334.723455

LZMA decompression was ~105 cycles/byte.
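The "Comp time / BPS" lines above are just wall time and uncompressed bytes per second. Below is a rough sketch of that kind of measurement. It uses Python's standard `lzma` module as a stand-in, not the native LZMA build that produced the numbers in this post, so absolute timings will not match. The function names `benchmark_lzma` and `format_stats` are hypothetical.

```python
import lzma
import time


def benchmark_lzma(data: bytes, preset: int = 9,
                   dict_size: int = 16 * 1024 * 1024) -> dict:
    """Time one LZMA compress/decompress round trip of `data`."""
    # A single LZMA1 filter with an explicit dictionary size, written in the
    # legacy .lzma container (FORMAT_ALONE).
    filters = [{"id": lzma.FILTER_LZMA1, "preset": preset, "dict_size": dict_size}]

    start = time.perf_counter()
    packed = lzma.compress(data, format=lzma.FORMAT_ALONE, filters=filters)
    comp_time = time.perf_counter() - start

    start = time.perf_counter()
    unpacked = lzma.decompress(packed, format=lzma.FORMAT_ALONE)
    decomp_time = time.perf_counter() - start

    if unpacked != data:
        raise ValueError("round trip mismatch")

    # Both BPS figures are relative to the uncompressed size.
    return {
        "original_size": len(data),
        "compressed_size": len(packed),
        "comp_time": comp_time,
        "comp_bps": len(data) / comp_time,
        "decomp_time": decomp_time,
        "decomp_bps": len(data) / decomp_time,
    }


def format_stats(name: str, stats: dict) -> str:
    """Render the stats in the same shape as the log lines in this post."""
    return "\n".join([
        f"{name} compressed from {stats['original_size']} to {stats['compressed_size']} bytes",
        f"{name} Comp time: {stats['comp_time']:.6f}, BPS: {stats['comp_bps']:.6f}",
        f"{name} Decomp time: {stats['decomp_time']:.6f}, BPS: {stats['decomp_bps']:.6f}",
    ])
```

Calling `print(format_stats("LZMA", benchmark_lzma(data)))` on the bytes of a test file gives three lines in the same layout as above. A single run like this is noisy; repeating it and taking the best time would be a reasonable refinement.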

So LZHAM decompresses this data 2.2x faster. Its ratio is slightly lower (the output is about 5.8% larger than LZMA's), but this can be somewhat compensated for by enabling LZHAM's better parser and compressing offline (with a multicore desktop CPU). This helps a little: 4960935 bytes. By using more frequent Huff table updates (level 3 vs. the default 8) and extreme parsing, I get 4942383 compressed bytes, but decompression is ~18% slower. I'm going to graph all of this data next.
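Here is the arithmetic behind that comparison as a short sketch, using only the sizes and times quoted in this post. The variable names and the "size gap vs. LZMA" framing are my own shorthand for illustration.

```python
LZMA_SIZE = 4726211        # LZMA, level 9, 16MB dictionary
LZMA_DECOMP_S = 1.993880
LZHAM_DECOMP_S = 0.895846  # LZHAM, level 4, normal parsing

# LZHAM compressed sizes for the three configurations mentioned above.
lzham_sizes = {
    "normal parsing": 4999552,
    "better parser": 4960935,
    "extreme parsing, faster table updates": 4942383,
}

# Decompression speedup of LZHAM over LZMA on the same data.
speedup = LZMA_DECOMP_S / LZHAM_DECOMP_S

# How much larger each LZHAM output is than LZMA's, as a fraction.
size_gap = {name: size / LZMA_SIZE - 1.0 for name, size in lzham_sizes.items()}

print(f"decompression speedup: {speedup:.1f}x")
for name, gap in size_gap.items():
    print(f"{name}: {gap * 100:.1f}% larger than LZMA")
```

The gap shrinks with the stronger parsers but does not close, so the trade here is a few percent of size for roughly twice the decompression speed.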

For reference, my iPhone 4's CPU is ~13.6x slower for compression and ~8.5x slower for decompression vs. my Core i7 3.3 GHz desktop CPU (comparing absolute wall time, no multithreading, same settings and file data, etc.).

Update: Here are the testing results after compressing & decompressing all of our uncompressed asset bundles on my iPhone 4. I limited LZHAM's compressor to a dictionary size of 8MB, less frequent table updating (table update speed of 12 vs the default 8), and normal parsing, which limited its ratio a bit vs. running it on desktop.

LZHAM is slower on a few files totaling ~0.2% of the data (~320k out of 172MB). Beyond those, the speedup rises to between 1.8x and 4.8x. (Note I'm currently regenerating this graph so LZHAM's dictionary size matches LZMA's.)

1. Red: Speedup, Blue: LZMA compressed size, sorted by compressed size.

2. Red: Speedup, Blue: LZMA_comp_size/LZHAM_comp_size, sorted by speedup.

2 comments:

NeARAZ, January 22, 2015 at 10:45 PM
Would be very interesting to see results on a more modern/popular iOS CPU. The CPUs there have improved quite a bit since A4 :) A7 (iPhone4S) seems to be *the* most popular right now based on our stats http://stats.unity3d.com/mobile/device-ios.html