Wednesday, February 4, 2015

7zip 9.38 custom codec plugin for LZHAM 1.0

Igor Pavlov and the devs at encode.ru gave me some pointers on how to create a custom 7z codec DLL. Here's the first 32-bit build of the 7zip custom codec plugin DLL, compatible with 7zip 9.38 beta x86 (note I've updated this with one decompressor bugfix since the first release a couple of days ago):

http://www.tenacioussoftware.com/7z_lzhamcodec_test3.7z

http://www.tenacioussoftware.com/7z_...c_test3_src.7z

To use this: install 9.38 beta (I got it from http://sourceforge.net/projects/seve...es/7-Zip/9.38/), manually create a subdirectory named "Codecs" in your install directory (i.e. "C:\Program Files (x86)\7-Zip\Codecs"), and extract LzhamCodec.DLL into this directory. If you run "7z i" you should see this:

Codecs:
Lib ID Name
0 C 303011B BCJ2
0 C 3030103 BCJ
0 C 3030205 PPC
...
0 C 30401 PPMD
0 40301 Rar1
0 40302 Rar2
0 40303 Rar3
0 C 6F10701 7zAES
0 C 6F00181 AES256CBC
1 C 90101 LZHAM

The ID used in this DLL (90101) may change - I'm working this out with Igor.

This version's command line parameters are similar to 7z-lzham 9.20:

-mx=[0,9] - Controls LZHAM compression level, dictionary size, and extreme parsing. -mx=9 enables the best-of-4 ("extreme") parser, which can be very slow.

Details:
-mx=0: lzham_level 0, dict_size_log2 18
-mx=1: lzham_level 0, dict_size_log2 20 (fastest)
-mx=2: lzham_level 1, dict_size_log2 21
-mx=3: lzham_level 2, dict_size_log2 21
-mx=4: lzham_level 2, dict_size_log2 22
-mx=5: lzham_level 3, dict_size_log2 22
-mx=6: lzham_level 3, dict_size_log2 23
-mx=7: lzham_level 4, dict_size_log2 25
-mx=8: lzham_level 4, dict_size_log2 26 (slowest - the default)
-mx=9: lzham_level 4, dict_size_log2 26, best-of-4 parsing (even slower)
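For reference, the -mx mapping above can be sketched as a small lookup table. This is a hypothetical helper for illustration, not code from the actual plugin:

```python
# Sketch of the -mx -> LZHAM settings mapping described above.
# Each entry is (lzham_level, dict_size_log2, extreme_parsing).
MX_TO_LZHAM = {
    0: (0, 18, False),
    1: (0, 20, False),  # fastest
    2: (1, 21, False),
    3: (2, 21, False),
    4: (2, 22, False),
    5: (3, 22, False),
    6: (3, 23, False),
    7: (4, 25, False),
    8: (4, 26, False),  # slowest normal parsing - the default
    9: (4, 26, True),   # best-of-4 ("extreme") parsing - even slower
}

def lzham_settings(mx):
    """Return (lzham_level, dict_size_log2, extreme_parsing) for a 7z -mx level."""
    if mx not in MX_TO_LZHAM:
        raise ValueError("7z -mx must be in [0, 9]")
    return MX_TO_LZHAM[mx]
```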

-mt=on/off or integer - Controls threading (default=on)

-ma=[0,1] - Deterministic parsing. 1=on (default=off)

-md=64K, etc. - Overrides the dictionary size. If you don't set this, the dictionary size is inferred from the -mx=# setting.

Example usage:

7z -m0=LZHAM -mx=8 a temp *.dll

To use LZHAM from the GUI, you can put custom parameters in the "Parameters" field (without -m prefixes), e.g. "0=bcj 1=lzham x=8". (I think you can only specify algorithm and level options here.) Note that if you set the Compression Level pulldown to "Ultra", LZHAM will use best-of-4 (extreme) parsing and be pretty slow, so the highest you probably want to use is "Maximum":

Sunday, February 1, 2015

LZHAM 1.0 integrated into 7zip command line and GUI

I integrated the LZHAM codec into the awesome open source 7zip archiver a few years ago for testing purposes, but I was hesitant to release it because I was still frequently changing LZHAM's bitstream. The bitstream is now locked, so here it is in case anyone else out there finds it useful. (Updated: See my next post for a 7zip 9.38 beta compatible codec plugin.)

Important note: Please do *not* depend on this for anything important; it's only for testing purposes. There are very likely bugs in here, and the LZHAM codec id will be changing. I'll be releasing an official codec within the next week or two.

Here's the full source code and prebuilt Windows x86 binaries in the "bin" directory:
http://www.tenacioussoftware.com/7zipsrc_release_lzham_1_0.7z

Here are new bins fixing a decompressor reinit() problem in the original release (email me if you care about the source - the 7zip related portions are unchanged, I just merged over the latest version of LZHAM):
http://www.tenacioussoftware.com/7zipsrc_release3_lzham_1_0.7z

Note I haven't updated the makefiles yet, just the VS 2008 project files. This has only been tested by me, and I'm not an expert on the very large 7zip codebase (so buyer beware). I did most of this work several years ago, so this is undoubtedly an outdated version of 7zip.
I've only been able to compile the 32-bit version of 7zip so far, so the maximum dictionary size is limited to 64MB. (Important note: I'm not trying to fork or break 7zip in any way, this is *only* for testing and fooling around, and any archives it makes in LZHAM mode shouldn't be distributed.)

I'll be merging my changes over into the latest version of 7zip, probably next weekend. Also, LZHAM is statically linked in at the moment, I'll be changing this to load LZHAM as a DLL.

Here are some example command line usages (you can also select LZHAM in the GUI). The method level may range from 1-9, just like LZMA, and is internally converted to the LZHAM settings below. You can use the "-md=16M" or "-md=128K" option to override the dictionary size. The -mmt=on/off option controls threading, which is on by default. Finally, a new option, -mz=on, controls deterministic parsing, which defaults to *off*.

-mx=X:
7z method 1: LZHAM method 0, dict size 2^20
7z method 2: LZHAM method 1, dict size 2^21
7z methods 3-4: LZHAM method 2, dict size 2^22
7z methods 5-6: LZHAM method 3, dict size 2^23
7z methods 7-8: LZHAM method 4, dict size 2^26
7z method 9: LZHAM method 4 extreme parsing, dict size 2^26 (can be very slow!)

In practice, beware of using anything more than -mx=8 ("Maximum" in the GUI) unless you have a very powerful machine and some patience. Also, unless you're on a Core i7 or Xeon, LZHAM's compressor will seem very slow to you, because the compressor is totally hamstrung on single-core CPUs. (LZHAM is focused on decompression speed combined with very high ratios, so compression speed takes a back seat.)

Example usage:

E:\dev\lzham\7zipsrc\bin>7z -m0=LZHAM -mx=9 a temp *.dll

7-Zip 9.20 (LZHAM v1.0) Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
Scanning

Creating archive temp.7z

Compressing 7z.dll

Everything is Ok

E:\dev\lzham\7zipsrc\bin>7z -slt l temp2.7z

7-Zip 9.20 (LZHAM v1.0) Copyright (c) 1999-2010 Igor Pavlov 2010-11-18

Listing archive: temp2.7z

--
Path = temp2.7z
Type = 7z
Method = LZHAM BCJ
Solid = -
Blocks = 1
Physical Size = 487287
Headers Size = 122

----------
Path = 7z.dll
Size = 1268736
Packed Size = 487165
Modified = 2015-01-31 01:13:33
Attributes = ....A
CRC = 000E5D5E
Encrypted = -
Method = BCJ LZHAM:[1017030000]
Block = 0

7zip GUI:


Sunday, January 25, 2015

LZHAM v1.0 released on github

Here: https://github.com/richgel999/lzham_codec

I haven't merged over the XCode project yet, but it's fully compatible with OSX now. Also, LZHAM v1.0 is not backwards compatible with the previous alpha version.

Saturday, January 24, 2015

Windows 10: An Arrow Aimed Straight at Steam

I find this very interesting news, and if you're not paying attention you should:

Phoronix: Windows 10 To Be A Free Upgrade: What Linux Users Need To Know

PC World: Windows 10's new features: Cortana, a 'Spartan' browser, Xbox streaming, and more

Rock, Paper, Shotgun: Is Windows 10 Good For PC Gamers Or XBone Owners?

I think Microsoft's strategy here is surprisingly well thought out. Their execs finally figured out "to know your enemy you must become your enemy". Windows 10 is free, and it has a new state-of-the-art graphics API (DirectX 12) created by the best graphics specialists, real software engineers, and real testers in the business. It has awesome developer tools (Visual Studio) that actually work, with real CPU/GPU debugging and profiling support built in. All of your existing apps and games still work, and they're pulling out all the stops with the Halo/Xbox branding right down into the OS and browser.

They just need to make the Windows 10 App Store not suck: Continue to use their Xbox brand as a lever, carefully feed and nourish the ecosystem, listen to their customers, and undercut the living hell out of Steam. Steam itself started out as a total pile of crap, but they listened to their customers, fixed the problems over time, gave their customers good deals and shipped apps you couldn't get anywhere else at the right prices, and built and nourished the community. Microsoft can do all the same things, and perhaps they've finally figured this out.

It's now all down to execution: recovering from some obviously boneheaded moves (sometimes fueled by excessive Redmond Kool-Aid drinking, like the botched Windows 8 UI and missing Start Menu disasters), recognizing and quickly recovering from the inevitable new boneheaded moves, and sustaining the effort over the long term (something Microsoft has definitely not been very good at, except for their core brands). Competition is great.

Friday, January 23, 2015

LZHAM v1.0 is being tested on iOS/OSX/Linux/Win

Currently testing it on a few machines using random codec settings with ~3.5 million files. We also just switched over our title's bundle decompression step from LZMA to LZHAM, so the decompressor will be tested on many iOS devices too.

I've also tested the compression portion of the code on iOS, but I won't be able to get much coverage there before releasing it. Honestly the decompressor is much more interesting for mobile devices (that's really the whole point of LZHAM).

I'll be porting and testing LZHAM on Android within a week or so - should be easy by this point.

LZHAM v1.0 vs. LZMA decompression perf. on iPhone 6+

I borrowed a coworker's iPhone 6+ and reran my bundle compression benchmarking app. According to Wikipedia, it's a 1.4 GHz dual-core ARMv8-A.

LZHAM is 2.3x-9x faster on this device, unless the bundle's compressed size is < 1000 bytes. The comp size threshold where LZHAM is faster is lower than what I'm used to seeing, not sure exactly why yet.

1. Bundles sorted by LZHAM vs. LZMA decompression speedup (slowest on left):


2. Bundles sorted by LZMA compressed size (smallest on left), with relative decompression speedup in blue:


Wednesday, January 21, 2015

First LZHAM iOS stats with Unity asset bundle data

Got everything (both the compressor and decompressor) working. It was surprisingly easy; I only had one misaligned load to deal with in the compressor's match finder, caused by an #ifdef problem.

I combined 3 of our larger Unity asset bundles into a single .TAR file, and here are the current results on my iPhone 4 (800 MHz A4 CPU - 512MB RAM):

LZHAM Compressed from 15209984 to 4999552 bytes
LZHAM Comp time: 112.771710, BPS: 134874.110168
LZHAM Decomp time: 0.895846, BPS: 16978346.767638

For compression, I used a 16MB dictionary, highest compression (level 4) with normal parsing. Compression is slow, but LZHAM is designed for offline use so as long as it works at all I'm not sweating it for now.

Decompression is around 47 cycles per byte on these bundle files, which contain a variety of Unity asset data.

Now LZMA stats (level 9 16MB dictionary, default tuning options):

LZMA compressed from 15209984 to 4726211 bytes
LZMA Comp time: 41.805776, BPS: 363824.942238
LZMA Decomp time: 1.993880, BPS: 7628334.723455

LZMA decompression was ~105 cycles/byte.
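As a sanity check, the cycles-per-byte figures follow from the timings above and the A4's 800 MHz clock (rough arithmetic, ignoring frequency scaling and other real-world effects):

```python
CLOCK_HZ = 800e6    # iPhone 4 A4 clock, approximate
SIZE = 15209984     # uncompressed bytes

# cycles/byte = clock * decompression_time / uncompressed_size
lzham_cpb = CLOCK_HZ * 0.895846 / SIZE   # ~47 cycles/byte
lzma_cpb = CLOCK_HZ * 1.993880 / SIZE    # ~105 cycles/byte

# Relative decompression speedup is just the ratio of wall times.
speedup = 1.993880 / 0.895846            # ~2.2x
```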

So LZHAM decompresses this data 2.2x faster. Its ratio is slightly lower, but this can be somewhat compensated for by enabling LZHAM's better parser and compressing offline (with a multicore desktop CPU). This helps a little: 4960935 bytes. By using more frequent Huff table updates (level 3 vs. the default 8) and extreme parsing, I get 4942383 compressed bytes, but decompression is ~18% slower. I'm going to graph all of this data next.
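To put a number on "slightly lower", here's the size gap computed from the figures above:

```python
LZMA_SIZE = 4726211       # LZMA level 9, 16MB dictionary
LZHAM_SIZE = 4999552      # LZHAM level 4, normal parsing
LZHAM_EXTREME = 4942383   # extreme parsing + more frequent table updates

# LZHAM's output is ~5.8% larger than LZMA's with normal parsing...
gap_normal = LZHAM_SIZE / LZMA_SIZE - 1.0

# ...narrowing to ~4.6% larger with extreme parsing.
gap_extreme = LZHAM_EXTREME / LZMA_SIZE - 1.0
```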

For reference, my iPhone 4's CPU is ~13.6x slower for compression and ~8.5x slower for decompression vs. my Core i7 3.3 GHz desktop CPU (comparing absolute wall time, no multithreading, same settings and file data, etc.).

Update: Here are the testing results after compressing & decompressing all of our uncompressed asset bundles on my iPhone 4. I limited LZHAM's compressor to a dictionary size of 8MB, less frequent table updating (table update speed of 12 vs the default 8), and normal parsing, which limited its ratio a bit vs. running it on desktop.

LZHAM is slower on a few files totaling ~0.2% of the data (~320k out of 172MB); from there the speedup rises to between 1.8x-4.8x. (Note I'm currently regenerating this graph so LZHAM's dictionary size matches LZMA's.)

1. Red=Speedup, Blue=LZMA compressed size, sorted by compressed size.


2. Red: Speedup, Blue: LZMA_comp_size/LZHAM_comp_size, sorted by speedup.