Sunday, February 1, 2015

LZHAM 1.0 integrated into 7zip command line and GUI

I integrated the LZHAM codec into the awesome open source 7zip archiver a few years ago for testing purposes, but I was hesitant to release it because I was still frequently changing LZHAM's bitstream. The bitstream is now locked, so here it is in case anyone else out there finds it useful. (Updated: See my next post for a 7zip 9.38 beta compatible codec plugin.)

Important note: Please do *not* depend on this for anything important; it is only for testing purposes. There are very likely bugs in here, and the LZHAM codec id will be changing. I'll be releasing an official codec within the next week or two.

Here's the full source code and prebuilt Windows x86 binaries in the "bin" directory:

Here are new bins fixing a decompressor reinit() problem in the original release (email me if you care about the source - the 7zip related portions are unchanged, I just merged over the latest version of LZHAM):

Note I haven't updated the makefiles yet, just the VS 2008 project files. This has only been tested by me, and I'm not an expert on the very large 7zip codebase (so buyer beware). I did most of this work several years ago, so this is undoubtedly an outdated version of 7zip.
I've only been able to compile the 32-bit version of 7zip so far, so the max. dictionary size is limited to 64MB. (Important note: I'm not trying to fork or break 7zip in any way, this is *only* for testing and fooling around and any archives it makes in LZHAM mode shouldn't be distributed.)

I'll be merging my changes over into the latest version of 7zip, probably next weekend. Also, LZHAM is statically linked in at the moment; I'll be changing this to load LZHAM as a DLL.

Here are some example command line usages (you can also select LZHAM in the GUI). The 7z method (the -mx level) may range from 1-9, just like LZMA, and internally it's converted to the LZHAM settings listed below. You can use the -md option (e.g. "-md=16M" or "-md=128K") to override the dictionary size. The -mmt option controls threading, which is on by default (i.e. -mmt=on or -mmt=off), and a new option, -mz=on, enables deterministic parsing (which defaults to *off*).

7z method 1: LZHAM method 0, dict size 2^20
7z method 2: LZHAM method 1, dict size 2^21
7z methods 3-4: LZHAM method 2, dict size 2^22
7z methods 5-6: LZHAM method 3, dict size 2^23
7z methods 7-8: LZHAM method 4, dict size 2^26
7z method 9: LZHAM method 4 extreme parsing, dict size 2^26 (can be very slow!)
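
The mapping in the list above can be written as a small lookup table. This is only an illustrative Python sketch: the function name `lzham_settings_for_level` and the tuple layout are hypothetical and not taken from the plugin source; only the numbers come from the list.

```python
# Hypothetical sketch of the 7z method -> LZHAM settings mapping listed above.
# The names here are illustrative; only the numeric values come from the post.

# 7z method: (LZHAM method, log2 of dictionary size in bytes, extreme parsing)
LEVEL_TABLE = {
    1: (0, 20, False),
    2: (1, 21, False),
    3: (2, 22, False),
    4: (2, 22, False),
    5: (3, 23, False),
    6: (3, 23, False),
    7: (4, 26, False),
    8: (4, 26, False),
    9: (4, 26, True),   # extreme parsing: can be very slow
}


def lzham_settings_for_level(mx):
    """Return (lzham_method, dict_size_log2, extreme_parsing) for a 7z method 1-9."""
    if mx not in LEVEL_TABLE:
        raise ValueError("7z method must be an integer from 1 to 9")
    return LEVEL_TABLE[mx]


if __name__ == "__main__":
    for level in range(1, 10):
        method, log2_size, extreme = lzham_settings_for_level(level)
        size_mb = (1 << log2_size) // (1024 * 1024)
        print(f"-mx={level}: LZHAM method {method}, {size_mb}MB dict, extreme={extreme}")
```

Note that 2^26 bytes is 64MB, which matches the maximum dictionary size of the 32-bit build mentioned earlier.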

In practice, beware using anything more than -mx=8 ("Maximum" in the GUI) unless you have a very powerful machine and some patience. Also, unless you're on a Core i7 or Xeon, LZHAM's compressor will seem very slow to you, because the compressor is totally hamstrung on single core CPUs. (LZHAM is focused on decompression speed combined with very high ratios, so compression speed totally takes a back seat.)

Example usage:

E:\dev\lzham\7zipsrc\bin>7z -m0=LZHAM -mx=9 a temp *.dll

7-Zip 9.20 (LZHAM v1.0) Copyright (c) 1999-2010 Igor Pavlov 2010-11-18

Creating archive temp.7z

Compressing 7z.dll

Everything is Ok

E:\dev\lzham\7zipsrc\bin>7z -slt l temp2.7z

7-Zip 9.20 (LZHAM v1.0) Copyright (c) 1999-2010 Igor Pavlov 2010-11-18

Listing archive: temp2.7z

Path = temp2.7z
Type = 7z
Method = LZHAM BCJ
Solid = -
Blocks = 1
Physical Size = 487287
Headers Size = 122

Path = 7z.dll
Size = 1268736
Packed Size = 487165
Modified = 2015-01-31 01:13:33
Attributes = ....A
CRC = 000E5D5E
Encrypted = -
Method = BCJ LZHAM:[1017030000]
Block = 0
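
If you want to script test runs, the switches described above can be assembled programmatically. The following Python is a rough sketch under stated assumptions: the helper name `build_lzham_args` and its parameters are hypothetical, it only builds the argument list (it does not run 7z), and it uses only the switches mentioned in this post (-m0=LZHAM, -mx, -md, -mmt, -mz=on).

```python
# Hypothetical helper that assembles a 7z command line for LZHAM test runs.
# It only uses the switches mentioned in this post and does not execute anything.

def build_lzham_args(archive, files, level=None, dict_size=None,
                     threading=None, deterministic=False):
    """Build an argument list like: 7z -m0=LZHAM -mx=9 a temp *.dll"""
    args = ["7z", "-m0=LZHAM"]
    if level is not None:
        if level not in range(1, 10):
            raise ValueError("level must be an integer from 1 to 9")
        args.append(f"-mx={level}")
    if dict_size is not None:
        # Passed through unchanged, e.g. "16M" or "128K"
        args.append(f"-md={dict_size}")
    if threading is not None:
        # Threading is on by default, so only emit the switch when asked to
        args.append("-mmt=on" if threading else "-mmt=off")
    if deterministic:
        # Deterministic parsing defaults to off, so only emit -mz=on
        args.append("-mz=on")
    args += ["a", archive]
    args += list(files)
    return args


if __name__ == "__main__":
    print(" ".join(build_lzham_args("temp", ["*.dll"], level=9)))
```

The resulting list mirrors the switch order used in the example above and could be handed to something like `subprocess.run` on a machine where this test build of 7z is on the path.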

7zip GUI:
