Sunday, October 13, 2013

miniz.c: Finally added zip64 support, other fixes/improvements incoming

I finally needed Zip64 support for the Linux OpenGL debugger I've been working on at work. GL traces and state snapshots can be huge (~4-10GB of data for 3-4 minute game runs is not uncommon), and can consist of tons of binary "blob" files for VB's/IB's/shaders/etc. I looked at some other C archive libraries (libzip, minizip, etc.) but they were either ugly/huge messes with a zillion C/H files, or they didn't fully abstract their file I/O, or they didn't support in-memory archives for both reading/writing, or their licenses sucked, or they weren't thread safe (!), or I just didn't trust them, etc.

So screw it, I'll bite the bullet and do this myself. It's certainly possible I missed a really good library out there. I prefer C for this kind of stuff because most C++ libs I find in the wild use features I can't live with for various reasons, such as C++ exceptions, stl, heavy use of heap memory, Boost, or have tons of other lib dependencies, etc.

The original ancient zip file format was OK and kinda elegant for what it was, and the code to parse and write the original headers was nice and easy. But once you add zip64 it becomes an ugly mess full of conditionals, and copying zip header/archive data from one zip to another can be a big pain because you can't just blindly copy the zip64 extended data fields from the source to destination zip. (You've got to kill the old one from the extended data block and add a new one, etc.) Zip64 is now "done", and I've been running a bunch of automated testing on the new code paths, but I'm worried I've bitten off more than I can chew given the very limited time I have to work on this feature for the debugger.
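To make the "kill the old one" step concrete, here is a minimal sketch of stripping the zip64 extended information record out of an extra data block while copying everything else. This is not the code in miniz; the function and macro names are made up for illustration. The only things taken from the ZIP format itself are the record layout (a 2-byte little-endian header ID, a 2-byte little-endian data size, then the data) and the zip64 header ID of 0x0001.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header ID the ZIP format assigns to the zip64 extended information record. */
#define ZIP64_EXTRA_FIELD_ID 0x0001u

/* Read a little-endian 16-bit value. */
static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: copy an extra data block from src to dst, dropping
 * every zip64 extended information record along the way.
 * dst must have room for at least src_len bytes.
 * Returns the number of bytes written, or (size_t)-1 if src is malformed.
 */
static size_t strip_zip64_extra(const uint8_t *src, size_t src_len, uint8_t *dst) {
    size_t in = 0, out = 0;
    assert(src != NULL && dst != NULL);
    while (in < src_len) {
        uint16_t id, size;
        size_t record_len;
        if (src_len - in < 4)
            return (size_t)-1;            /* truncated record header */
        id = read_le16(src + in);
        size = read_le16(src + in + 2);
        record_len = 4u + size;
        if (src_len - in < record_len)
            return (size_t)-1;            /* truncated record data */
        if (id != ZIP64_EXTRA_FIELD_ID) {
            memcpy(dst + out, src + in, record_len);
            out += record_len;
        }
        in += record_len;
    }
    return out;
}
```

After stripping, the caller would append a freshly built zip64 record whose sizes and offsets match the destination archive, since the local header offset in particular almost never survives a copy unchanged.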

I've also renamed a lot of the zip "reader" API's so they can be used in both reading and writing mode. There's no reason why you can't locate files in the central directory, or get file stats while writing, for example, because the entire zip central directory is kept in memory during writing.

I've added full error codes to all zip archive handling functions. You can get the last error, clear the last error, peek at the last error, etc. I went through the entire thing and made sure the errors are set appropriately, so now you can get more info than just MZ_FALSE when something goes wrong. This change alone took several hours.
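As a rough illustration of the get/peek/clear pattern, here is a self-contained sketch. All of the names and error codes are hypothetical and are not the actual miniz API. It also assumes that "get" returns the stored code and resets it, while "peek" leaves it alone, which is one reasonable reading of the split described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes; the real list may look quite different. */
typedef enum {
    ARCHIVE_NO_ERROR = 0,
    ARCHIVE_FILE_NOT_FOUND,
    ARCHIVE_TOO_MANY_FILES
} archive_error;

/* Hypothetical archive object; only the error state is modeled here. */
typedef struct {
    archive_error last_error;
} archive;

/* Store a new error code and return the one it replaced. */
static archive_error archive_set_last_error(archive *a, archive_error err) {
    archive_error prev;
    assert(a != NULL);
    prev = a->last_error;
    a->last_error = err;
    return prev;
}

/* Look at the last error without changing it. */
static archive_error archive_peek_last_error(const archive *a) {
    assert(a != NULL);
    return a->last_error;
}

/* Return the last error and reset the stored code (assumed semantics). */
static archive_error archive_get_last_error(archive *a) {
    return archive_set_last_error(a, ARCHIVE_NO_ERROR);
}

/* Reset the stored code without reporting it. */
static void archive_clear_last_error(archive *a) {
    (void)archive_set_last_error(a, ARCHIVE_NO_ERROR);
}

/* Toy operation that always fails, to show how a failure records its reason. */
static int archive_locate_file(archive *a, const char *name) {
    (void)name;
    archive_set_last_error(a, ARCHIVE_FILE_NOT_FOUND);
    return 0; /* the old-style "false" result; the reason is in last_error */
}
```

The point of keeping the code on the archive object rather than in a global is that two archives used from two threads never stomp on each other's error state.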

In zip64 mode I only support a total of UINT_MAX (2^32-1) files in the central dir, and central dirs are limited to a total of UINT_MAX bytes. These are huge increases from the previous limits, so this should be fine. I'm not writing a hard disk backup utility here after all, so I'm not going to support archives that big right now.
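The limit check itself is tiny. Here is a sketch of what it could look like when adding one more central directory record; the function name and parameters are invented for this example and are not taken from miniz.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical check: can one more central directory record of entry_size
 * bytes be added without exceeding the limits described above
 * (at most UINT32_MAX files, at most UINT32_MAX bytes of central directory)?
 * Returns 1 if the entry fits, 0 otherwise.
 */
static int central_dir_can_add(uint64_t num_files, uint64_t dir_size, uint64_t entry_size) {
    /* The current state is expected to already be within the limits. */
    assert(num_files <= UINT32_MAX && dir_size <= UINT32_MAX);
    if (num_files >= UINT32_MAX)
        return 0;                          /* file count would exceed the limit */
    if (entry_size > UINT32_MAX - dir_size)
        return 0;                          /* directory size would exceed the limit */
    return 1;
}
```

The size comparison is written as a subtraction on the limit side so it cannot wrap around, even if a caller passes an absurd entry_size.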

Bugwise, the only major bug in the current public release (miniz.c v1.14) that really worries me is the MZ_ZIP_FLAG_DO_NOT_SORT_CENTRAL_DIRECTORY flag (Issue #11 on the Google Code issue tracker). I doubt anybody really used this flag, so I'm not worried about direct use, but a few internal API's used it to speed up loading a little and they could fail. It's bad enough that I'm going to patch v1.14 to fix this tonight.

Also, it's definitely time to split up miniz.c into at least two source files. One for Deflate/Inflate/zlib API emulation/crc32/etc. The other file will be for ZIP archive handling and be (of course) optional.

I'll also try to merge in all the fixes/improvements people have either placed on GitHub, on miniz's Google Code bug tracker, or have sent to me privately, if time permits.

Let me know if you are dying to try this version out and I can send you a private copy for testing.

9 comments:

  1. For some reason, when you view miniz.c in Google Code, select all, copy, and paste the contents of miniz.c into Visual Studio (using 2013), it reformats the file and breaks all the macros by adding extra lines that don't end in \
    Downloading the file from Google Code instead of trying to copy and paste it worked, though.

  2. Try clicking on "view raw file", which should take you here:
    http://miniz.googlecode.com/svn/trunk/miniz.c
    Or, you can get it via SVN (always the very latest version) or the archive. The archive version occasionally lags a little (minor bugfixes go to SVN first).

  3. Hi Rich,

    Any plans to include Deflate64?

  4. This comment has been removed by a blog administrator.

  5. I've managed to use Miniz.c to compress and decompress a binary image on my PC. How can I get this code to work on the embedded system where I have about 5MB of RAM?

  7. Hi, sir:
    Is it possible to add miniz.c to a small 8-bit microcontroller, the PIC18F46K80 (64K flash, 3K RAM), compiling the C source code with the Hi-Tech compiler? If so, how do I do it?

    A reply would be very much appreciated.

    Chung,
    20150526
