Keith Thompson <[email protected]> writes:
Phil Carmody <[email protected]> writes:
I'm generating a couple of megs of (html) data per day and the data
really doesn't change that much from day to day. Is there an archiver
which will store the complete history of a file, taking advantage of
the knowledge of the previous contents of the file?
Any decent source control system (Git, CVS, RCS, etc.) should do the
job. CVS or RCS would do fairly well if the changes can be represented compactly as line-oriented diffs.
I'm giving git a go, and after the occasional git gc it does seem to be
neck and neck with zpaq -m4, but I think -m5 would clearly beat it. zpaq's not
able to make use of any similarities between the new and the old files;
when I change the fragment parameter for deduplication, compression gets
worse. Zipping up a set of hand-mangled patches (mangled to remove
anything I know shouldn't be necessary to reproduce each version) comes
out worse than what git does internally, so that is probably a dead
end. I'll keep both zpaq and git running daily until one appears to
be a clear winner.
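
For anyone wanting to try the line-oriented-diff idea on their own data, here is a minimal Python sketch. It only compares the compressed size of a full snapshot against the compressed size of a unified diff from the previous snapshot. The two "daily" files are synthetic stand-ins generated in the script, not the real html, and zlib is an assumption standing in for whatever compressor is actually used. It is not how git or zpaq store data internally.

```python
import difflib
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of data after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


def line_diff(old: str, new: str) -> str:
    """Unified diff from old to new, with no context lines."""
    return "".join(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile="old",
            tofile="new",
            n=0,
        )
    )


# Hypothetical stand-ins for two consecutive daily snapshots:
# 2000 table rows, of which exactly one changes on the second day.
rng = random.Random(0)
day1 = "".join(
    f"<tr><td>row {i}</td><td>{rng.random()}</td></tr>\n" for i in range(2000)
)
day2 = day1.replace("row 1000<", "row 1000 (updated)<")

full = compressed_size(day2.encode())
delta = compressed_size(line_diff(day1, day2).encode())
print(f"full snapshot: {full} bytes, line diff: {delta} bytes")
```

On real data the gap depends on how many lines change per day; if the html is regenerated with reordered or rewrapped lines, a line-based diff can end up close to the size of the full file.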
Cheers,
Phil
--
We are no longer hunters and nomads. No longer awed and frightened, as we have gained some understanding of the world in which we live. As such, we can cast aside childish remnants from the dawn of our civilization.
-- NotSanguine on SoylentNews, after Eugen Weber in /The Western Tradition/