On Tuesday 03 May 2005 04:06, Nicolas Pitre wrote:
> On Mon, 2 May 2005, Linus Torvalds wrote:
> > If you do something like this, you want such a delta-blob to be named by
> > the sha1 of the result, so that things that refer to it can transparently
> > see either the original blob _or_ the "deltified" one, and will never
> > care.
>
> Yep, that's what I've done last weekend (and just made it actually
> work since people are getting interested).
>

My first run didn't go well: diff_delta generates an invalid delta when
passed a buffer of length 0. I really should not have been calling it
this way, but it should do a quick check and return an error instead of
something invalid ;)

I did two additional runs. In the first, I fixed the delta chain length
at 1, as in the zdelta patch. In this mode, if it tried to diff against
a delta, it would find the delta's parent and diff against that instead.
Even though zdelta had the same speed for applying patches as xdiff(1),
zdelta used significantly more cpu (53m vs 40m).

The next run used the patch I've attached below, which allows chains of
up to 16 deltas in length.

                 git     zdelta   xdiff(1)  xdiff(16)
apply            150m    117m     117m      104m
checkout         4m30s   3m41s    4m43s     7m11s
checkout (hot)   56s     12s      14s       16s
space usage      2.5G    1G       1.2G      800M

The longer delta chains trigger more random io on checkout, negating the
speed improvements from the packed item patch. The hot cache times show
that xdiff isn't using a huge amount of cpu to patch things in, so
there's room for smarter packing and for regenerating deltas in order to
keep checkout times low.

This patch still doesn't pack commits and trees in with the blob files,
and it doesn't delta trees, so I expect better space/speed numbers in
later revs. I won't be able to work on this until next week, but here's
my plan:

1) update to current git. My patch is from before the safe file
   generation changes.

2) change update-cache and write-tree so that packing/deltas are off by
   default. Add --packed and --delta options to both.

3) create a git-pack tool that can pack/unpack existing changesets,
   trees and files, optionally adding/removing deltas.

My current code should preserve the delta object header used by Nicolas,
and it removes all knowledge of deltas from the packed item headers.
This is not quite as efficient, but the resulting code is much cleaner.
I haven't tried, but it should be able to read a file created by his
mkdelta.c.

-chris
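
To make the chain-length limit from the two runs concrete, here is a minimal sketch of how a delta base might be chosen under a cap of 1 or 16. It is only an illustration under stated assumptions: the in-memory `store`, the `Entry` type, and the functions `depth`, `pick_base` and `add_object` are hypothetical names invented for this example and are not taken from the patch, from mkdelta.c, or from git itself. It also folds in the guard mentioned above, so that an empty buffer is never handed to the delta code.

```python
# Hypothetical in-memory model of an object store. All names here are
# illustrative assumptions and do not come from the patch or from git.
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    # None for an object stored in full, else the name of the delta's parent.
    base: Optional[str]


def depth(store: Dict[str, Entry], name: str) -> int:
    """Number of deltas that must be applied to rebuild `name`."""
    hops = 0
    while store[name].base is not None:
        name = store[name].base
        hops += 1
    return hops


def pick_base(store: Dict[str, Entry], candidate: str,
              max_depth: int) -> Optional[str]:
    """Walk from `candidate` toward its parents until a delta made against
    the result would sit in a chain no longer than `max_depth`."""
    if max_depth < 1:
        return None  # deltas disabled: the caller stores the object in full
    base = candidate
    while depth(store, base) >= max_depth:
        base = store[base].base  # too deep: fall back to the delta's parent
    return base


def add_object(store: Dict[str, Entry], name: str, data: bytes,
               candidate: Optional[str], max_depth: int) -> Optional[str]:
    """Record `name` and return the base it was deltified against, if any."""
    # Never try to delta an empty buffer; store it whole instead.
    use_delta = bool(data) and candidate is not None
    base = pick_base(store, candidate, max_depth) if use_delta else None
    store[name] = Entry(base)
    return base
```

With `max_depth=1`, every revision of a file ends up as a delta against the one full copy, which matches the "find the delta's parent and diff against that instead" behaviour. With `max_depth=16`, each revision deltas against its predecessor until the chain reaches 16 links, after which the base slides back toward the start of the chain. Depth is recomputed on every call to keep the sketch short; a real implementation would probably cache it.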