Re: [kernel.org users] Re: auto-packing on kernel.org? please?

From: Junio C Hamano <junkio@cox.net>
Date: 2005-10-17 18:21:14
Nick Hengeveld <nickh@reactrix.com> writes:

> To get a complete list of objects we do not have yet, fetch will need
> to walk all the trees first and then make another pass to process
> all the missing objects.

Notice I did not say "we do not have yet but we will need" -- I
just said "we do not have yet".

The assumption, which is the property the suggested packing
strategy has, is that older objects that are needed to complete
the history leading to the current tip are packed in those
n-month/n-week packs, so if we do not have them we would likely
be needing them, although we might not have walked that far back
in history yet.

The previous "packing strategy" picture was certainly too
simplified.  Obviously we would not want to repack everything
every week for different periods all the way back -- we would
want to leave old huge packs untouched to help the server side
(and mirroring), so instead of having a single "pack
optimization boundary", we would probably need some staggering
as well for archived material.

This is a revised example.

1yr -----
9mo      --------
6mo              ----------
3mo                        ------------------
1mo                              ------------  
2wk                                  --------
1wk                                      ----

We keep track of "the current heads and tags" for each week.
Every week, we can do something like this:

 - rotate the record, and create a new one:
   mv .save/wk11 .save/wk12
   mv .save/wk10 .save/wk11
   mv .save/wk9 .save/wk10
   ...
   mv .save/wk0 .save/wk1
   find .git/refs -type f -print | xargs cat >.save/wk0
 
 - prepare a pack to allow a single pack fetch to bring a
   repository that had everything reachable from wk$N refs
   up-to-date to the current, for selected recent weeks (say N=1,
   2, 4, 12):

   for N in 1 2 4 12
   do
       name=$(git-rev-list --objects \
                 $(sed -e 's/^/^/' .save/wk$N) \
                 $(cat .save/wk0) |
              git-pack-objects pack-) &&
       mv pack-$name.* .git/objects/pack/.
   done

   Then remove the pack files that we created this way last week
   from the repository.  (If the repository did not have any
   activity during the last week, we would have created the same
   set of packs again; make sure we do not remove those.)

 - except that we keep the longest-period pack (i.e. N=12 in
   this example) every N weeks (that's how the 1yr, 9mo and 6mo
   packs in the picture are kept).
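
The rotation and the "do not remove a pack we just recreated" check
above could be sketched roughly as below.  This is only an
illustration: rotate_records and prune_old_packs are hypothetical
helper names, and the two files listing last week's and this week's
pack names (one name per line) are an assumption about how the
script would remember what it created.  It does not handle the
"keep the longest-period pack every N weeks" exception.

```shell
# Sketch only; helper names and the pack-name list files are assumptions.

# Shift $save/wk0 .. $save/wk(max-1) up by one slot, oldest first,
# so that wk0 is free for this week's record.
rotate_records () {
    save=$1 max=$2
    n=$max
    while [ "$n" -gt 0 ]
    do
        prev=$((n - 1))
        if [ -f "$save/wk$prev" ]
        then
            mv "$save/wk$prev" "$save/wk$n"
        fi
        n=$prev
    done
}

# Remove last week's packs, except those whose name was produced
# again this week (no activity means the same pack name).
# $1: pack directory, $2: last week's names, $3: this week's names.
prune_old_packs () {
    packdir=$1 old=$2 new=$3
    while read -r name
    do
        if ! grep -qxF "$name" "$new"
        then
            rm -f "$packdir/pack-$name.pack" "$packdir/pack-$name.idx"
        fi
    done <"$old"
}
```

With the layout used in this message, the calls might look like
"rotate_records .save 12" before writing the new .save/wk0, and
"prune_old_packs .git/objects/pack old-names new-names" after the
for loop has moved the new packs into place.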

This way, really old stuff (say, older than 3mo) will stay
intact and will not be repacked, so people reasonably up-to-date
(within 12 weeks in the example) need to fetch only one pack
(and unpacked objects since the last pack optimization), but
people without the ancient history need to go further back.
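
To illustrate which pack a given fetcher would want under this
scheme, here is a small sketch.  pick_pack_period is a hypothetical
helper, and the set 1 2 4 12 is just the example set from above; it
assumes we somehow know how many weeks ago the fetcher was last
up-to-date.

```shell
# Sketch only: print the smallest N in the example set that covers a
# repository last brought up-to-date $1 weeks ago.  Print nothing and
# return 1 if it is older than the longest period, in which case the
# older archived packs are needed as well.
pick_pack_period () {
    age=$1
    for n in 1 2 4 12
    do
        if [ "$age" -le "$n" ]
        then
            echo "$n"
            return 0
        fi
    done
    return 1
}
```

For example, somebody who last fetched three weeks ago would take
the N=4 pack, while somebody thirteen weeks behind falls outside
the recent set and has to go back to the archived packs.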


Received on Mon Oct 17 18:21:57 2005