Re: Change set based shallow clone

From: Andreas Ericsson <ae@op5.se>
Date: 2006-09-08 18:48:39

Jon Smirl wrote:
> On 9/7/06, Jakub Narebski <jnareb@gmail.com> wrote:
>> I don't understand. Git is _not_ patchset based (like GNU Arch, or
> 
> I meant change set to refer to a commit plus trees plus blobs that
> make it up. These may be present in full or delta form.
> 
>> Mercurial, or CVS). It is snapshot based. So if you want to download
>> "skip", the local part of the downloader only needs to make
>> appropriate grafts, like below
>>
>>
>>  *--*--*--*--*--*--*--*--*--*--*--HEAD    (server)
>>
>>  *--*--*...........*--*--*--*--*--HEAD    (shallow/sparse clone)
>>
>> But the part you were talking about is _easy_ part; the hard part is
>> merges including merging branch which was split off the trunk before
>> cutoff-point, history rewriting (c.f. 'pu' branch, and rebases), etc.
> 
> Does an average user do these things? The shallow clone is there to
> address the casual user who gags at a five hour download to get an
> initial checkout of Mozilla when they want to make a five line change or
> just browse the source for a few minutes.
> 

A better idea would be to allow those users to download a gzipped 
tarball of a pre-grafted repository. It shouldn't be terribly difficult 
to set up an update-hook that creates the pre-grafted repository for you 
whenever a tag (or some such) is created in the repo that everybody 
does their initial clone from.
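
A rough sketch of the tag-detection part of such a hook, to make the idea
concrete. It relies only on the fact that git calls the update hook with
the ref name, the old object name and the new object name, and that the
old name is all zeros when the ref is being created. The function names
is_new_tag and on_update are made up for illustration, and the actual
rebuild of the pre-grafted tarball is left as a placeholder message. Since
the update hook runs before the ref is written, a real setup might prefer
to do the rebuild from a later hook.

```shell
#!/bin/sh
# Sketch only. git invokes the update hook as: update <refname> <oldrev> <newrev>

# Succeeds when the push creates a new tag: the ref is under refs/tags/
# and the old object name consists solely of zeros.
is_new_tag() {
    case "$1" in
        refs/tags/*) ;;
        *) return 1 ;;
    esac
    case "$2" in
        "") return 1 ;;        # no old value given
        *[!0]*) return 1 ;;    # contains a non-zero digit: ref already existed
    esac
    return 0
}

# Hypothetical entry point; a real hook would end with: on_update "$@"
on_update() {
    if is_new_tag "$1" "$2"; then
        # Placeholder: this is where the pre-grafted repository would be
        # rebuilt and packed into a tarball.
        echo "would rebuild pre-grafted tarball for $1 at $3"
    fi
}
```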

As I understand it (although I've admittedly only followed the git 
mailing list sporadically for the past three or so months), grafts 
already work as intended, and the users can then fetch into their 
grafted repo to get a bare minimum of objects.
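
For reference, a minimal sketch of how the cutoff graft could be recorded.
Each line of the grafts file (info/grafts inside the git directory) names
a commit followed by the parents it should appear to have, so a line
holding only the commit makes history appear to start there. The helper
name write_cutoff_graft is invented for this example, and newer git
versions deprecate the grafts file in favour of "git replace --graft", so
take it as an illustration of the mechanism rather than a recommendation.

```shell
# write_cutoff_graft GIT_DIR COMMIT
# Appends a graft line that lists COMMIT with no parents, so that
# history in that repository appears to begin at COMMIT.
write_cutoff_graft() {
    gitdir=$1
    commit=$2
    mkdir -p "$gitdir/info" &&
    printf '%s\n' "$commit" >> "$gitdir/info/grafts"
}
```

It would be called with the repository's git directory and the full
object name of the cutoff commit, e.g. write_cutoff_graft .git <commit-id>.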

> 
> There would also be a command to bring down all of the objects to
> fully populate a sparse tree. You could do the shallow clone to begin
> with and then do the full tree populate overnight or in the
> background.
> 

With the pre-grafted history this would work as follows:

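$ mkdir pregraft
$ wget http://pre-grafts.mozilla.org/pregrafted.git.tgz
$ cd pregraft
$ tar xvzf ../pregrafted.git.tgz
$ cd ..
# start the full-history clone in the background, into ./mozilla-repo
$ git clone mozilla-repo-url mozilla-repo > /dev/null 2>&1 &
$ cd pregraft
# work, work, work; meanwhile the full clone completes
$ cd ../mozilla-repo
# bring the work done in the grafted repo into the full-history one
$ git pull ../pregraft master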

or something similar.

iow, you get the small repo quickly and can start hacking while the 
full-history clone is downloading. If I understand grafts correctly, you 
could then merge in your changes made in the grafted repo to the one 
with full history.

> Maybe the answer is to build a shallow clone tool for casual use, and
> then if you try to run anything too complex on it git just tells you
> that you have to download the entire tree.
> 

I believe all tools that work with history understand grafts already, 
and if so they should provide sane messages when the user attempts to 
access history beyond the grafts. I might have missed or misunderstood 
something, but this seems to me like a simple solution to a complex problem.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
Received on Fri Sep 08 18:48:55 2006