Re: git-fetching from a big repository is slow

From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Date: 2006-12-15 10:15:25

Hi,

On Thu, 14 Dec 2006, Shawn Pearce wrote:

> Geert Bosch <bosch@adacore.com> wrote:
> > Such special magic based on filenames is always a bad idea. Tomorrow
> > somebody comes with .zip files (oh, and of course .ZIP), then it's
> > .jpg's other compressed content. In the end git will be doing lots of
> > magic and still perform badly on unknown compressed content.
> >
> > There is a very simple way of detecting compressed files: just look at
> > the size of the compressed blob and compare against the size of the
> > expanded blob. If the compressed blob has a non-trivial size which is
> > close to the expanded size, assume the file is not interesting as
> > source or target for deltas.
> > 
> > Example:
> >    if (compressed_size > expanded_size / 4 * 3 + 1024) {
> >      /* don't try to deltify if blob doesn't compress well */
> >      return ...;
> >    }
> 
> And yet I get good delta compression on a number of ZIP formatted files 
> which don't get good additional zlib compression (<3%). Doing the above 
> would cause those packfiles to explode to about 10x their current size.

A pity. Geert's proposition sounded good to me.
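
To make the disagreement concrete, here is Geert's test written out as a 
standalone function. This is only a sketch: the function name is made up 
for illustration and nothing like it exists in git; only the threshold 
expression is taken from his example.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a git function): returns nonzero when a blob
 * compresses so poorly that Geert's heuristic would skip it as a delta
 * source or target. The threshold is copied from his example.
 */
int compresses_poorly(size_t compressed_size, size_t expanded_size)
{
	/* The "+ 1024" keeps small blobs eligible regardless of ratio. */
	return compressed_size > expanded_size / 4 * 3 + 1024;
}
```

A 1,000,000-byte ZIP that zlib shrinks by only 3% gives 
compresses_poorly(970000, 1000000) != 0, since 970000 > 751024. That is 
exactly the kind of file Shawn says still deltifies well, so this test 
would exclude it.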

However, there's got to be a way to cut short the search for a delta 
base/deltification when a certain (maybe even configurable) amount of time 
has been spent on it.
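
Roughly what I have in mind, as a sketch only: every name below is 
hypothetical and none of this is existing git code. It measures CPU time 
with the standard clock() and stops trying further candidate bases once 
the budget is used up, keeping the best one seen so far.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: score a candidate base; higher is better. */
typedef long (*score_fn)(size_t candidate, void *ctx);

/* Example scorer for illustration: ctx points to an array of scores. */
long score_from_table(size_t candidate, void *ctx)
{
	const long *table = ctx;
	return table[candidate];
}

/*
 * Try candidates 0..n-1 in order, but give up once budget_sec seconds of
 * CPU time have been spent. Returns the index of the best candidate seen
 * so far, or -1 if the budget ran out before any candidate was scored.
 */
long find_base_with_budget(size_t n, score_fn score, void *ctx,
			   double budget_sec)
{
	clock_t start = clock();
	long best = -1, best_score = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		double spent = (double)(clock() - start) / CLOCKS_PER_SEC;
		long s;

		if (spent >= budget_sec)
			break;	/* budget exhausted: cut the search short */
		s = score(i, ctx);
		if (best < 0 || s > best_score) {
			best = (long)i;
			best_score = s;
		}
	}
	return best;
}
```

The budget_sec parameter is where a configurable limit would plug in. A 
budget of 0 disables the search entirely, and a generous one behaves like 
an unbounded search.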

Ciao,
Dscho

Received on Fri Dec 15 10:16:41 2006
