Re: RFC: New diff-delta.c implementation

From: Junio C Hamano <junkio@cox.net>
Date: 2006-04-23 03:29:55
Nicolas Pitre <nico@cam.org> writes:

> Well, actually I was measuring a 10% speed improvement with a quick and 
> naive (not memory efficient) approach for pack-objects with the current 
> algorithm.
>...
> The idea to avoid memory pressure is to reverse the window processing 
> such that the object to delta against is constant for the entire window 
> instead of the current logic where the target object is constant.  This 
> way there would be only one index in memory at all time.

You are right.  The former led to the latter, still unexplored, idea.

I expect to be offline most of the day today, and have other
things I can work on for the next few days anyway, so if you or
somebody else has the inclination and energy to reverse the
delta window, I would appreciate that.

Maybe the calling convention of diff-delta.c would become
something like this?

struct delta_index; /* opaque to the caller; implementation
                     * defines what's in it.
                     */

/* returns a newly allocated struct delta_index.
 * input "buf" pointer can be stored in the struct, but "buf"
 * does not belong to diff-delta module (i.e. borrowed reference).
 */
struct delta_index *delta_index(
  void *buf,			/* input: from buffer */
  unsigned long size		/* input: from size */
);

/* ... so free the structure and its internal data, but
 * do not free the borrowed reference!
 */
void free_delta_index(struct delta_index *);

/* Take "from", an already preprocessed delta_index for the
 * traditional from_buffer/from_size, and to_buf/to_size, and
 * produce delta in newly allocated buffer (caller should
 * free() when it is done), and return the result size in
 * *delta_size.  Stop early if the result would exceed max_size.
 */
void *diff_delta(
    struct delta_index *from,	/* input: prepared by delta_index() */
    void *to_buf,		/* input: destination buffer */
    unsigned long to_size,	/* input: destination size */
    unsigned long *delta_size,	/* output: result size */
    unsigned long max_size	/* input: do not waste cycles if
                                   the result cannot be kept
                                   within this size */
);

and a caller would use it like this:

	struct unpacked *s, *d;
	unsigned long sz, max_size;

	/* precompute the index */
	struct delta_index *src = delta_index(s->data, s->entry->size);

	/* do the delta */
	void *delta_buf = diff_delta(src, d->data, d->entry->size,
				     &sz, max_size);
	/* do useful thing here on delta_buf and sz */
	free(delta_buf);

	/* the caller can reuse *src with other *d,
	 * but when it is done...
	 */
	free_delta_index(src);
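
To make the "one index for the entire window" part concrete, here is a
rough, self-contained sketch of what the reversed loop might look like.
Everything inside the three functions is a toy stand-in and not the real
diff-delta.c: the "index" just remembers the borrowed buffer, and the
"delta" is only a common-prefix length followed by the literal tail of
the target.  try_window() is a hypothetical helper name, and returning
NULL when the result would exceed max_size is an assumption about how
the early stop would be reported.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in: a real index would hold hash tables built over buf. */
struct delta_index {
	const void *buf;	/* borrowed reference, not owned */
	unsigned long size;
};

struct delta_index *delta_index(void *buf, unsigned long size)
{
	struct delta_index *idx = malloc(sizeof(*idx));
	if (!idx)
		return NULL;
	idx->buf = buf;
	idx->size = size;
	return idx;
}

void free_delta_index(struct delta_index *idx)
{
	free(idx);		/* idx->buf is borrowed; leave it alone */
}

/* Toy "delta": common prefix length, then the rest of the target.
 * Returns NULL (assumption) if the result would exceed max_size.
 */
void *diff_delta(struct delta_index *from, void *to_buf,
		 unsigned long to_size, unsigned long *delta_size,
		 unsigned long max_size)
{
	const unsigned char *a = from->buf;
	const unsigned char *b = to_buf;
	unsigned long prefix = 0, out_size;
	unsigned char *out;

	while (prefix < from->size && prefix < to_size &&
	       a[prefix] == b[prefix])
		prefix++;
	out_size = sizeof(prefix) + (to_size - prefix);
	if (out_size > max_size)
		return NULL;
	out = malloc(out_size);
	if (!out)
		return NULL;
	memcpy(out, &prefix, sizeof(prefix));
	memcpy(out + sizeof(prefix), b + prefix, to_size - prefix);
	*delta_size = out_size;
	return out;
}

/* Hypothetical reversed window: the source is constant, so its index
 * is built once and reused for every target in the window.  Returns
 * the number of targets whose delta fit in max_size, or -1 on error.
 */
int try_window(void *src_buf, unsigned long src_size,
	       void **targets, unsigned long *target_sizes, int n,
	       unsigned long max_size)
{
	struct delta_index *src = delta_index(src_buf, src_size);
	int i, hits = 0;

	if (!src)
		return -1;
	for (i = 0; i < n; i++) {
		unsigned long sz;
		void *delta = diff_delta(src, targets[i], target_sizes[i],
					 &sz, max_size);
		if (!delta)
			continue;	/* too big (or out of memory) */
		hits++;			/* a real caller would keep the best */
		free(delta);
	}
	free_delta_index(src);
	return hits;
}
```

The point is only the shape of the loop: delta_index() is called once
per source object and free_delta_index() once at the end, however many
targets are tried in between.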


Received on Sun Apr 23 03:30:30 2006
