Re: [PATCH 1/4] Add git-archive

From: Junio C Hamano <>
Date: 2006-09-08 12:35:12
Franck Bui-Huu <> writes:

> The main reason for making this API is to avoid using
> git-{tar,zip}-tree commands, hence making them useless. Maybe it's
> time for them to die ?

The answer is "not yet" and the above paragraph, at least the
last sentence and half, do not belong to the commit log message.

> It also implements remote operations by defining a very simple
> protocol: it first sends the name of the specific uploader followed
> the repository name (git-upload-tar git://
> Then it sends options. It's done by sending a sequence of one
> argument per packet, with prefix "argument ", followed by a flush.

I haven't looked the existing code closely recently, but my
impression was that if you want to make the protocol operable
with both git-daemon, ssh target, and local pipe, it is easier
to make the request message exactly like you would invoke the
remote command over the ssh connection in the target repository
(see connect.c).  I am not sure how well the above
"git-upload-tar reponame" would work.  I would have expected it
to be "git-upload-archive reponame" with the first on-protocol
parameter being "--fmt=tar".

Does your code work well when you run the remote archive
fetching locally, i.e. "git-archive --remote=../other.git",
I wonder?

... goes on and reads the patch, notices that the protocol
command is git-upload-archive with the archive location.  Which
is GOOD.  It is just the above description is a tad stale.

> diff --git a/archive.h b/archive.h
> new file mode 100644
> index 0000000..f33398e
> --- /dev/null
> +++ b/archive.h
> @@ -0,0 +1,41 @@
> +#ifndef ARCHIVE_H
> +#define ARCHIVE_H
> +
> +#define MAX_EXTRA_ARGS	32
> +#define MAX_ARGS	(MAX_EXTRA_ARGS + 32)
> +
> +struct archiver_args {
> +	const char *base;
> +	struct tree *tree;
> +	const unsigned char *commit_sha1;
> +	time_t time;
> +	const char **pathspec;
> +	void *extra;
> +};
> +
> +typedef int (*write_archive_fn_t)(struct archiver_args *);
> +
> +typedef void *(*parse_extra_args_fn_t)(int argc, const char **argv);
> +
> +struct archiver {
> +	const char *name;
> +	const char *remote;
> +	struct archiver_args args;
> +	write_archive_fn_t write_archive;
> +	parse_extra_args_fn_t parse_extra;
> +};
> +
> +extern struct archiver archivers[];

I thought the reason for archiver_args (and archiver_args.extra)
was because we wanted to avoid storing per invocation parameters
in static variables, which would hamper reentrancy.  If one
process is creating two archives, both format=tar, it might be
reasonable for the code (future archiver enhancement, not your
current implementation of git-archive driver) to parse two sets
of parameters first (to get separate archiver instances) and
call their write_archive, but if archivers[] list has the
per-invocation parameter args then we are back to square one,
aren't we?

Reentrancy may not matter, but in any case the above archiver_args
is not helping enough to improve the situation, I think.

Actually you may be able to get away by returning a copy of
archivers[] element from get_archiver() when we need reentrancy
in the future.  Of course the caller needs to free() it when it
is done with it, since it is a per-invocation handle.

> +void parse_treeish_arg(const char **argv, struct archiver_args *ar_args,
> +		       const char *prefix)
> +{
> +...
> +	//free(tree);
> +	ar_args->tree = tree;
> +	ar_args->commit_sha1 = commit_sha1;
> +	ar_args->time = archive_time;
> +}

Stray comment...

Overall looks good, except you already know your issue #4 ;-).

I haven't had a chance to look at connect.c code but I have a
mild suspicion that full reuse of upload-pack code by moving
everything into connect.c is not possible; at least the initial
handshake to determine if sideband is to be used is specific to
upload-pack protocol which needed to bolt-it-on to an existing
protocol to support older clients and servers, and I do not
think we would want to carry that baggage for this new protocol.

The pipe setup code in upload-pack.c::create_pack_file() is
quite specific to upload-pack.  I am not sure how much of it can
be reused by refactoring, but it may be worth a try.

The part that reads from fd 1 and 2 and to multiplex into the
stream on the uploader side (the main while() loop in the same
create_pack_file() function, and send_client_data() function),
and the code to setup and demultiplex the bands on the
downloader side (setup_sideband() in fetch-clone.c) should be
reusable as-is, I think.  They are defined as static so you
would need to move the code around to make them available from

To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to
More majordomo info at
Received on Fri Sep 08 12:35:31 2006

This archive was generated by hypermail 2.1.8 : 2006-09-08 12:36:26 EST