Re: [ANNOUNCE] CGit v0.1-pre

From: Lars Hjemli <hjemli@gmail.com>
Date: 2006-12-12 04:40:48
On 12/11/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
> On Mon, 11 Dec 2006, Lars Hjemli wrote:
> > Anyway, I must say I find it rather unlikely for these cases to occur
> > (frequently) in real life. That would seem to imply that the caching
> > isn't really needed at all.
>
> The point is, if you have races, you will hit them _occasionally_. It may
> not be a performance problem in real life, BUT:
>
>  - quite often, you can take advantage of the serialization guarantees
>    that a front-side cache offers you, and do the uncached accesses (or
>    writing the final cache-file) knowing that there's only one thing that
>    does that.
>
>  - If so, then a race condition in the cache goes from a "unlikely
>    performance problem" to a BUG.

Yes, I finally understood (see my other reply)


> > > As a side note: how do you release your caches?
> >
> > Simple timeouts (time()-stat.st_mtime), depending on what kind of page
> > was requested. If anyone cares about invalid cache content (branch
> > head moving), relevant cachefiles can be deleted with an update-hook.
>
> I was more thinking about the locking part. Again, to be safe, you should
> probably take the lock before releasing any cache.

Ok. Code speeks louder than words, so I'll blatantly paste the key functions:

--->8----

const int NOLOCK = -1;

int cache_lock(struct cacheitem *item)
{
	int i = 0;
	char *lockfile = fmt("%s.lock", item->name);

 top:
	if (++i > cgit_max_lock_attempts)
		die("cache_lock: unable to lock %s: %s",
		    item->name, strerror(errno));

       	item->fd = open(lockfile, O_WRONLY|O_CREAT|O_EXCL, S_IRUSR|S_IWUSR);

	if (item->fd == NOLOCK && errno == ENOENT && cache_create_dirs())
		goto top;

	if (item->fd == NOLOCK && errno == EEXIST &&
	    cache_refill_overdue(lockfile) && !unlink(lockfile))
			goto top;

	return (item->fd > 0);
}

int cache_unlock(struct cacheitem *item)
{
	close(item->fd);
	return (rename(fmt("%s.lock", item->name), item->name) == 0);
}


static void cgit_check_cache(struct cacheitem *item)
{
	int i = 0;

	cache_prepare(item);
 top:
	if (++i > cgit_max_lock_attempts) {
		die("cgit_refresh_cache: unable to lock %s: %s",
		    item->name, strerror(errno));
	}
	if (!cache_exist(item)) {
		if (!cache_lock(item)) {
			sleep(1);
			goto top;
		}
		if (!cache_exist(item))
			cgit_fill_cache(item);
		cache_unlock(item);
	} else if (cache_expired(item) && cache_lock(item)) {
		if (cache_expired(item))
			cgit_fill_cache(item);
		cache_unlock(item);
	}
}

--->8----

I am a bit conserned about the effect of cache_unlink() if another
concurrent process gets "lucky" with the test
"cache_refill_overdue(lockfile) && !unlink(lockfile)".

This is supposed to be a safety valve against a stale lock file (lock
file not modified in n secs), but it doesn't feel quite right.

Probably, if cache_unlink() fails in cgit_check_cache(), I should "goto top".

Opinions?

-- 
larsh
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Received on Tue Dec 12 04:40:59 2006

This archive was generated by hypermail 2.1.8 : 2006-12-12 04:44:02 EST