Sunday, April 5, 2009

Fixing the disk code, part 2

I've been reviewing and documenting the store client code to get a feel for how the code works and is used. Sure enough, it relies heavily on the callback data invalidation magic to avoid having to wait around until the store disk file operations complete (or an event scheduled via eventAdd() to complete at a later date.) I added a little bit of logging to print out a warning if a pending disk operation was there and sure enough, it happens often enough in production to probably cause all those random crashes that I remember in Squid-1.2.

Anyway. I have an idea for eliminating the read copy in the async IO code. Something I've done in other C programs is this:
  • Track the pending events which the store client code schedules callbacks into itself (ie, the eventAdd() for later processing, and pending disk IO)
  • Split storeClientUnregister() into two parts - one which sets a "shutting down" flag, and another which does the hard work
  • assert() that API methods aren't called (ie, storeClientRef(), etc) if the shutting down flag is set - ie, they shouldn't be at the present time, so any instance of that happening means that some callback is occuring where it shouldn't be and getting further than it should be!
  • On event/disk callback completion, clear the relevant flags and see if the store client is completely ready to be destroyed. Only then destroy it.
This -should- preserve the current behaviour (callbacks will return immediately instead of not being called, so they effectively behave the same) but it won't prevent non-callback code from running if the callback data pointer is "valid". (Which, hopefully isn't actually happening, but god knows in this codebase..) It means that the callback data "is valid" check in the store client code (and hopefully, the disk IO code) really becomes a debugging check rather than being used to control whether the callback is made or not.

I'm just worried that there's going to be weirdness in how the store client code is being called. Grr.

Anyway. If this is successful and proves itself in production, I'll then change the disk APIs to also enforce that the callback data remains valid for the lifetime of the scheduled IO. Which IMHO it should be. As I said earlier, using tricks like reference counted buffers shouldn't be a crutch for a bad API design..

No comments:

Post a Comment