[Bundy-hackers] Centralized logging process (Fw: [bundy] lettuce test failures due to corrupted log output (#4))

Shane Kerr shane at time-travellers.org
Sun May 4 22:52:56 CEST 2014


All,

Jinmei noticed problems with Bundy tests...

-----------------------------------------------------------------------
Begin forwarded message:

Date: Thu, 01 May 2014 22:55:48 -0700
From: JINMEI Tatuya <notifications at github.com>
To: bundy-dns/bundy <bundy at noreply.github.com>
Subject: Re: [bundy] lettuce test failures due to corrupted log output
(#4)


(hmm, this part of the first comment was removed, probably due to the
garbage output).

I've not figured out the root cause, but I suspect this is probably
because we use a terminal (stderr) for the log output.  Inside the test
we copy the stderr to another file and parse it, so it would make more
sense to use normal files as the log output directly.  Some experiments
showed this in fact produces much more reliable results (even if it
doesn't completely solve the issue).
-----------------------------------------------------------------------

We had a long, sad history with logging in BIND 10.

My original thinking was that we should have each process log directly,
to avoid unnecessary data copies, context switches, stalling on
inter-process communication, and so on. Plus (probably most
importantly), it avoids an extra single point of failure.

This caused a lot of issues, which we never actually managed to solve
completely. We ended up adding our own locking mechanisms, which are
also present in the latest versions of log4cplus (our underlying
logging library). (The last known issue here was, I believe, that the
C++ I/O library was calling the underlying Unix system calls in a
non-intuitive way, making both the log4cplus locking and our own extra
locking fail when going to STDIO/STDERR.)
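
What I mean by "our own locking mechanisms" is, conceptually, something
like the sketch below: take an advisory lock shared by all processes
around every message, so only one process writes at a time. (This is
just the idea, not our actual code; the lock file path is made up and
error handling is minimal.)

    #include <string>

    #include <fcntl.h>
    #include <sys/file.h>
    #include <unistd.h>

    // Conceptual sketch: serialize log output across processes by holding
    // an advisory lock on a shared lock file while each message is written.
    void
    locked_log_write(int log_fd, const std::string& msg) {
        // One lock file shared by every process that logs (hypothetical path).
        static const int lock_fd =
            open("/tmp/bundy-logger.lock", O_CREAT | O_RDWR, 0600);

        const bool locked = (lock_fd != -1) && (flock(lock_fd, LOCK_EX) == 0);
        // One message, one write(); the lock keeps other processes out
        // until the whole line has hit the file descriptor.
        const ssize_t n = write(log_fd, msg.c_str(), msg.size());
        (void) n;
        if (locked) {
            flock(lock_fd, LOCK_UN);
        }
    }

The catch is that this only helps if the locked region really covers a
single write() per message; if the C++ stream layer splits or defers
the underlying writes, output from other processes can still get
interleaved into the middle of a line.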

In retrospect I realize that I failed to do something which was
important from the very beginning of the project... TEST ASSUMPTIONS.

It should be possible to benchmark logging directly using log4cplus and
compare it to logging through a centralized logging server. This could
reveal the actual performance cost (if any) of such a model.
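
To make that concrete, here is a very rough sketch of what the
direct-logging half of such a benchmark could look like (the logger
name, output file, and message count are all made up). The centralized
half would time the same loop, but hand each formatted message to a
separate collector process over a socket instead of calling log4cplus
in-process:

    #include <chrono>
    #include <iostream>

    #include <log4cplus/logger.h>
    #include <log4cplus/fileappender.h>
    #include <log4cplus/loggingmacros.h>

    int
    main() {
        // Direct case: this process logs straight to its own file through
        // log4cplus. (File name and logger name are hypothetical.)
        log4cplus::SharedAppenderPtr appender(
            new log4cplus::FileAppender(LOG4CPLUS_TEXT("bench-direct.log")));
        log4cplus::Logger logger =
            log4cplus::Logger::getInstance(LOG4CPLUS_TEXT("bench"));
        logger.addAppender(appender);

        const int num_messages = 100000;
        const auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < num_messages; ++i) {
            LOG4CPLUS_INFO(logger, "benchmark message " << i);
        }
        const auto stop = std::chrono::steady_clock::now();

        std::cout << num_messages << " messages in "
                  << std::chrono::duration_cast<std::chrono::milliseconds>(
                         stop - start).count()
                  << " ms" << std::endl;
        return (0);
    }

Running both variants, and then again with several processes logging at
the same time, should tell us whether the extra IPC actually costs
anything we care about.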

My guess is that logging through a centralized logging server is not
much slower than logging directly from each process. Given that we
could remove file locking primitives, perhaps a centralized logger
could even be faster. Unless it is a LOT faster, we'd probably still
want to avoid centralized logging when possible, because of the single
point of failure.

However, since we seem to constantly have problems with STDIO/STDERR,
perhaps it would make sense to centralize logging only for those
streams.
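
For example (just a toy sketch, not a proposal for the exact
mechanism): the parent that spawns a process could own that child's
stderr through a pipe, read complete lines from it, and be the only
process that ever writes them to the real terminal or log file:

    #include <cstdio>

    #include <sys/wait.h>
    #include <unistd.h>

    // Toy sketch: route a child's stderr through a pipe so that only the
    // parent ever writes to the real terminal.
    int
    main() {
        int fds[2];
        if (pipe(fds) != 0) {
            return (1);
        }

        const pid_t pid = fork();
        if (pid == 0) {
            // Child: stderr now points at the pipe, not at the terminal.
            dup2(fds[1], STDERR_FILENO);
            close(fds[0]);
            close(fds[1]);
            fprintf(stderr, "child log line 1\n");
            fprintf(stderr, "child log line 2\n");
            _exit(0);
        }

        // Parent: read complete lines from the pipe and forward them to
        // the single logger (here we just tag and reprint them).
        close(fds[1]);
        FILE* in = fdopen(fds[0], "r");
        char line[1024];
        while (in != NULL && fgets(line, sizeof(line), in) != NULL) {
            fprintf(stderr, "[child stderr] %s", line);
        }
        if (in != NULL) {
            fclose(in);
        }
        waitpid(pid, NULL, 0);
        return (0);
    }

That way the children never touch the shared stream at all, and
whatever locking is still needed lives in a single process.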

Thoughts?

--
Shane