Don't use logrotate or newsyslog in this century.

logrotate and newsyslog are a design that should be consigned to history. They are examples of how incremental changes to the way that things are done can have quite poor end results. Having the log-rotater program separate from the log-writer program results in several unavoidable potential problems. This design was superseded by better designs in the middle 1990s.

Background

The original way of logging the output of a Unix program was very simple. One just redirects its standard output to a file, or through a pipe. (For ease of comprehension, these are in more modern shell language.)

my-program > myprogram.log
my-program | tee myprogram.log

The invention of standard error for Sixth Edition slightly complicates this, but not by very much.

my-program > myprogram.log 2>&1
my-program 2>&1 | tee myprogram.log

The problem is that this file grows forever. There's nothing to be gained from deleting it; Unix keeps the i-node around because the program's standard output is pointed to it, and so one just ends up with an inaccessible deleted file with a zero link count. There's little to be gained from truncating it, even, as the file pointer for the output file description is not repositioned to the beginning of the file and one ends up with a (sparse) file that has a large hole of zero(-fill)ed blocks at the start. A bunch of NULs is not pleasant for Unix text processing programs written in the C language.

Hence the arrival of newsyslog in the BSD world, and logrotate in the GNU world. These programs are run by some form of periodic scheduler, such as cron. At regular intervals, the log files on the system are checked to see whether they have exceeded some size limit, and "rotated" if they have. Rotation involves keeping a number of generations of old log files, deleting older generations as necessary to make way for newer ones.

This "rotation" can take two forms.

To the naïve observer it appears that everything is good; logs do not grow indefinitely, their previous generations are limited, truncation does not result in (sparse file) NUL-filled "holes", and it even seems that this dovetails nicely with existing tools. It appears that this ties up nicely with two "follow" modes of the tail command, which can therefore be used to monitor such logs as they are written to.

tail -f

tail watches a single, open, file for size changes. It can either grow larger by having fresh log entries appended, or reset to zero when truncated. This would seem to exactly match the copy-and-truncate-in-place form of log rotation.

tail -F

tail does everything as for -f, but also watches for the file being renamed or deleted. If it is, it attempts to re-open the current log file afresh, opening the new file that is now at that name. This would seem to exactly match the rename-away-and-create-anew form of log rotation.

However, this analysis is superficial; and this is not in fact the case.

Problems

There are two classes of problems with this.

Size capping and copying are unreliable

Because the log-rotator is separate from the log-writer there is a window between the log-writer exceeding the size limit of a log file and the log-rotator's next scheduled run. Even were the log-rotator run every minute, which is far more frequently than these programs are run in the wild, that's enough, at modern computer speeds, for a runaway log writer to write gigabinarybytes of log entries and blast right through the size limits on log files that the log-rotator is supposed to be enforcing.

A problem in the same class is that in the copy-and-truncate-in-place model, the copy is racing against the log-writer that is still writing to the log as it is being copied. The copier cannot reliably know when it has read all of the way to the end of the file, and there is an unavoidable window between the copy being considered done and the log being truncated where the log-writer can butt in and append more log entries, that are never copied and in fact immediately lost from the log. (Rainer Gerhards, author of rsyslog, once called this "a very, very, very bad choice" that is "inherently racy and using it is almost a guarantee that you will lose log data".)

tail -F cannot deal with all eventualities

Whatever tail -F does, it cannot cope with all of the possible things that can happen. As a result, it ends up skipping log entries, and possibly even entire log files. (Bryan Cantrill gave a talk about BSD tail and log files at Surge 2013, but he is far from the only person to have hit these problems; and what he explains about BSD tail holds true for GNU tail as well, which gives up, with comments about this in the code, handling the problem situations for much the same reason that Bryan Cantrill did.)

The right way to handle log file rotation.

These problems were addressed in the mid 1990s, by going back to the original mechanism with the following principle in mind:

The log rotation and the log writing must be done by the same code, so that it knows exactly when to do a rotation, does it exactly then, and does not race with a parallel process doing the other half of the task.

One can see this in the designs of the "next generation" syslog utilities that came about a few years later, which made a point of advertising that they don't need external log rotation tools. But one can also see this in a design also from the mid 1990s.

It is important to realize that Doug McIlroy's "garden hosepipe" design still applies. The log rotation and log writing code do not have to be in the program generating the log entries. They can be attached to that program, with an ordinary Unix pipe, just like tee was further up this page.

The realization then dawns that what is simply needed is (in another instance of Doug McIlroy's "garden hosepipe" metaphor) a better tee program. And this is what came about with Daniel J. Bernstein's daemontools, which began with a program named cyclog and progressed to a program named multilog. There is now a whole suite of logging tools in the daemontools family of toolsets from which one can select according to one's needs, from cyclog to s6-log. Apart from one that didn't have a size limit mechanism, these all share the core semantics of being "tee plus". They write to one or more files; but they monitor the sizes of the files that they have written, and before they would exceed a size limit they stop writing and perform a rotation, switching to the fresh current file immediately and before recommencing writing, once the rotation is complete.

(For rotating by date, rather than in order to impose a cap on overall log file disc space consumption, there is also Andrew Ford's cronolog adapted from an Apache tool, which has the same design of one program acting like tee and doing both the writing and the rotating.)

This fixes most of the problems. (It has benefits in its own right, nothing to do with rotation, too. Some filesystems and storage devices give better results if files are only written to once, sequentially. And the notion that once renamed away a old generation log file is never supposed to be written to, can be both relied upon, by programs that can assume that old log files are static snapshots, and enforced, by programs that do things like sign logs.) It fixes all of the problems that are caused by the log-writer not being in lockstep with the log-rotator. It uses only rename-away-and-create-anew rotation, so there is no difficulty with having to accommodate the problems relating to copying and truncation.

Fixing tail -F is slightly harder, because tail -F is a general purpose program that does not understand the log rotation scheme. To fix the problem with skipping over old log files when throughput is high, tail -F would have to know the names of all of the old generation log files and scan through all of them (that it had not yet processed) in the correct order before switching back to processing the currently active log file again. But tail -F has no such hardwired knowledge, which is specialized to the one case of log files. (This becomes clearer if one looks and the disparity between how one tells the likes of multilog to write a log and how one tells tail -F to read it. multilog is told the name of the entire log directory, and expected to understand its structure. tail -F is only told the name of one individual file in the log directory, and isn't even aware that there is a daemontools-style log directory.)

Thus came about tools such as follow-log-directories in the nosh toolset. This does have particular knowledge of the log rotation system employed in daemontools-family loggers, and of TAI64N timestamps, and so can know where to look for old files that it needs to process before it can proceed with processing the current file. (It is told about entire log directories, and expected to understand their structures.)

Getting it wrong again.

Prasad J. Pandit took Daniel J. Bernstein's djbdns in 2012 and modified it into a tool suite that xe named ndjbdns. One of the first things that xe did was to remove the use of multilog and go back to using a separate log rotator. Obviously enough, this retrogressive step, that undid nearly two decades of learning better than to have a separate log rotator, brought back the aforementioned problems; as M. Pandit eventually discovered a year or so later. This was never fixed, though, and remains a problem in ndjbdns.


© Copyright 2016,2017 Jonathan de Boyne Pollard. "Moral" rights asserted.
Permission is hereby granted to copy and to distribute this web page in its original, unmodified form as long as its last modification datestamp is preserved.