A number of journal fixes by poettering · Pull Request #4362 · systemd/systemd

poettering · 2016-10-12T18:37:34Z

Various fixes for the journal, in particular covering #4088 #4278 #4059 #4060

Let's make it easier to figure out, when we see an invalid journal file, why we consider it invalid, and add some minimal debug logging for it. This log output is normally not seen (after all, this is all library code), unless debug logging is explicitly turned on.

…y_index() This allows us to share a bit more code between journal_file_next_entry() and journal_file_next_entry_for_data().

…ction This adds a new call check_properly_ordered(), which we can reuse later, and makes the code a bit more readable.

Let's add an extra check, reusing check_properly_ordered() also for journal_file_next_entry_for_data().
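For readers who want a concrete picture of what an ordering check of this kind might look like, here is a minimal sketch. It is not the code from this PR: the name `offsets_properly_ordered()`, the `direction_t` type and the rule that offset 0 means "not set" are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical direction type for this sketch. */
typedef enum { DIRECTION_UP, DIRECTION_DOWN } direction_t;

/* Returns true if stepping from old_offset to new_offset is consistent with
 * the iteration direction. Offset 0 is treated as "not set" and never counts
 * as properly ordered (an assumption of this sketch). */
static bool offsets_properly_ordered(uint64_t new_offset, uint64_t old_offset, direction_t direction) {
        assert(direction == DIRECTION_UP || direction == DIRECTION_DOWN);

        if (old_offset == 0 || new_offset == 0)
                return false;

        /* Going down, offsets must strictly grow; going up, strictly shrink. */
        return direction == DIRECTION_DOWN ? new_offset > old_offset
                                           : new_offset < old_offset;
}
```

A caller iterating forward would reject a step from offset 200 to offset 100, and a caller iterating backward would reject the reverse.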

Let's make dissecting of borked journal files more expressive: if we encounter an object whose first 8 bytes are all zeroes, then let's assume the object was simply never initialized, and say so. Previously, this would be detected as "overly short object", which is true too in a way, but it's a lot more helpful to print different debug messages for the case where the size is not initialized at all and the case where the size is initialized to some bogus value. No functional behaviour change, only different log messages for the two cases.
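The distinction described here could be sketched roughly as follows. This is an illustration only: `object_state_t`, `classify_object()` and its parameters are hypothetical names, and the sketch takes the declared size and the minimum size as plain arguments rather than assuming anything about the on-disk layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: lets the caller log a different message per case. */
typedef enum {
        OBJECT_OK,
        OBJECT_UNINITIALIZED,   /* first 8 bytes all zero: never written */
        OBJECT_TOO_SHORT        /* size set, but to a bogus small value */
} object_state_t;

/* first8:        the first 8 bytes of the object as read from the file
 * declared_size: the size the object claims to have (decoded by the caller)
 * min_size:      smallest size a valid object can have (assumed parameter) */
static object_state_t classify_object(const uint8_t *first8, uint64_t declared_size, uint64_t min_size) {
        assert(first8);

        int all_zero = 1;
        for (int i = 0; i < 8; i++)
                if (first8[i] != 0)
                        all_zero = 0;

        if (all_zero)
                return OBJECT_UNINITIALIZED;
        if (declared_size < min_size)
                return OBJECT_TOO_SHORT;

        return OBJECT_OK;
}
```

Both failure cases would still be rejected by the caller; only the debug message differs.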

…e keep going When iterating through partially synced journal files we need to be prepared for hitting invalid entries (specifically: non-initialized ones). Instead of generating an error and giving up, let's simply try to proceed with the next one that is valid (and debug log about this). This reworks the logic introduced with caeab8f to apply to iteration in both directions, and tries to look for valid entries located after the invalid one. It also extends the behaviour to both iterating through the global entry array and per-data object entry arrays. Fixes: systemd#4088
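The "skip and keep going" idea can be illustrated with a small self-contained sketch. The `entry_t` type, the `initialized` flag and `next_valid_entry()` are hypothetical stand-ins and do not reflect the journal's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef enum { DIRECTION_UP, DIRECTION_DOWN } direction_t;

/* Hypothetical in-memory stand-in for an entry in an entry array. */
typedef struct {
        bool initialized;
        uint64_t seqnum;
} entry_t;

/* Starting from index i, step in the given direction and return the index of
 * the next initialized entry, or -1 if the end of the array is reached.
 * Invalid entries are skipped with a debug message instead of failing. */
static long next_valid_entry(const entry_t *entries, size_t n, long i, direction_t direction) {
        assert(entries || n == 0);

        long step = direction == DIRECTION_DOWN ? 1 : -1;

        for (i += step; i >= 0 && (size_t) i < n; i += step) {
                if (entries[i].initialized)
                        return i;

                fprintf(stderr, "Entry %ld is not initialized, skipping.\n", i);
        }

        return -1;
}
```

Passing `-1` with `DIRECTION_DOWN` starts at the first entry, and passing `n` with `DIRECTION_UP` starts at the last one, so the same helper covers both iteration directions.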

Let's use the earliest linearized event timestamp for journal entries we have: the event dispatch timestamp from the event loop, instead of requerying the timestamp at the time of writing. This makes the time a bit more accurate, allows us to query the kernel time one time less per event loop, and also makes sure we always use the same timestamp for both attempts to write an entry to a journal file.

…kwards As soon as we notice that the clock jumps backwards, rotate journal files. This is beneficial, as this makes sure that the entries in journal files remain strictly ordered internally, and thus the bisection algorithm applied on it is not confused. This should help avoid borked wallclock-based bisection on journal files as witnessed in systemd#4278.

Never permit that we write to journal files that have newer timestamps than our local wallclock has. If we'd accept that, then the entries in the file might end up not being ordered strictly. Let's refuse this with ETXTBSY, and then immediately rotate to use a new file, so that each file remains strictly ordered also by wallclock internally.
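The refuse-then-rotate flow can be sketched as below. This is a simplified model, not journald's implementation: `fake_journal_t`, `check_append_timestamp()` and `append_entry()` are hypothetical, and "rotation" is reduced to bumping a counter and starting with an empty file.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a journal file: the realtime timestamp of its newest
 * entry, plus a counter that grows each time we rotate to a fresh file. */
typedef struct {
        uint64_t tail_realtime_usec;
        unsigned generation;
} fake_journal_t;

/* Refuse to append if the file already holds an entry newer than "now". */
static int check_append_timestamp(uint64_t tail_realtime_usec, uint64_t now_usec) {
        if (tail_realtime_usec > now_usec)
                return -ETXTBSY;        /* file is "from the future" */
        return 0;
}

/* Append an entry stamped now_usec, rotating first if the file is refused. */
static int append_entry(fake_journal_t *j, uint64_t now_usec) {
        assert(j);

        int r = check_append_timestamp(j->tail_realtime_usec, now_usec);
        if (r == -ETXTBSY) {
                /* Rotate: start a new, empty file and retry once. */
                j->generation++;
                j->tail_realtime_usec = 0;
                r = check_append_timestamp(j->tail_realtime_usec, now_usec);
        }
        if (r < 0)
                return r;

        j->tail_realtime_usec = now_usec;
        return 0;
}
```

With this model, a write stamped earlier than the file's newest entry lands in a new file, so each individual file stays ordered by wallclock time.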

Let's just say that the journal takes up space in the file system, not on disk, as tmpfs is definitely a file system, but not a disk. Fixes: systemd#4059

Fixes: systemd#4060

AsciiWolf · 2016-10-12T19:22:28Z

src/journal/journald-server.c

                return true;
+
+        case -ETXTBSY:         /* Journal file is from the future */
+                log_warning("%s: Journal file is from the future, rotateing.", f->path);


rotateing

Typo?

hese10 · 2016-10-13T11:06:32Z

I tested these changes for that hanging "--list-boots" case (#4278), and these changes prevented the forever loop in journalctl.

keszybz · 2016-10-13T11:45:42Z

Looks all good. Typo fixed in merge.

hese10 · 2016-11-22T09:49:42Z

It seems that these journald fixes do not help, at least in an OpenStack environment where the VM runs with a different time zone (+8h into the future) during boot-up until it gets the correct time via NTP. In these cases the "journalctl --list-boots" command still ends up hanging. We also have a problem with the cursor parameter in journalctl in these cases.

hese10 · 2016-11-22T11:04:53Z

It seems "--list-boots" hangs due to a corrupted journal file ("bad message" error from verify).

poettering · 2016-11-22T12:05:34Z

@hese10 if you don't want your issues to be ignored, file proper issues for them. First reproduce them on the most recent systemd version, and then do provide a minimal set of journal files that exposes the misbehaviour you are seeing. Thanks. If you just comment on some already-closed PR nobody will take notice.

poettering added 12 commits October 12, 2016 20:25

journal: split out array index inc/dec code into a new call bump_arra…

aa598ba

…y_index() This allows us to share a bit more code between journal_file_next_entry() and journal_file_next_entry_for_data().

journal: split out check for properly ordered arrays into its own fun…

b6da4ed

…ction This adds a new call check_properly_ordered(), which we can reuse later, and makes the code a bit more readable.

journal: also check that our entry arrays are properly ordered

ded5034

Let's add an extra check, reusing check_properly_ordered() also for journal_file_next_entry_for_data().

journalctl: don't claim the journal was stored on disk

8da830b

Let's just say that the journal takes up space in the file system, not on disk, as tmpfs is definitely a file system, but not a disk. Fixes: systemd#4059

journalctl: say in which directory we vacuum stuff

3cc44bf

Fixes: systemd#4060

update TODO

da597d2

poettering added the journal label Oct 12, 2016

This was referenced Oct 12, 2016

[journald]: Failed to determine boots: No data available #4088

Closed

Avoid forever loop for journalctl --list-boots command #4278

Merged

AsciiWolf reviewed Oct 12, 2016


keszybz merged commit da597d2 into systemd:master Oct 13, 2016

keszybz added a commit that referenced this pull request Oct 13, 2016

Merge pull request #4362 from poettering/journalbootlistfix

c1a9199

keszybz mentioned this pull request Oct 15, 2016

journalctl behaves in unexpected ways if there are logs the future #2738

Closed

keszybz merged 12 commits into systemd:master from poettering:journalbootlistfix
