Avoid forever loop for journalctl --list-boots command#4278
Avoid forever loop for journalctl --list-boots command#4278poettering merged 1 commit intosystemd:masterfrom
Conversation
src/journal/journalctl.c
Outdated
| break; | ||
| } | ||
| } else { | ||
| i=0; |
There was a problem hiding this comment.
Whitespaces before and after =, please.
src/journal/journalctl.c
Outdated
| } else { | ||
| i=0; | ||
| LIST_FOREACH(boot_list, id, head) { | ||
| if ( sd_id128_equal(id->id, current->id) ) { |
There was a problem hiding this comment.
No whitespaces after ( and before ).
src/journal/journalctl.c
Outdated
| /* exiting as otherwise this problem would cause forever loop */ | ||
| goto finish; | ||
| } | ||
| i++; |
There was a problem hiding this comment.
Am I missing something or is i never actually used anywhere?
There was a problem hiding this comment.
yes, it was not used. So now it is removed.
There was a problem hiding this comment.
Please don't add such cleanups to your patches as extra commits but squash them together. In your case, this should be only one commit, not 3.
There was a problem hiding this comment.
The code seems sane to me, except that the id variable could be local to that new if branch. As I am not familiar with the in-depth details of the journal implementation, I'd like to get a final Ack from @poettering.
When date is changed in system to future and normal user logs to new journal file, and then date is changed back to present time, the "journalctl --list-boot" command goes to forever loop. This commit tries to fix this problem by checking first the boot id list if the found boot id was already in that list. If it is found, then stopping the boot id find loop.
|
@hese10 hmm, any chance you could provide me with a minimal set of journal files that expose this problem? Your patch looks superficially OK, but I wonder if there's also some deeper problem here, that needs fixing. if you don't want to make your journals public, please consider sending them to me personally (email), I'll not make them public (though I will of course look at them...). To make the set minimal, consider copying hte journal directory somewhere, containing all files, and then removing files from it until the problem goes away, and the reinstating the file at fault, and so on, until you have the minimal set of files that result in this issue thanks! |
|
@poettering, here are the minimal journal files for the problem case: |
|
OK, so this apparently happens since our bisection logic relies on properly ordered journal files to work. However, if the realtime clock jumps, in the middle of the file this is not the case anymore, as newer entries might have a smaller timestamp thatn older ones then. Your patch is effective in making the listing work anyway, by detecting such loops and giving up then. And I think the patch is good that, and will merge it. However, I also think that we really shouldn't generate such journal files which aren't properly ordered. Hence I'll prep a patch that will automatically rotate (i.e. close a journal file, and start a new one) each time the clock jumps backwards. This way the bisection tables have guaranteed strict order, and the issue should go way for future files. Your patch will make existing files work. (Implicit, automatic rotating on clock change has been on the TODO list for a long time actually, but it was never implemented, since it so far was only a theoretical issue, that didn't become a problem in real life. Turns out it actually is a real-life issue.) |
…kwards As soon as we notice that the clock jumps backwards, rotate journal files. This is beneficial, as this makes sure that the entries in journal files remain strictly ordered internally, and thus the bisection algorithm applied on it is not confused. This should help avoiding borked wallclock-based bisection on journal files as witnessed in systemd#4278.
|
The suggested rotate-on-time-jump logic is now waiting in #4362 |
When date is changed in system to future and normal user logs to new journal file, and then date is changed back to present time, the "journalctl --list-boot" command goes to forever loop. This commit tries to fix this problem by checking first the boot id list if the found boot id was already in that list. If it is found, then stopping the boot id find loop. (cherry picked from commit ec02a6c) Conflicts: src/journal/journalctl.c Related: #1294516
When date is changed in system to future and normal user logs to new journal file, and then date is changed back to present time, and reboot is given, the "journalctl --list-boot" command goes to forever loop. This commit tries to fix this problem by checking first the boot id list if the found boot id was already in that list. If it is found, then stopping the boot id find loop.
The steps to reproduce the problem
0) Fedora 24, with multi-user.target set as default. root user with password and one normal user.
rm -rf /var/log/journal//
systemctl restart systemd-journald
ls -la /var/log/journal/_/user*.journal
lsof | grep journal | grep user
sudo date -s "31 Dec 2016 21:00:00"
logger "This is future log from normal user"
sudo date -s "04 Oct 2016 10:15:00"
reboot
journalctl --list-boots