Skip to content

Avoid forever loop for journalctl --list-boots command#4278

Merged
poettering merged 1 commit intosystemd:masterfrom
hese10:master
Oct 12, 2016
Merged

Avoid forever loop for journalctl --list-boots command#4278
poettering merged 1 commit intosystemd:masterfrom
hese10:master

Conversation

@hese10
Copy link
Contributor

@hese10 hese10 commented Oct 4, 2016

When date is changed in system to future and normal user logs to new journal file, and then date is changed back to present time, and reboot is given, the "journalctl --list-boot" command goes to forever loop. This commit tries to fix this problem by checking first the boot id list if the found boot id was already in that list. If it is found, then stopping the boot id find loop.
The steps to reproduce the problem
0) Fedora 24, with multi-user.target set as default. root user with password and one normal user.

  1. reboot system
  2. Login as root user
  3. delete all journal files from /var/log/journal//
    rm -rf /var/log/journal//
  4. restart journald service:
    systemctl restart systemd-journald
  5. check that no user_.journal files are in /var/log/journal/// directory.
    ls -la /var/log/journal/_/user*.journal
  6. check also that no user journal files are open:
    lsof | grep journal | grep user
  7. change date to future:
    sudo date -s "31 Dec 2016 21:00:00"
  8. logout root user
  9. login as normal user
  10. make test log to journal
    logger "This is future log from normal user"
  11. logout normal user
  12. login as root user
  13. change date back to today:
    sudo date -s "04 Oct 2016 10:15:00"
  14. reboot the fedora:
    reboot
  15. login as root user
  16. give that problematic "--list-boots”, it will now hang:
    journalctl --list-boots

@zonque zonque added the journal label Oct 4, 2016
break;
}
} else {
i=0;
Copy link
Member

@zonque zonque Oct 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whitespaces before and after =, please.

} else {
i=0;
LIST_FOREACH(boot_list, id, head) {
if ( sd_id128_equal(id->id, current->id) ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No whitespaces after ( and before ).

/* exiting as otherwise this problem would cause forever loop */
goto finish;
}
i++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I missing something or is i never actually used anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it was not used. So now it is removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't add such cleanups to your patches as extra commits but squash them together. In your case, this should be only one commit, not 3.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it now ok?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code seems sane to me, except that the id variable could be local to that new if branch. As I am not familiar with the in-depth details of the journal implementation, I'd like to get a final Ack from @poettering.

When date is changed in system to future and normal user logs to new journal file, and then date is changed back to present time, the "journalctl --list-boot" command goes to forever loop. This commit tries to fix this problem by checking first the boot id list if the found boot id was already in that list. If it is found, then stopping the boot id find loop.
@poettering
Copy link
Member

@hese10 hmm, any chance you could provide me with a minimal set of journal files that expose this problem? Your patch looks superficially OK, but I wonder if there's also some deeper problem here, that needs fixing.

if you don't want to make your journals public, please consider sending them to me personally (email), I'll not make them public (though I will of course look at them...).

To make the set minimal, consider copying hte journal directory somewhere, containing all files, and then removing files from it until the problem goes away, and the reinstating the file at fault, and so on, until you have the minimal set of files that result in this issue

thanks!

@poettering poettering added the needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer label Oct 11, 2016
@hese10
Copy link
Contributor Author

hese10 commented Oct 11, 2016

@poettering, here are the minimal journal files for the problem case:
journal_files.zip

@poettering
Copy link
Member

OK, so this apparently happens since our bisection logic relies on properly ordered journal files to work. However, if the realtime clock jumps, in the middle of the file this is not the case anymore, as newer entries might have a smaller timestamp thatn older ones then.

Your patch is effective in making the listing work anyway, by detecting such loops and giving up then. And I think the patch is good that, and will merge it.

However, I also think that we really shouldn't generate such journal files which aren't properly ordered. Hence I'll prep a patch that will automatically rotate (i.e. close a journal file, and start a new one) each time the clock jumps backwards. This way the bisection tables have guaranteed strict order, and the issue should go way for future files. Your patch will make existing files work.

(Implicit, automatic rotating on clock change has been on the TODO list for a long time actually, but it was never implemented, since it so far was only a theoretical issue, that didn't become a problem in real life. Turns out it actually is a real-life issue.)

@poettering poettering merged commit ec02a6c into systemd:master Oct 12, 2016
poettering added a commit to poettering/systemd that referenced this pull request Oct 12, 2016
…kwards

As soon as we notice that the clock jumps backwards, rotate journal files. This
is beneficial, as this makes sure that the entries in journal files remain
strictly ordered internally, and thus the bisection algorithm applied on it is
not confused.

This should help avoiding borked wallclock-based bisection on journal files as
witnessed in systemd#4278.
@poettering
Copy link
Member

The suggested rotate-on-time-jump logic is now waiting in #4362

whot pushed a commit to whot/systemd that referenced this pull request Oct 10, 2017
When date is changed in system to future and normal user logs to new
journal file, and then date is changed back to present time, the
"journalctl --list-boot" command goes to forever loop. This commit tries
to fix this problem by checking first the boot id list if the found boot
id was already in that list. If it is found, then stopping the boot id
find loop.

(cherry picked from commit ec02a6c)

Conflicts:
	src/journal/journalctl.c
Related: #1294516
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

journal needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer

Development

Successfully merging this pull request may close these issues.

3 participants