Skip to content

common: enable log to journald#39738

Merged
tchaikov merged 4 commits intoceph:masterfrom
SMIL-Infra:journald
Apr 5, 2021
Merged

common: enable log to journald#39738
tchaikov merged 4 commits intoceph:masterfrom
SMIL-Infra:journald

Conversation

@huww98
Copy link
Contributor

@huww98 huww98 commented Feb 27, 2021

Enable ceph daemons to directly send logs to journald via unix domain socket.

While sending logs, metadata like priority, thread, timestamp is sent as structured data. And can be queried by journalctl.

Note that I don't use libsystemd because I want the implementation to be as efficient as possible.

We can work around the issue stated in #39547 if we let daemon talk to journald directly

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

I have used vstart.sh to verify that this works on my PC. Since this depends on external service journald, could you comment on whether I need to write tests? If so, how?


Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@huww98 huww98 requested a review from a team as a code owner February 28, 2021 04:33
@tchaikov tchaikov requested review from tchaikov and removed request for a team February 28, 2021 04:45
@tchaikov tchaikov removed the crimson label Feb 28, 2021
@batrick batrick self-requested a review February 28, 2021 21:59
Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments so far

@sebastian-philipp
Copy link
Contributor

sebastian-philipp commented Mar 1, 2021

Speaking of cephadm here. In the new world™ of systemd, daemons are supposed to write to stdout. And stdout is then supposed to end up in journald. For cephadm, I'd like to keep in this way. I mean, to some degree, this PR is a workaround for containers/conmon#242 + https://tracker.ceph.com/issues/49551 .

On the other hand, I can see that this log driver might be useful for folks in general. Especially the added metadata looks interesting. Just out of curiosity, where are the benefits for cephadm users? Maybe you want to add this beneath https://docs.ceph.com/en/latest/cephadm/operations/#logging-to-stdout ?

@huww98
Copy link
Contributor Author

huww98 commented Mar 1, 2021

@sebastian-philipp For traditional daemons, they can detect whether the stdout is connected to journald. If yes, then send additional metadata. However, it can be hard for daemons in containers to detect whether they should send additional metadata. Take podman as an example. The stdout of daemons is connected to conmon, the conmon then send the content to log files / journald and the stdout of the podman process (if not detached).

For cephadm users, I think we can mount journald socket into the container, then enable the new "logging to journald" feature in this PR.

@huww98
Copy link
Contributor Author

huww98 commented Mar 3, 2021

Apart from the above comments, I also made some other changes:

  • Add fall back for kernels that do not support memfd_create (< 3.17)
  • Add a small benchmark program
  • Add some unit tests.

return LOG_NOTICE;
if (ceph_prio < 10)
return LOG_INFO;
return LOG_DEBUG;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been undecided on the wisdom of spewing Ceph debug logs to the systemd journal (with cephadm deployed containers or otherwise). Is this really recommended practice?

The mapping you've done is interesting but not necessarily wrong. I think if we have debug_ms=5 you will get some amusing spam in the journal which pretends to be important.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been undecided on the wisdom of spewing Ceph debug logs to the systemd journal

Why not. journald itself can do filtering both on collecting time and viewing time based on its structured metadata. And we are already doing this with the default cephadm deployment.

I think if we have debug_ms=5 you will get some amusing spam in the journal which pretends to be important.

prio=5 is mapped to LOG_INFO, which is the default level if we just connect the stderr to journald. I think it is OK. Anyway, if someone don’t like some logs, they can do the filtering themselves. This can be easier compared with logging to files.

By the way, the journald driver of conmon log everything at LOG_ERROR level. I think this is bad.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liewegas and @sebastian-philipp what do you think? i think this change is a nice improvement. but i am not sure regarding @batrick 's question:

Is this really recommended practice?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we log to journald instead of a file, we need to be able to write anything (including debugs) to journald that we would write to the files.

Btw we have a similar function in Python already:

def _ceph_log_level_to_python(self, log_level: str) -> str:
if log_level:
try:
ceph_log_level = int(log_level.split("/", 1)[0])
except ValueError:
ceph_log_level = 0
else:
ceph_log_level = 0
log_level = "DEBUG"
if ceph_log_level <= 0:
log_level = "CRITICAL"
elif ceph_log_level <= 1:
log_level = "WARNING"
elif ceph_log_level <= 4:
log_level = "INFO"
return log_level

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we log to journald instead of a file, we need to be able to write anything (including debugs) to journald that we would write to the files.

This is not necessarily true. We could have a debug log for true "debug" information and a status/error/audit log go to systemd journal. Merging them is convenient for programmers but not necessarily a good thing for users.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending log messages to different targets based on the level is something that none of the logging code currently supports. I'm not sure if it makes sense or not, but I think even if we decide to go down that road it's an orthogonal change to this one

Copy link
Contributor

@tchaikov tchaikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huww98 added 4 commits March 9, 2021 20:14
Enable ceph daemons to directly send logs to journald via unix domain socket.

While sending logs, metadata like priority, thread, timestamp is sent as
structured data. And can be queried by journalctl.

Note that I don't use libsystemd because I want the implementation to be as
efficient as possible.

Signed-off-by: 胡玮文 <[email protected]>
Use the same infrastructure used for local logs. Also with structured metadata.

Signed-off-by: 胡玮文 <[email protected]>
Simple program to send a lot log messages to journald.

On my PC, it runs at about 35000 log entries per second.

Signed-off-by: 胡玮文 <[email protected]>
These tests are skipped if journald is not run.

Signed-off-by: 胡玮文 <[email protected]>
@huww98
Copy link
Contributor Author

huww98 commented Apr 5, 2021

Hi, what else can I do? As we already sending every log to journald via stderr when deployed with cephadm, I think we don't need to worry about sending too many logs to journald. Besides, this feature is opt-in.

If this PR and #40025 get merged, I will plan another PR to enable this by default when deployed with cephadm. (Seems #40025 is nessesary for these logs to be associated with the corresponding systemd unit.

@tchaikov tchaikov merged commit 9ef8055 into ceph:master Apr 5, 2021
@tchaikov
Copy link
Contributor

tchaikov commented Apr 6, 2021

@ivancich thanks Eric, i am working on a fix to address the FTBFS.

@tchaikov
Copy link
Contributor

tchaikov commented Apr 6, 2021

#40607

@huww98 huww98 deleted the journald branch April 6, 2021 15:21
@ivancich
Copy link
Member

ivancich commented Apr 6, 2021

@tchaikov Thanks for the quick fix, Kefu.

@wjwithagen
Copy link
Contributor

@huww98 @tchaikov
What would be the easiest way to "undo" this journald stuff on platforms that do not have journald and all the memfd stuff.
Now this block building on at least FreeBSD.

Perhaps the easiest way is to stub-out all routines in Journald.cc ??

@ivancich
Copy link
Member

ivancich commented Apr 7, 2021

@huww98 @tchaikov
What would be the easiest way to "undo" this journald stuff on platforms that do not have journald and all the memfd stuff.
Now this block building on at least FreeBSD.

Perhaps the easiest way is to stub-out all routines in Journald.cc ??

@wjwithagen: Have you seen @tchaikov's Windows fix in PR #40607? I would guess could be adapted for your case.

@wjwithagen
Copy link
Contributor

@ivancich
I had not, but it is a good suggestion...
Had found HAVE_MSGHDR, but onfortunately that is also available in the FreeBSD includes.
But thanx for the useful hint.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants