Skip to content

scripts: update log format#4035

Merged
georglauterbach merged 10 commits intomasterfrom
lauter_g/update_log_format
May 29, 2024
Merged

scripts: update log format#4035
georglauterbach merged 10 commits intomasterfrom
lauter_g/update_log_format

Conversation

@georglauterbach
Copy link
Copy Markdown
Member

Description

Over time, I really came to despise the log format I implemented. I recently updated the log for another project of mine, and really wanted to get this into v14.0.0 (as potentially breaking).

The new format is more akin to Postfix's current format: <ISO 8601 TIMESTAMP> <LOG LEVEL> <WHO LOGGED>: <MESSAGE>. Having a timestamp and information about who actually logged is important; hence I included it.

Type of change

  • Improvement (non-breaking change that does improve existing functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (README.md or the documentation under docs/)
  • If necessary, I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have added information about changes made in this PR to CHANGELOG.md

OLD: [  WARN  ] <MESSAGE>
NEW: <ISO 8601 TIMESTAMP>  <LOG LEVEL>  <WHO LOGGED>: <MESSAGE>
Because we log with timestamps now by default, we do not need the extra
logging functionality anymore.
@georglauterbach georglauterbach added area/scripts kind/update Update an existing feature, configuration file or the documentation labels May 25, 2024
@georglauterbach georglauterbach added this to the v14.0.0 milestone May 25, 2024
@georglauterbach georglauterbach self-assigned this May 25, 2024
@georglauterbach georglauterbach mentioned this pull request May 25, 2024
4 tasks
@polarathene
Copy link
Copy Markdown
Member

Having a timestamp and information about who actually logged is important; hence I included it.

Theis information should be available with syslog IIRC, I forget the CLI command, but it's possible to call that to log a message that rsyslog should pick up and handle like it does other syslog compatible services like Postfix. That's where the actual timestamp and context should be coming from AFAIK?

Likewise, for stdout, Docker (and I assume k8s) should have their own logging drivers at least for timestamp tracking. Perhaps next month I'll finally get back to the Vector PR work upstream which will make the process a bit nicer for us 😅

polarathene
polarathene previously approved these changes May 26, 2024
Copy link
Copy Markdown
Member

@polarathene polarathene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logger is the command, here is a nice overview (with CLI options shown at the bottom).

Syslog has two widespread formats, RFC 3164 (DMS prior to v14), and RFC 5424 (DMS v14 onwards default), where most should be using the proper standard RFC 5424. The priority value represents the combined facility (most services in DMS should be using mail IIRC) and the severity (your log levels).


Here is some stripped down reference from my Vector PR:

Snippets (click to expand)
enum Severity {
    Emergency = 0,
    Alert = 1,
    Critical = 2,
    Error = 3,
    Warning = 4,
    Notice = 5,
    #[default]
    Informational = 6,
    Debug = 7,
}
enum Facility {
    Kern = 0,
    #[default]
    User = 1,
    Mail = 2,
    Daemon = 3,
    Auth = 4,
    Syslog = 5,
    LPR = 6,
    News = 7,
    UUCP = 8,
    Cron = 9,
    AuthPriv = 10,
    FTP = 11,
    NTP = 12,
    Security = 13,
    Console = 14,
    SolarisCron = 15,
    Local0 = 16,
    Local1 = 17,
    Local2 = 18,
    Local3 = 19,
    Local4 = 20,
    Local5 = 21,
    Local6 = 22,
    Local7 = 23,
}

The Local* facilities are for custom/local usage if desired. The syslog service (rsyslog for us presently) can parse that to handle differently like it does for filtering mail logs to files, or forwarding to another log service if configured to do so.

Calculating the priority value looks like this:

// Example (Priority value for `Facility::Mail` + `Severity::Error` is `19`):
// `(2 * 8) + 3` or `(2 << 3) + 3` (bitwise shift instead of multiplication)
let priority = (self.facility as u8 * 8) + self.severity as u8;

Nevermind that, logger --priority mail.error is supported as a more friendly way for you to use, no math needed 😅 (use the relevant supported log levels from the collapsed snippet though)

Examples:

# RFC 5424 format (notq omits the structured data regarding time, msgid is rfc5424 only):
logger --rfc5424=notq --priority mail.error --id=1337 --msgid=hello --tag dms-changedetector "This is an example error message!"
<19>1 2024-05-26T11:45:28.305480+00:00 mail.example.test dms-changedetector 1337 hello - This is an example error message!

# Old RFC 3164 format (lacks a year, assumes present year):
logger --rfc3164 --priority mail.error --id=1337 --tag dms-changedetector "This is an example error message!"
<19>May 26 11:45:27 mail dms-changedetector[1337]: This is an example error message!
  • Use --stderr with the above logger command examples to get the output to stderr.
  • If you don't use notq, you'll get some metadata for a log ingester to have about quality of the timestamps ([timeQuality tzKnown="1" isSynced="1" syncAccuracy="157"]), this is Structured Data from RFC5424, doesn't really exist for prior RFC 3164.
  • --tag is the service name like amavis or postfix. Despite the linked resource examples, do not use spaces, that makes the log ambiguous to parse. The RFC document has a breakdown for supported tokens and expected sequence format.
  • --id isn't really needed, it's the PID by default logger omits it, PID of the logger command as default requires --id. Note that --id 42 is not valid, that's --id then 42 as part of the message rather than as part of --id, make sure assignments for the option have = 😑 (-i seems to be equivalent to --id too?)
  • --rfc5424 is not being parsed properly by rsyslog in :edge, stick to --rfc3164

UPDATE: Nevermind. I thought logger would be appropriate, but regardless of syslog format used, rsyslog is not stripping away some of the data like expected:

# RFC 3164 followed by RFC 5424

# v13.3:
May 26 11:51:14 mail mail dms-changedetector[1337]: This is an example error message!
May 26 11:55:07 mail 1 2024-05-26T11:55:07.509942+00:00 mail.example.test dms-changedetector 1337 - - This is an example error message!

# v14
2024-05-26T12:05:58.700336+00:00 mail mail dms-changedetector[1337]: This is an example error message!
2024-05-26T12:17:07.824094+00:00 mail 1 2024-05-26T12:17:07.824035+00:00 mail.example.test dms-changedetector 1337 hello - This is an example error message!

Pretty sure on a related PR I discussed this and documented the rsyslog config. Perhaps we should address that before pushing out v14 to use the syslog format? (I was under the impression it was effectively the same as the two but omitted the priority value, and for RFC 5424 the syslog version, eg <19>1)

Seems like I made a note of this on the rsyslog changelog entry actually 🤔


So just to clarify, your change here won't exactly line up to rsyslog format

  • You are using a two space delimiter.
  • You skip the host.
  • Your timestamp won't match.
  • You add your own log level (but so do other services with their message, so.._ 🤷‍♂️ )

Personally I prefer the current DMS log output which is more easier to distinguish if it's not going to align with rsyslog for log parsing (_at least with the current log output, you know that if the line starts with [ it's probably emitted by DMS, otherwise if a valid timestamp is the starting portion it's more than likely from rsyslog from one of the services we run. Anything else is usually stderr or similar and could be filtered separately.

Approving this change if you still feel you want to go through with it as-is, but you should wait for @casperklein to provide input too.

Comment thread CHANGELOG.md Outdated
Comment thread target/scripts/helpers/log.sh Outdated
Comment on lines +103 to +104
MESSAGE="$(date --iso-8601='seconds') ${!LOG_COLOR}${LOG_LEVEL_NAME}${RESET} "
MESSAGE+="$(basename "${0}" || echo 'unknown'): ${2}"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example of what this looks like? (yes I'm that lazy 😄 )

I'm not sure how useful it is to log a timestamp (and I assume the basename is script file?) to the TTY / stdout, it's often noise vs proper log management.

Since v14 will break expectations of logging elsewhere, I'm not going to bikeshed this, anything you're comfortable with is fine by me if no one else raises any objections.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically the date should be date --rfc-3339='seconds' (ns would be more correct, but I'm not sure how you restrict the precision to 6 digits which is expected, ns precision is optional though)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example of what this looks like? (yes I'm that lazy 😄 )

image

Not so bad, if someone was trying to parse these logs and separate them, they'll need to go based on two columns to match as DMS logs now.

Still the added verbosity of the timestamp (and a bit to the script name, but that's a little helpful), I'm not sure about. Adjusting the rsyslog config won't adjust the timestamp format here since this isn't being managed via rsyslog. If we had some consistency one could better parse the logs and filter/format them as they see fit.

@georglauterbach
Copy link
Copy Markdown
Member Author

@polarathene I adjusted the format to now look like this: <ISO 8601 TIMESTAMP> <LOG LEVEL> <WHO LOGGED>: <MESSAGE> (see my recent commits). The double-space has disappeared. I actually never tried to be rsyslog-compatible, but if you and @casperklein think it'd be a good idea, I can make that happen.

I would still like to go through with this, but I will wait for @casperklein's input on this as well, just as you requested 👍🏼

Copy link
Copy Markdown
Member

@polarathene polarathene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually never tried to be rsyslog-compatible, but if you and @casperklein think it'd be a good idea, I can make that happen.

I think it could be nice, but proper approach is to use logger as described above, which will require messing with rsyslog config to properly ingest.

As I want to replace rsyslog with Vector in future I don't know if it's worth your time to integrate with rsyslog, but if you can make it work that'd be cool 👍

Comment thread target/scripts/helpers/log.sh Outdated
Comment thread target/scripts/helpers/log.sh
Comment thread target/scripts/helpers/log.sh
Comment thread CHANGELOG.md Outdated
@casperklein
Copy link
Copy Markdown
Member

casperklein commented May 27, 2024

After @polarathene suggestions are applied, LGTM 👍

@georglauterbach
Copy link
Copy Markdown
Member Author

After @polarathene suggestions are applied, LGTM 👍

I applied two of the 4; the ones I did not apply are related to the space; I want the space there to have the log level always aligned to five characters.

@georglauterbach
Copy link
Copy Markdown
Member Author

@polarathene when we transition to Vector, all efforts of integrating our logs into Rsyslog would be wasted, hence I'd like to stick to what's currently on the branch and then convert to Vector once we're ready.

@polarathene
Copy link
Copy Markdown
Member

I want the space there to have the log level always aligned to five characters.

Maybe add a comment so others know that context? My first thought was there was trailing white-space, especially for diff in a review that didn't occur to me 😓

polarathene
polarathene previously approved these changes May 28, 2024
@georglauterbach georglauterbach enabled auto-merge (squash) May 28, 2024 17:19
@georglauterbach georglauterbach merged commit f8b3f40 into master May 29, 2024
@georglauterbach georglauterbach deleted the lauter_g/update_log_format branch May 29, 2024 00:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/scripts kind/update Update an existing feature, configuration file or the documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants