doc: Drop description of LogError messages as fatal #30361

ryanofsky · 2024-06-28T22:18:18Z

Remove documentation that says LogError should be used for "severe problems that require the node (or a subsystem) to shut down entirely" because:

It is not how LogError and Level::Error are used currently. Of 129 current uses only 58 cases are fatal according to #30348
"[T]here's not much benefit in a log that says "hey this error is about to cause the node to stop" -- you already get that information by seeing "Shutdown: in progress..." immediately following." according to #30347 (comment)

Fixes #30348

DrahtBot · 2024-06-28T22:18:21Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30361.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept NACK	maflcko
User requested bot ignore	l0rinc

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

ajtowns · 2024-06-28T23:58:50Z

I think both entries should continue to include examples.

I think removing "severe" from the LogWarning description doesn't make sense -- anything the admin might have to address is a severe (potential) problem, isn't it?

Having LogError mean "the node admin will need to address" vs LogWarning mean "the node admin may need to address" seems to argue towards retaining the "LogError is fatal" interpretation to me -- fatal errors are ones the admin does need address; anything else they can ignore until it becomes fatal.

A potential advantage of having LogError (or LogFatal) reserved for fatal errors is that you could grep through the codebase for LogError to see all the potentially fatal errors, and perhaps work on finding ways to make them less fatal. Not convinced that's very beneficial though.

I don't think there's much other benefit to splitting potential problems up from actual problems otherwise if the admin still needs to attend to them either way, though; so if we're already making LogError just mean actual errors; is there any reason not to merge them both into a single LogAlert() or similar?

ryanofsky · 2024-06-30T15:55:57Z

re: #30361 (comment)

Thanks, added back examples and "severe".

I think probably just looking at AbortNode callers would be as effective as as searching for fatal error messages, if looking for places to make error handling more robust.

I do like the alert level idea and posted an implementation in #30364.

Updated 5175cbc -> b7aae36 (pr/nofat.1 -> pr/nofat.2, compare) with clarifications and examples.

Rebased b7aae36 -> 5e2eb5b (pr/nofat.2 -> pr/nofat.3, compare)

maflcko · 2024-07-16T20:12:54Z

It is not how LogError and Level::Error are used currently. Of 129 current uses only 58 cases are fatal according to #30348

The reason is that many stem from error(), see also #30364 (review).

"[T]here's not much benefit in a log that says "hey this error is about to cause the node to stop" -- you already get that information by seeing "Shutdown: in progress..." immediately following." according to #30347 (comment)

Agree, but then just go with #30364? Seems odd to change something twice, when it can be changed once?

weak NACK for now.

ryanofsky · 2024-08-16T21:58:29Z

re: #30361 (comment)

I think I might not have enough information to understand the weak nack, because I don't actually see a reason in there to reject a documentation fix that is only dropping advice to interpret Error as "fatal" and Warning as "nonfatal".

The advice is not followed by current code, but more importantly, I don't think anybody (even the original author) currently thinks this advice is something we should try to implement anymore. If something in this PR makes the documentation worse, it'd be helpful to know where the problems are. Otherwise this seems like a clean fix for the issue described in #30348, and a way to close that issue and move on to other fixes and improvements.

I do understand what you are saying about the error() removal in #29236 making the issue described in #30348 worse because it increased the amount of Error spam. But it's not like this PR is changing documentation to say that Error spam is ok. It is clearly not. It is just trying to remove a separate point of confusion.

maflcko · 2024-08-27T07:18:20Z

I think I might not have enough information to understand the weak nack

It is the line prior: "Seems odd to change something twice, when it can be changed once? weak NACK for now."

Now that the other pull is closed, I guess this is fine.

maflcko · 2024-08-27T07:23:35Z

lgtm ack b7aae36

Seems fine to change the documentation here to be a description of what the code does right now. Also, it seems fine for this description to be applied going forward, because both an "Error" or "Warning" should be considered an Alert for the node admin to investigate (and possibly dismiss) (with or without the node shutting down).

There are probably cases where an Error can be "downgraded" to a Warning in the code, but they can be done in follow-up changes.

ajtowns · 2024-08-27T09:22:45Z

I don't think anybody (even the original author) currently thinks this advice is something we should try to implement anymore.

Presuming that if it's me who's he-who-must-not-be-named here rather than referring back to #24464 etc, what I think is that if we're going to have distinct log levels for "error" and "warning", I think there should be a clear documented difference between what they are, so that PR authors, reviewers and users can know what to expect. If there's places in our code that don't meet that expectation, then that's something that should be fixed, though it doesn't need to be treated as some huge priority. On the other hand, if there's no distinction between the levels that's worth making, we shouldn't have two levels. So I think either the status quo or #30364 is better than this PR. It's not really worth arguing about, but it's frustrating having people put words in my mouth that aren't mine.

maflcko · 2024-09-18T08:45:14Z

So I think either the status quo or #30364 is better than this PR.

I tend to agree on a second thought. This pull just seems to increase the long-term confusion by treating the levels equivalent or similar, just because some parts of the code happen to confuse them (for various reasons).

Also, as pointed out in #30364 (comment) (and other comments) a good chunk of the log statements that this doc-only pull is trying to "fix" shouldn't be log statements at all.

So overall I am NACK -ish on increasing confusion in the docs, motivated by code that will be removed or changed long-term anyway.

l0rinc · 2024-09-18T12:43:23Z

While I understand why @maflcko doesn't want to change twice, this seems to me like a slight improvement.

utACK b7aae36

l0rinc · 2024-09-18T12:34:38Z

doc/developer-notes.md

+- `LogError(fmt, params...)` should be used in place of `LogInfo` for severe
+  errors the node admin will need to address (e.g., failure to write data).


nit: instead of seems slightly more common than in place of + that to streamline readability + needs to address to imply urgency:

Suggested change

- `LogError(fmt, params...)` should be used in place of `LogInfo` for severe

errors the node admin will need to address (e.g., failure to write data).

- `LogError(fmt, params...)` should be used instead of `LogInfo` for severe errors

that the node admin needs to address (e.g., failure to write data).

l0rinc · 2024-09-18T12:42:40Z

doc/developer-notes.md

+  should address (e.g. system time appears to be wrong, unknown soft fork
+  appears to have activated).

 - `LogTrace(BCLog::CATEGORY, fmt, params...)` should be used in place of


nit: the logger order is weird, why debug -> info -> error -> warning -> trace?
Especially after announcing LogInfo, LogDebug, LogTrace, LogWarning and LogError in the intro.

DrahtBot · 2025-04-01T14:17:06Z

Needs rebase, if still relevant

DrahtBot · 2025-06-30T01:23:12Z

🤔 There hasn't been much activity lately and the CI seems to be failing.

If no one reviewed the current pull request by commit hash, a rebase can be considered. While the CI failure may be a false positive, the CI hasn't been running for some time, so there may be a real issue hiding as well. A rebase triggers the latest CI and makes sure that no silent merge conflicts have snuck in.

Remove documentation that says LogError should be used for "severe problems that require the node (or a subsystem) to shut down entirely" because: - This is not how `LogError` and `Level::Error` are used currently. Of 129 current uses only 58 cases are fatal according to bitcoin#30348 - "[T]here's not much benefit in a log that says "hey this error is about to cause the node to stop" -- you already get that information by seeing "Shutdown: in progress..." immediately following." according to bitcoin#30347 (comment)

DrahtBot added the Docs label Jun 28, 2024

This was referenced Jun 28, 2024

logging: Use LogFatal instead LogError for fatal errors #30347

Closed

RFC: Misused LogError and LogWarning macros #30348

Open

DrahtBot mentioned this pull request Jun 29, 2024

kernel, logging: Pass Logger instances to kernel objects #30342

Draft

ryanofsky force-pushed the pr/nofat branch from 5175cbc to b7aae36 Compare June 30, 2024 15:47

ryanofsky marked this pull request as ready for review June 30, 2024 15:56

DrahtBot mentioned this pull request Jun 30, 2024

logging: Replace LogError and LogWarning with LogAlert #30364

Closed

ajtowns mentioned this pull request Jul 5, 2024

wallet: use LogTrace for walletdb log messages at trace level #30355

Merged

l0rinc approved these changes Sep 18, 2024

View reviewed changes

DrahtBot added the CI failed label Mar 17, 2025

DrahtBot mentioned this pull request Jul 23, 2025

doc: update headers and remove manual TOCs #33040

Merged

DrahtBot added the Needs rebase label Jul 24, 2025

maflcko removed the CI failed label Sep 26, 2025

ryanofsky force-pushed the pr/nofat branch from b7aae36 to 5e2eb5b Compare October 15, 2025 00:41

DrahtBot removed the Needs rebase label Oct 15, 2025

		- `LogError(fmt, params...)` should be used in place of `LogInfo` for severe
		errors the node admin will need to address (e.g., failure to write data).

doc: Drop description of LogError messages as fatal #30361

Are you sure you want to change the base?

doc: Drop description of LogError messages as fatal #30361

Conversation

ryanofsky commented Jun 28, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DrahtBot commented Jun 28, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Coverage & Benchmarks

Reviews

Conflicts

Uh oh!

ajtowns commented Jun 28, 2024

Uh oh!

ryanofsky commented Jun 30, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

maflcko commented Jul 16, 2024

Uh oh!

ryanofsky commented Aug 16, 2024

Uh oh!

maflcko commented Aug 27, 2024

Uh oh!

maflcko commented Aug 27, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ajtowns commented Aug 27, 2024

Uh oh!

maflcko commented Sep 18, 2024

Uh oh!

l0rinc commented Sep 18, 2024

Uh oh!

l0rinc Sep 18, 2024

Choose a reason for hiding this comment

Uh oh!

l0rinc Sep 18, 2024

Choose a reason for hiding this comment

Uh oh!

DrahtBot commented Apr 1, 2025

Uh oh!

DrahtBot commented Jun 30, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

ryanofsky commented Jun 28, 2024 •

edited

Loading

DrahtBot commented Jun 28, 2024 •

edited

Loading

ryanofsky commented Jun 30, 2024 •

edited

Loading

maflcko commented Aug 27, 2024 •

edited

Loading