-
Notifications
You must be signed in to change notification settings - Fork 565
Nice Error Messages on Exit for Cloud Hypervisor and ch-remote #7066
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nice Error Messages on Exit for Cloud Hypervisor and ch-remote #7066
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
I like the new verbose nested error stack message. At first glance, I find it a bit less obvious that there was an actual error than with the older output, though. I would find it more appealing if the message was prefixed: The additional debug output looks pretty redundant, so having it once for the outermost error in a single line should be sufficient from my perspective. Being able to see the actual image path in the error stack is definitely a big plus (but you don't need the debug info for that)! That's my 2ct. I don't have very strong opinions on this. |
This comment was marked as outdated.
This comment was marked as outdated.
|
@phip1611 Cool idea - one issue might be that currently send the "Error C: Error B: Error A" style output in the HTTP messages when an error occurs - can you have a think how this would work for those situations? |
64f09a4 to
478909a
Compare
src/main.rs
Outdated
| eprintln!(" 0: {top_error}"); | ||
| let mut level = 1; | ||
| let mut next_error: &dyn StdError = &top_error; | ||
| while let Some(sub_error) = next_error.source() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Maybe a combination of std::iter::successors and enumerate can streamline this a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know std::iter::successors, cool! Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the record: This lead to the discovery of a Rust compiler bug 😀
Nevertheless, a neat suggestion and now used in the code - thanks!
24771c9 to
0c1defb
Compare
3410083 to
b245038
Compare
@rbradford @likebreath While the prerequisites aren't all merged yet, this still has progressed. The changes (IMHO massively) improve the quality and helpfulness of error messages on exit for:
All of that is reflected in the examples in the PR description. |
b245038 to
519bd9c
Compare
The changes were mostly automatically applied using the Python script mentioned in the first commit of this series. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
The changes were mostly automatically applied using the Python script mentioned in the first commit of this series. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
The changes were mostly automatically applied using the Python script mentioned in the first commit of this series. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
The changes were mostly automatically applied using the Python script mentioned in the first commit of this series. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
The changes were mostly automatically applied using the Python script mentioned in the first commit of this series. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
The changes were mostly automatically applied using the Python script mentioned in the first commit of this series. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
The changes were mostly automatically applied using the Python script mentioned in the first commit of this series. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
The changes were mostly automatically applied using the Python script mentioned in the first commit of this series. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
The changes were mostly automatically applied using the Python script mentioned in the first commit of this series. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
The changes were mostly automatically applied using the Python script mentioned in the first commit of this series. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
With the foundations of each error type implementing std::error::Error,
we can now nicely walk the `.source()` chain and print an error trace.
This commit introduces improved user-facing error printing when:
- Cloud Hypervisor fails with an error
- ch-remote fails (client error)
- ch-remote fails (remote error)
The additional context is a clear improvement in UX for both users and
developers. In the following example, the new behaviour is shown for
a direct invocation of Cloud Hypervisor leading to a failure. This
looks similar for ch-remote.
```
Old Style
`target/release/cloud-hypervisor --api-socket /tmp/chv2.sock --kernel /etc/bootitems/linux/kernel_minimal/stable.bzImage --cmdline console=ttyS0 --serial tty --console off --disk path=img.raw --initramfs /etc/bootitems/linux/initrd_minimal/default`
Error booting VM: VmBoot(LockingError(BlockError(LockDiskImage(AlreadyLocked)))
```
```
`target/release/cloud-hypervisor --api-socket /tmp/chv2.sock --kernel /etc/bootitems/linux/kernel_minimal/stable.bzImage --cmdline console=ttyS0 --serial tty --console off --disk path=img.raw --initramfs
/etc/bootitems/linux/initrd_minimal/default`
Error: Cloud Hypervisor exited with the following chain of errors:
0: Error booting VM
1: The VM could not boot
2: Error locking disk images: Another instance likely holds a lock
3: Cannot lock images of all block devices
4: Failed to get Write lock for disk image: ./img.raw
5: The file is already locked
Debug Info: VmBoot(VmBoot(LockingError(DiskLockError(LockDiskImage { error: AlreadyLocked, lock_type: Write, path: "./raw_disk.bin" })))
```
Signed-off-by: Philipp Schuster <[email protected]>
On-behalf-of: SAP [email protected]
While working on this, I found a subtle but severe compiler bug [0]. To fight the bug with explicitness rather than implicitness (to prevent weird stuff in the future), this change is beneficial. The bug is at least in Rust stable 1.34..1.87. [0]: rust-lang/rust#141673 Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
Remote server errors are transferred as raw HTTP body. This way,
we lose the nested structured error information.
This is an attempt to retrieve the errors from the HTTP response
and to align the output with the normal error output.
For example, this produces the following chain of errors. Note
that everything after level 0 was retrieved from the HTTP server
response:
```
Error: ch-remote exited with the following chain of errors:
0: http client error
1: Server responded with InternalServerError
2: Error from API
3: The disk could not be added to the VM
4: Failed to validate config
5: Identifier disk1 is not unique
Debug Info: HttpApiClient(ServerResponse(InternalServerError, Some("Error from API<br>The disk could not be added to the VM<br>Failed to validate config<br>Identifier disk1 is not unique")))
```
In case the JSON can't be parsed properly, ch-remote will print:
```
Error: ch-remote exited with the following chain of errors:
0: http client error
X: Can't get remote's error messages from JSON response: EOF while parsing a value at line 1 column 0: body=''
Debug Info: HttpApiClient(ServerResponse(InternalServerError, Some("")))
```
Signed-off-by: Philipp Schuster <[email protected]>
On-behalf-of: SAP [email protected]
cfdac58 to
234ca64
Compare
|
Very cool that we got this merged. A major UX improvement for users and developers. One remark tho: The description of the PR and the commit message still contain the old style with |
This partially reverts ed8f347 from cloud-hypervisor#7183 and 6277d7d from cloud-hypervisor#7201. # Output how it was merged for v47 (cloud-hypervisor#7066) ``` Error: Cloud Hypervisor exited with the following chain of errors: 0: Error booting VM 1: The VM could not boot 2: Error manipulating firmware file 3: No such file or directory (os error 2) Debug Info: VmBoot(VmBoot(FirmwareFile(Os { code: 2, kind: NotFound, message: "No such file or directory" }))) ``` # Output after cloud-hypervisor#7183 and cloud-hypervisor#7201 ``` cloud-hypervisor: 31.385730ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:27 -- Error: Cloud Hypervisor exited with the following chain of errors: cloud-hypervisor: 31.417961ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 0: Error booting VM cloud-hypervisor: 31.448078ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 1: The VM could not boot cloud-hypervisor: 31.486711ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 2: Error manipulating firmware file cloud-hypervisor: 31.513331ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 3: No such file or directory (os error 2) cloud-hypervisor: 31.548037ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:44 -- cloud-hypervisor: 31.568045ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:45 -- Debug Info: VmBoot(VmBoot(FirmwareFile(Os { code: 2, kind: NotFound, message: "No such file or directory" }))) ``` The "proper logger" has indeed the advantage that messages can be gracefully redirected to log files etc. However, this makes the error message hardly readable. Therefore, I propose to use error!() only for runtime errors that are non-critical, i.e., messages whose error cause will not lead to a VMM shutdown. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
This partially reverts ed8f347 from cloud-hypervisor#7183 and 6277d7d from cloud-hypervisor#7201. # Output how it was merged for v47 (cloud-hypervisor#7066) ``` Error: Cloud Hypervisor exited with the following chain of errors: 0: Error booting VM 1: The VM could not boot 2: Error manipulating firmware file 3: No such file or directory (os error 2) Debug Info: VmBoot(VmBoot(FirmwareFile(Os { code: 2, kind: NotFound, message: "No such file or directory" }))) ``` # Output after cloud-hypervisor#7183 and cloud-hypervisor#7201 ``` cloud-hypervisor: 31.385730ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:27 -- Error: Cloud Hypervisor exited with the following chain of errors: cloud-hypervisor: 31.417961ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 0: Error booting VM cloud-hypervisor: 31.448078ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 1: The VM could not boot cloud-hypervisor: 31.486711ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 2: Error manipulating firmware file cloud-hypervisor: 31.513331ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 3: No such file or directory (os error 2) cloud-hypervisor: 31.548037ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:44 -- cloud-hypervisor: 31.568045ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:45 -- Debug Info: VmBoot(VmBoot(FirmwareFile(Os { code: 2, kind: NotFound, message: "No such file or directory" }))) ``` The "proper logger" has indeed the advantage that messages can be gracefully redirected to log files etc. However, this makes the error message hardly readable. Therefore, I propose to use error!() only for runtime errors messages but not a pretty-printed version of those. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
This partially reverts ed8f347 from cloud-hypervisor#7183 and 6277d7d from cloud-hypervisor#7201. # Output how it was merged for v47 (cloud-hypervisor#7066) ``` Error: Cloud Hypervisor exited with the following chain of errors: 0: Error booting VM 1: The VM could not boot 2: Error manipulating firmware file 3: No such file or directory (os error 2) Debug Info: VmBoot(VmBoot(FirmwareFile(Os { code: 2, kind: NotFound, message: "No such file or directory" }))) ``` # Output after cloud-hypervisor#7183 and cloud-hypervisor#7201 ``` cloud-hypervisor: 31.385730ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:27 -- Error: Cloud Hypervisor exited with the following chain of errors: cloud-hypervisor: 31.417961ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 0: Error booting VM cloud-hypervisor: 31.448078ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 1: The VM could not boot cloud-hypervisor: 31.486711ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 2: Error manipulating firmware file cloud-hypervisor: 31.513331ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 3: No such file or directory (os error 2) cloud-hypervisor: 31.548037ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:44 -- cloud-hypervisor: 31.568045ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:45 -- Debug Info: VmBoot(VmBoot(FirmwareFile(Os { code: 2, kind: NotFound, message: "No such file or directory" }))) ``` The "proper logger" has indeed the advantage that messages can be gracefully redirected to log files etc. However, this makes the error message hardly readable. Therefore, I propose to use error!() only for runtime errors messages but not a pretty-printed version of those. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
This partially reverts ed8f347 from #7183 and 6277d7d from #7201. # Output how it was merged for v47 (#7066) ``` Error: Cloud Hypervisor exited with the following chain of errors: 0: Error booting VM 1: The VM could not boot 2: Error manipulating firmware file 3: No such file or directory (os error 2) Debug Info: VmBoot(VmBoot(FirmwareFile(Os { code: 2, kind: NotFound, message: "No such file or directory" }))) ``` # Output after #7183 and #7201 ``` cloud-hypervisor: 31.385730ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:27 -- Error: Cloud Hypervisor exited with the following chain of errors: cloud-hypervisor: 31.417961ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 0: Error booting VM cloud-hypervisor: 31.448078ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 1: The VM could not boot cloud-hypervisor: 31.486711ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 2: Error manipulating firmware file cloud-hypervisor: 31.513331ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:39 -- 3: No such file or directory (os error 2) cloud-hypervisor: 31.548037ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:44 -- cloud-hypervisor: 31.568045ms: <main> ERROR:/home/pschuster/dev/cloud-hypervisor/src/lib.rs:45 -- Debug Info: VmBoot(VmBoot(FirmwareFile(Os { code: 2, kind: NotFound, message: "No such file or directory" }))) ``` The "proper logger" has indeed the advantage that messages can be gracefully redirected to log files etc. However, this makes the error message hardly readable. Therefore, I propose to use error!() only for runtime errors messages but not a pretty-printed version of those. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
Building on the foundations introduced with #7061, #7069, #7085, #7089, and some more, we can now print nice error messages when Cloud Hypervisor or ch-remote exit with an error.
This is a major step forward in UX for both users and developers. It will be clearer where an error originates from and still all debugging information will be preserved.
Changes
Hints for Reviewers
Please review this commit by commit. The interesting commits are easily detectable as they don't follow the series naming scheme.
Steps to Undraft and to Merge
Python Script to change
Display::fmt()of all ErrorsThe python script is embedded into the first commit message of the series changing the behavior.
Old Style (Cloud Hypervisor Invocation Fails)
New Style (Cloud Hypervisor Invocation Fails)
! NOTE THAT THE FOLLOWING EXAMPLES MIGHT BE OUTDATED AS THE PR PROGRESSES !
New Style (ch-remote Invocation Fails)
New Style (ch-remote shows API failure on from remote)