Skip to content

Conversation

@janaknat
Copy link
Contributor

This is used if the user ends an Aperf run using Ctrl+c or kill .

Ran aperf record and ctrl-c'd:
$ ./target/debug/aperf record -r ctrlcrecord
[2024-07-25T15:37:59Z INFO aperf_lib::record] Starting Data collection...
[2024-07-25T15:37:59Z INFO aperf_lib::record] Preparing data collectors...
[2024-07-25T15:37:59Z WARN aperf_lib::data::perf_stat] Instance does not expose Perf counters
[2024-07-25T15:37:59Z ERROR aperf_lib] Excluding perf_stat from collection. Error msg: No such file or directory (os error 2)
[2024-07-25T15:37:59Z INFO aperf_lib::record] Collecting data...
^C[2024-07-25T15:38:01Z INFO aperf_lib] Caught SIGINT. Exiting...
[2024-07-25T15:38:02Z INFO aperf_lib] Data collected in ctrlcrecord/, archived in ctrlcrecord.tar.gz
[2024-07-25T15:38:02Z INFO aperf_lib::record] Data collection complete.

Ran aperf record and send SIGTERM with kill:
$ ./target/debug/aperf record -r sigtermrecord -p 100
[2024-07-25T15:40:34Z INFO aperf_lib::record] Starting Data collection...
[2024-07-25T15:40:34Z INFO aperf_lib::record] Preparing data collectors...
[2024-07-25T15:40:34Z WARN aperf_lib::data::perf_stat] Instance does not expose Perf counters
[2024-07-25T15:40:34Z ERROR aperf_lib] Excluding perf_stat from collection. Error msg: No such file or directory (os error 2)
[2024-07-25T15:40:35Z INFO aperf_lib::record] Collecting data...
[2024-07-25T15:40:51Z INFO aperf_lib] Caught SIGTERM. Exiting...
[2024-07-25T15:40:51Z INFO aperf_lib] Data collected in sigtermrecord/, archived in sigtermrecord.tar.gz
[2024-07-25T15:40:51Z INFO aperf_lib::record] Data collection complete.

ctrlcrecord.tar.gz
sigtermrecord.tar.gz

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@janaknat janaknat requested a review from a team as a code owner July 25, 2024 15:44
Copy link
Contributor

@wash-amzn wash-amzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to do something about 'perf' if it's running (--profile). It should be sent a signal at the end of collection and wait()ed on to exit.

It's current behavior where we give it an amount of time to run should be a backstop in case aperf crashes. Normally it should be our signal to it that actually stops it.

if poll(&mut poll_fds, PollTimeout::NONE)? <= 0 {
error!("Poll error.");
}
if let Some(ev) = poll_fds[0].revents() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a bit nicer here to use timer_pollfd instea dof poll_fds[0]

And if possible, I'd change the argument to poll() to be the literal [timer_pollfd, signal_pollfd] (if it will let you, I'm not sure it will), to emphasize the unimportance of that array

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. It won't let me do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure there's a way to make this less fragile

Comment on lines 324 to 328
} else {
error!("Caught unknown signal: {}. Exiting...", siginfo.ssi_signo);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We aren't registering an interest in any other signals, so this is impossible/dead code, I think it's fine to just omit it. Or replace it with a panic since it's not at all expected to happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. We'll panic. Makes sense to become ugly if we're given a signal which we didn't register for.

@janaknat
Copy link
Contributor Author

We need to do something about 'perf' if it's running (--profile). It should be sent a signal at the end of collection and wait()ed on to exit.

It's current behavior where we give it an amount of time to run should be a backstop in case aperf crashes. Normally it should be our signal to it that actually stops it.

We could default to always sending a SIGTERM to all the children. If we get a SIGINT, we can instead send that.

This is used if the user ends an Aperf run using Ctrl+c or kill <pid>.
We send the same signal to any child processes launched by aperf.
@janaknat janaknat merged commit 4b910d2 into main Jul 26, 2024
@janaknat janaknat deleted the ctrl-c branch September 11, 2024 16:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants