Skip to content

fix: subtract 1 from call chain addresses other than the first one#43

Merged
milianw merged 1 commit intoKDAB:hotspotfrom
pcc:fix4
Jul 3, 2025
Merged

fix: subtract 1 from call chain addresses other than the first one#43
milianw merged 1 commit intoKDAB:hotspotfrom
pcc:fix4

Conversation

@pcc
Copy link
Contributor

@pcc pcc commented Jun 2, 2025

For stack frames other than the first one, ip is a return address, so it points to the instruction after the call instruction. By using the return address as is, we were incorrectly attributing these call chain samples to the instruction following the call. Therefore, subtract 1 from the return address so that it gets correctly attributed to the call instruction.

For stack frames other than the first one, ip is a return address, so it
points to the instruction after the call instruction. By using the return
address as is, we were incorrectly attributing these call chain samples to
the instruction following the call. Therefore, subtract 1 from the return
address so that it gets correctly attributed to the call instruction.
@pcc
Copy link
Contributor Author

pcc commented Jun 2, 2025

Reproducer:

$ cat 1.c
int x;

void f() {
  x++;
}

__attribute__((weak))
void force_frame_pointer() {}

__attribute__((noinline))
void infloop() {
#ifdef FP
  force_frame_pointer();
#endif
  // Use asm to prevent the compiler from optimizing away the rest of main.
#ifdef __aarch64__
  __asm__ __volatile__("b .");
#elif defined(__x86_64__)
  __asm__ __volatile__("jmp .");
#endif
}

int main() {
  infloop();
  f();
}
$ clang -fuse-ld=mold 1.c -gdwarf-4 -DFP -fno-omit-frame-pointer -O
$ perf record -g ./a.out
$ hotspot perf.data

Without this change: f appears in flame graph even though it is never executed.

With this change: f does not appear.

@GitMensch
Copy link
Contributor

GitMensch commented Jun 5, 2025

Interesting. I do wonder if/how it would adjust the flame graphs in general and if this is only clang/mold related...

I think that sample could be added to the tests as well, no?

@milianw
Copy link
Contributor

milianw commented Jul 3, 2025

Reproducer:
...
Without this change: f appears in flame graph even though it is never executed.

With this change: f does not appear.

Ideally this could get turned into an auto test, i.e. compile this reproducer into a static binary then record it with perf and then write a test that tries to analyze the now-portable results. we have some test code that stringifies the results and that should nicely show the before/after too

@milianw
Copy link
Contributor

milianw commented Jul 3, 2025

that said, I'm fine with merging this as-is and trusting your judgement on this front, so thanks a lot for your contribution! considering my lack of time, I don't want to block you on writing the tests, but it would be great to have. for now I'll create a task out of this.

@milianw milianw merged commit 9fee213 into KDAB:hotspot Jul 3, 2025
@milianw
Copy link
Contributor

milianw commented Jul 3, 2025

PS: the test would live in hotspot btw

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants