Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Support for CFI unwind info#2868

Merged
kyulee1 merged 1 commit intodotnet:masterfrom
kyulee1:cfi
Jan 28, 2016
Merged

Support for CFI unwind info#2868
kyulee1 merged 1 commit intodotnet:masterfrom
kyulee1:cfi

Conversation

@kyulee1
Copy link

@kyulee1 kyulee1 commented Jan 26, 2016

For Unix targeting CoreRT, this will provide platform specific unwind
info which is a dwarf format.
Unlike window’s UNWIND_INFO, we won’t encode the format within RyuJit
which is not only complex/error-prone but also will be slightly
different depending on platforms.
Instead, CFI pseudo instructions are encoded, which will be passed to
CoreRT/ObjectWriter whi will translate them to directly emit CFI
instructions in LLVM MC to establish frames.
Then LLVM MC will handle the details of laying out the eh_frame
sections.
One a JitFlag is used to dynamically dispatch either Windows’s unwind
blob or this CFI instruction table. No JIT/EE interface change is needed
since we’ve already passing an opaque blob.
Initially when I looked at what Clang does, the prologue and the
sequence of CFI emissions are a bit different than Windows.
Since we will emit the same sequence of code with the same runtime that
we define, I assume this is unnecessary – I’ve verified the CFI sequence
here can work correctly with libunwind.
Basically we need only 3 operations – push reg is a combination of 1 and
2 below.

  1. Allocation – increase stack frame. Normally subtract esp in x64.
  2. Store – Copy a (callee save) register to stack (memory)
  3. SaveFP – Set frame pointer register
    Since Windows operation is based on the relative value (all offsets,
    etc are computed from the current point), I also use the similar form of
    CFI instructions .
    So, mostly one window’s code corresponds to two CFIs – we might optimize
    this by aggregating allocation, but I’d like to keep the current
    syntax/semantic same as Windows.
    This results in a very simple transformation in par with Windows.

src/inc/corjit.h Outdated
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks to Pat who will expand more Jit flags, I use this unused flag, instead.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why doesn't this just replace the CORJIT_FLG_UNUSED7 definition above, such that the 0x00004000 values are adjacent?

@kyulee1
Copy link
Author

kyulee1 commented Jan 26, 2016

@dotnet/jit-contrib PTAL. The corresponding encoder for CFI is in LLILC repo -- dotnet/llilc#1011

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these XMM encodings correct?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I validated this CFI assembly directive. Dwarf XMM0 starts with 17 instead of REG_XMM0 (16). Note actually since all xmm are volatile in Unix ABI, we don't save them at all so this part is actually not used, but leave it for consistency.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: why not just use a static table of mappings? If that's not possible, then I think I'd prefer that each case simply returned rather than mutating a local and leaving the return to the end. Something like:

assert(reg >= 0 && reg <= REG_XMM15);

switch (reg)
{
case REG_RAX: return 0;
case REG_RCX: return 2;
case REG_RDX: return 1;
case REG_RBX: return 3;
case REG_RSP: return 7;
case REG_RBP: return 6;
case REG_RSI: return 4;
case REG_RDI: return 5;
default:
    return reg < REG_XMM0 ? reg : reg + 1; // shift by 1 due to dwarf_reg_ip
}

@BruceForstall
Copy link

I don't like how this looks. You're creating both CFI and Windows unwind info all the time, but only using one. Why don't you have a completely separate implementation of the unwind APIs which compiled for CFI generation? That is, one unwindAllocStack for Windows, one for CFI. Do you need to choose between these dynamically? If so, then have the AMD64 unwindAllocStack() dynamically choose between unwindAllocStackWindows() and unwindAllocStackUnix() (for example). Sounds like a good use of function pointers or virtual functions.

src/inc/clrnt.h Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better for this structure to live in separate file, maybe that is local to the JIT.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Probably instead of copying OpType directly from LLVM, I may introduce my own enum just for 3 cases for now. This way we can avoid license concern as well as don't need to worry about syncing this definition when LLVM source is changed.

@kyulee1
Copy link
Author

kyulee1 commented Jan 27, 2016

@BruceForstall Both CFI and Windows unwind code is not completely orthogonal. Windows unwind code/emission sequence is used to emit epilogue, etc. So, I'd like to keep Window's unwind code process intact. Here CFI emission is not complete blob in the sense it's not usual until it is laid out by ObjectWriter/LLVM. So, think this CFI emission is a code-generation mode (under CoreRT) which tides to Window's code sequence,
Splitting unwind infö based on platform, e.g, WIndows/Unix is also confusing since we always use Windows unwind info on CoreCLR/Unix.
In short,

  1. default mode (regular Jit mode on CLR), we process/publish Window Unwind Info.
  2. under this flag (e.g, CoreRT/Unix), we process Window Unwind info (for consistent code-gen) as well as CFI unwind info, but publish the latter only.

I am also not a big fan of virtual function stuff, and I don't see much value on this simple case.

@pgavlin
Copy link

pgavlin commented Jan 27, 2016

/cc @cmckinsey @russellhadley

we process Window Unwind info (for consistent code-gen)

I'm not sure what you mean by this. It looks like consistent code generation should be ensured by the unwind info interface (i.e. the various Compiler::unwind* functions that you've changed). I also think that we should only be generating one style of unwind info or the other.

Do you need to choose between these dynamically?
...
I am also not a big fan of virtual function stuff, and I don't see much value on this simple case

I'm not sure that the former question has been answered yet. This really boils down to "do we want to be able to build RyuJIT s.t. a single build may generate code for multiple target platforms". AFAIK this is not possible today, and I'd be hesitant to make significant moves in that direction without strong justification (e.g. easier cross compilation). Just to be clear, I am not advocating that we further bifurcate the JIT/EE interface using #ifdefs (which we unfortunately have already done): I'm just suggesting that we think hard before doing the opposite to the JIT implementation and measure afterwards.

In terms of the overall direction: I thought that we were intending to introduce our own CFI codes that were compatible with what is expressible in DWARF and Windows unwind information s.t. we could have RyuJIT emit format-agnostic codes. Is this still the plan? If so, did it just so happen that DWARF CFI is expressive enough to cover this design?

@BruceForstall
Copy link

I talked to @kyulee1 in person; he can elaborate what he'll do (or, just make the changes and update the PR).

@pgavlin: I'm not in the loop for CoreRT, but I thought we had no plans to change how we generate Windows unwind codes. non-Windows codes could be pseudo-codes that work broadly on non-Windows platforms. Also, it turns out for non-Windows, CoreCLR (but not CoreRT), we do need to generate Windows unwind codes. And a desire for single RyuJIT binary that works for CoreCLR or CoreRT per-platform then requires dynamically generating one form or the other.

@pgavlin
Copy link

pgavlin commented Jan 27, 2016

I thought we had no plans to change how we generate Windows unwind codes

That's certainly correct when targeting CoreCLR et. al. I think this is still up in the air for CoreRT, however.

Also, it turns out for non-Windows, CoreCLR (but not CoreRT), we do need to generate Windows unwind codes

Yes, that's correct.

And a desire for single RyuJIT binary that works for CoreCLR or CoreRT per-platform

That's what I was missing! Thanks.

@kyulee1
Copy link
Author

kyulee1 commented Jan 28, 2016

@BruceForstall As discussed, I updated PR.

  • cfi.h is introduced to contain cfi data structure which will be consumed by object writer as well.
  • cfiCodes are now dynamically allocated using a vector.
  • ifdefed-out to split Windows and CFI code path. Note UNIX build implies both paths which is dynamically selected by a Jit flag.

There are some disruptiveness and code duplication due to splitting functions (for Windows/CFI), but I think they are reasonably small.
I think ideally we want a unwinder object that is attached to compiler object so that we can easily specialize unwinder logic per targets/platforms. But this refactoring seems beyond the scope of this PR, which we can come back later to clean this up.

@kyulee1 kyulee1 force-pushed the cfi branch 3 times, most recently from b3c60d9 to acc3f57 Compare January 28, 2016 15:52
src/inc/cfi.h Outdated

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stephen Toub changed all the header file license agreements yesterday, so you should update this to match the new standard.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BruceForstall
Copy link

I like the change. Modulo the few comments, it looks good to me.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any estimate on how many CFI codes we'll typically need for a function? If so, can we give this vector a reasonable initial capacity?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the minimal bar. Many small functions need very small CFIs like 1 ~2. But complex functions with lots of eh could need many. @BruceForstall Can you suggest what the reasonable size for this vector?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. Assuming it is 1-to-1 CFI-to-Windows unwind codes, you could probably do a pretty simple textual analysis of the generated ngen-based asm diffs files to figure this out.

@pgavlin
Copy link

pgavlin commented Jan 28, 2016

LGTM modulo the comments above.

For Unix targeting CoreRT, this will provide platform specific unwind
info which is a dwarf format.
Unlike window’s UNWIND_INFO, we won’t encode the format within RyuJit
since it is not only complex/error-prone but also needs to be adjusted
depending on platforms.
Instead, CFI (call frame information) pseudo instructions are encoded,
which will be passed to CoreRT/ObjectWriter which will translate
them to directly emit CFI directives to LLVM MC stream to establish
frames and layout eh_frame section.
A jit flag is used to dynamically dispatch either Windows’s unwind
blob or this CFI instruction table. No JIT/EE interface change is needed
since the API already expects an opaque blob.
Initially when I looked at what Clang does, the prologue and the
sequence of CFI emissions are a bit different than Windows.
Since we will emit the same sequence of code with the same runtime that
we define, I assume this is unnecessary – I’ve verified the CFI sequence
here can work correctly with libunwind.
Basically we need only 3 operations – push reg is a combination of 1 and
2 below.
 1. Allocation – increase stack frame. Normally subtract esp in x64.
 2. Store – Copy a (callee save) register to stack (memory)
 3. SaveFP – Set frame pointer register

Since Windows operation is based on the relative value (all offsets,
etc are computed from the current point), I also use the similar form of
CFI instructions.
So, mostly one window’s code corresponds to two CFIs – we might optimize
this by aggregating allocation, but I’d like to keep the current
syntax/semantic same as Windows.
This results in a very simple transformation on par with Windows.
@kyulee1 kyulee1 force-pushed the cfi branch 2 times, most recently from acc3f57 to b5f2fda Compare January 28, 2016 21:10
@kyulee1
Copy link
Author

kyulee1 commented Jan 28, 2016

@BruceForstall @pgavlin Updated the change per feedbacks.

  • Change banner
  • Change pointer location and struct definition
  • Use full expansion of cases for dwarf reg conversion
  • Simplify code path by sharing local variables
  • Initial vector size is 5 based on investigation on a few examples
  • Didn't hoist the common part of code on purpose in order to split the code path.

@pgavlin
Copy link

pgavlin commented Jan 28, 2016

LGTM. Thanks for doing this!

@BruceForstall
Copy link

LGTM

@kyulee1
Copy link
Author

kyulee1 commented Jan 28, 2016

Failure seems unrelated.

kyulee1 added a commit that referenced this pull request Jan 28, 2016
Support for CFI unwind info
@kyulee1 kyulee1 merged commit 18c6d19 into dotnet:master Jan 28, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants