Skip to content

Conversation

@shushanhf
Copy link
Contributor

fix the building error for "PalInterlockedCompareExchange128".

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Sep 29, 2024
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

@jkotas
Copy link
Member

jkotas commented Sep 29, 2024

Is __sync_val_compare_and_swap on LoongArch64 implemented as an atomic operation?

@shushanhf
Copy link
Contributor Author

shushanhf commented Sep 29, 2024

Is __sync_val_compare_and_swap on LoongArch64 implemented as an atomic operation?

The cas-128 instruction is only supported from 3A6000, which is added within ISA1.1.
For 3A5000 which is ISA1.0, __sync_val_compare_and_swap will call the libatomic.

@jkotas
Copy link
Member

jkotas commented Sep 29, 2024

It means that this change will fix the build break, but there is a correctness problem at runtime on systems before 3A6000. The one place where this method is used requires the updates to be atomic.

@shushanhf
Copy link
Contributor Author

It means that this change will fix the build break,

Yes, this is very clear.

but there is a correctness problem at runtime on systems before 3A6000. The one place where this method is used requires the updates to be atomic.

If calling the libatomic which implements this by mutex_lock is not correct, I think there might be a correctness problem.

@jkotas
Copy link
Member

jkotas commented Sep 29, 2024

If calling the libatomic which implements this by mutex_lock is not correct, I think there might be a correctness problem.

Add a TODO comment and open an issue about this?

@shushanhf
Copy link
Contributor Author

If calling the libatomic which implements this by mutex_lock is not correct, I think there might be a correctness problem.

Add a TODO comment and open an issue about this?

OK, I will add a TODO and an issue later.
I have a question, why the libatomic using mutex_lock to emulate the cas-128bit instruction is not correct here.
I think using the libatomic is just lower performance.

@jkotas
Copy link
Member

jkotas commented Sep 29, 2024

why the libatomic using mutex_lock to emulate the cas-128bit instruction is not correct here.

The one and only place that calls PalInterlockedCompareExchange128 requires the update to be atomic:

// Helper function for updating two adjacent pointers (which are aligned on a double pointer-sized boundary)
// atomically.
. If the 128-bit update is not atomic, it will result into intermittent crashes.

It should be possible to create a variant of the calling code for platforms that are missing atomic 128-bit update.

@am11
Copy link
Member

am11 commented Sep 29, 2024

but there is a correctness problem at runtime on systems before 3A6000.

Since LA64's NativeAOT support is new (still yet to pass all tests), maybe we can make 3A6000 as LA64 baseline for AOT?

@jkotas
Copy link
Member

jkotas commented Sep 29, 2024

Our default strategy is to have the same min supported hardware and OS versions for both regular CoreCLR and NAOT. You can certainly do something else for Loongarch, but you will have to explain these differences to Loongarch users.

@shushanhf
Copy link
Contributor Author

Our default strategy is to have the same min supported hardware and OS versions for both regular CoreCLR and NAOT. You can certainly do something else for Loongarch, but you will have to explain these differences to Loongarch users.

Thanks.

I will add some TODO comments to explain, and try to add some compile arguments which only support the ISA1.1.

@shushanhf
Copy link
Contributor Author

why the libatomic using mutex_lock to emulate the cas-128bit instruction is not correct here.

The one and only place that calls PalInterlockedCompareExchange128 requires the update to be atomic:

// Helper function for updating two adjacent pointers (which are aligned on a double pointer-sized boundary)
// atomically.

. If the 128-bit update is not atomic, it will result into intermittent crashes.
It should be possible to create a variant of the calling code for platforms that are missing atomic 128-bit update.

Hi, @jkotas
I have some question about the CAS-128bit when updating and loading the data.

cmp rax, [r11 + CurrentOffset]
jne 0f
jmp [r11 + CurrentOffset + 8]

Are these codes related with the CAS-128-bit data?
If yes, is there some problem if the 128-bit data are loaded by two instructions rather than a 128-bit load instruction? When finished the first 64-bit pointer loading, the CAS-128-bit update the whole 128-bit date, then the second loading will get a new value but now the old value.

@jkotas
Copy link
Member

jkotas commented Oct 25, 2024

Yes, this code is related to the 128-bit CAS update. If the code sees a new value of the first pointer, it needs to see the new value of the second pointer as well in subsequent read to maintain correctness.

@shushanhf
Copy link
Contributor Author

Yes, this code is related to the 128-bit CAS update. If the code sees a new value of the first pointer, it needs to see the new value of the second pointer as well in subsequent read to maintain correctness.

But now these code will hit the wrong value at the very low possibility.

@jkotas
Copy link
Member

jkotas commented Oct 28, 2024

Right, the changes in this PR do not maintain this invariant that will lead to intermittent crashes.

@shushanhf
Copy link
Contributor Author

Right, the changes in this PR do not maintain this invariant that will lead to intermittent crashes.

I agree with you.
I just list this potential error.
It's best to discuss the 128-bit loading by a new issue or PR.

@jkotas
Copy link
Member

jkotas commented Jan 8, 2025

Fix included in #111086

@jkotas jkotas closed this Jan 8, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Feb 8, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

arch-loongarch64 area-NativeAOT-coreclr community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants