Conversation
Remove atomic operations from the lhash statistics gathering, an inconsistent output was more than possible. Instead, use a lock. The critical sections are deliberately as small as possible. The rationale for the NULL checks is to allow a future change to where the statistics are optional.
|
I'm not convinced that the second commit is worthwhile. It attempts to avoid having I've no issue with ignoring the second and just including the first. |
richsalz
left a comment
There was a problem hiding this comment.
minor bit of cleverness to remove :)
crypto/lhash/lhash.c
Outdated
There was a problem hiding this comment.
unh maybe put those in a "if miss" statement?
There was a problem hiding this comment.
:(
An attempt to provide constant overhead.
I'll switch to a conditional statement instead.
|
One can wonder if this is adequate trade-off. I mean printing stats is essentially debugging thing, i.e. very few would actually do it, and question is why does caller have to be unconditionally punished with overhead. Well, it is punished even now, but it's presumably acceptable on contemporary systems, i.e. those with atomic_add. Switching to full-blown unconditional locking even on contemporary systems feels excessive. Yes, two mentions of "unconditional" means that best scenario would probably be if application had an option to not collect stats :-) One can also wonder if debugging aid has to be that accurate considering overhead costs. After all, OPENSSL_LH_stats_bio prints whole structure but only part of it is actually lock-protected. Which kind of asks for question how come just those? I mean consider OPENSSL_LH_insert. It calls getrn, which cares about atomicity of just some fields but not others or traversing list[!], while OPENSSL_LH_insert is not concerned at all about synchronization. What am I missing? |
|
Atomic |
Ah! It does expect caller to synchronize, but retrieve requires read lock, which makes it inappropriate to update statistics... [edit: well, not "retrieve requires read lock", but rather "read lock is sufficient and should be preferred for retrieve"] |
|
Yeah, the atomic operations in some instances but not in others confused me until I realised that writes to the The way I've done this allows the lock to be NULL and then there is zero overhead. My ideal would be for a As for cost, it is one lock vs potentially many atomic operations. It is impossible to tell which is more expensive abstractly. |
The whole purpose for requiring read lock is performance. Which will be totally defeated by write locking stats fields... For reference, a single data point, on my computer tight loop with increment of volatile variable is ~6 cycles per increment, atomic_add - ~20, lock-increment-unlock - ~110. |
But I can't see that it's mentioned in documentation... Am I missing something again? |
I don't follow... |
|
When accessing the table for write, there are a minimum of three atomic operations and possibly many more on collision. Read has minimum of four and possibly many more. The speed difference isn't looking great. I meant that the caller had to ensure that there was only ever a single thread attempting to write to the lhash (and that nothing was trying to read it at this time). There is no such restriction on reading the lhash. You are correct, there isn't any documentation. I also noticed that the POD for stack includes the internal functions as well as the type safe wrappers but the one for lhash only has the type safe wrappers declared. |
…lock is only ever grabbed once for any operation.
|
An alternative would be to document that the lhash isn't thread safe (which it isn't) and to remove the locks and the atomic operations entirely. |
|
Atomic operations initially put in by #3525 |
I was about to suggest that too! Though I'd say that since it's caller's responsibility to synchronize access to the structure, it would be appropriate to state that if caller wants reliable stats, then it should use exclusive write locks even with retrieve operations. And if reliable stats are not required, then shared read lock with retrieve operation would do. |
One should recognize that tight loop is not necessarily representative case. In real life, in presence of other meaningful instructions, execution time will so to say "dissolve", or get "amortized" if you wish. However, one can be pretty sure that "amortization" would be minimal in last case. Or in other words lock-increment-unlock would still be times slower than any other option. |
|
Should I still remove the locking in light of #4427 where they become optional and not used by default? |
That's what I for one would be in favour of. I.e. remove the locking and atomic increments [and put synchronization responsibility on caller], even for reliable stats. In this case #4427 is rendered inapplicable. There might be other opinions... One can still wonder following question. Would it be appropriate to bring locking into this module? I mean instead of relying on caller to do the locking, one can do locking in stack subroutines. This is probably inappropriate to do in minor release, but it doesn't mean that we can't discuss future plans. Argument against doing this might be that common usage pattern is such that caller is more than likely locking own structure that refers stack, so that second lock would be excessive... |
|
Adding locking isn't gong to be API breaking. There are lock free extensible hash tables in the literature. If the common operations can be made to only use a single atomic compare-and-set sequence, one might be worthwhile. |
|
This PR is unnecessary in light of #4429 |
|
I've taken the mutually agreeable path of not trying to lock in lhash. |
indicate the level of locking required for various operations. Remove the lock and atomics from the lhash code. These we're not complete or adequate. Refer to openssl#4418 and openssl#4427 for details.
indicate the level of locking required for various operations. Remove the lock and atomics from the lhash code. These we're not complete or adequate. Refer to #4418 and #4427 for details. Reviewed-by: Rich Salz <[email protected]> Reviewed-by: Ben Kaduk <[email protected]> Reviewed-by: Andy Polyakov <[email protected]> (Merged from #4429)
Remove atomic operations from the lhash statistics gathering, an inconsistent
output was more than possible.
Instead, use a lock. The critical sections are deliberately as small as
possible.
The rationale for the NULL checks is to allow a future change to where the
statistics are optional.