Skip finalization for PoolThreadCache instances without small/normal caches#13408
Skip finalization for PoolThreadCache instances without small/normal caches#13408
Conversation
…caches Motivation: PoolThreadCache doesn't need to be finalized if it doesn't contain small/normal caches Modifications: Create a base class for PoolThreadCache that won't be registered for finalization Result: Less GC activity and reduced footprint for short/medium living threads
|
This is the first step in the direction to make Netty more "loom-friendly" too: now that we can generate non-finalizable lightweight instances, we could find a way to associate them to a virtual thread without using any thread local (that is NOT a good idea to use with short living millions of virtual threads :"( ) |
|
@chrisvest PTAL. |
### _Why are the changes needed?_ Upgrade Netty to the latest Arrow-compatible version: 4.1.93.Final Currently, we can not upgrade to 4.1.94.Final or above because of netty/netty#13408 ``` java.lang.NoSuchMethodError: 'io.netty.buffer.PoolThreadCache io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.threadCache()' at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.newDirectBufferL(PooledByteBufAllocatorL.java:164) at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.directBuffer(PooledByteBufAllocatorL.java:214) at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:58) at org.apache.arrow.memory.NettyAllocationManager.<init>(NettyAllocationManager.java:77) at org.apache.arrow.memory.NettyAllocationManager.<init>(NettyAllocationManager.java:84) at org.apache.arrow.memory.NettyAllocationManager$1.create(NettyAllocationManager.java:34) at org.apache.arrow.memory.BaseAllocator.newAllocationManager(BaseAllocator.java:354) at org.apache.arrow.memory.BaseAllocator.newAllocationManager(BaseAllocator.java:349) at org.apache.arrow.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:337) at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:315) at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:279) at org.apache.arrow.vector.BaseVariableWidthVector.allocateBytes(BaseVariableWidthVector.java:462) at org.apache.arrow.vector.BaseVariableWidthVector.allocateNew(BaseVariableWidthVector.java:420) at org.apache.arrow.vector.BaseVariableWidthVector.allocateNew(BaseVariableWidthVector.java:380) at org.apache.spark.sql.execution.arrow.ArrowWriter$.$anonfun$create$1(ArrowWriter.scala:42) ``` ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request Closes #4992 from pan3793/netty. Closes #4992 9dd8f9a [Cheng Pan] nit e695314 [Cheng Pan] Bump Netty 4.1.93.Final Authored-by: Cheng Pan <[email protected]> Signed-off-by: Cheng Pan <[email protected]>
|
@chrisvest @franz1981 Ping, just in case you're not aware of the performance regression introduced by this commit. |
|
@slandelle thanks a lot... @ejona86 I still don't get why this would cause such a problem |
|
@normanmaurer given that this one was an "addition" but not required (for loom I will poet the allocator from Chris), we can revert and Investigate (without hurry). I don't want anyone to regress for something which is not a bug fix |
|
@franz1981 Could the regression be type pollution? I don't see anything suspicious in this other than two types visiting code paths where there previously was only one. |
|
Good point @chrisvest yep, it could be a funny (and scary, as usual :/ ) possibility... At the same time the 2 concrete types doesn't implement any interface...mmm |
|
@franz1981 It's not necessarily related to |
|
Instead of splitting out |
|
@chrisvest you mean profile type pollution then, yep. But with concrete types (not interfaces) bi-morphism should still allow inlining and save megamorphism. 5a7197f#diff-c24e989820523c812b36dea1eb81cbbde7822d060eac7bf136fa96be00956e3fR521 Which wasn't present before and it should be trivial to check if reverting just the condition fix it. And still I like your idea of an inner reference which can be finalized...so if you feel to do it, go for it +100 |
…/normal caches (netty#13408)" This reverts commit 9412424
Motivation: PoolThreadCache objects are created even when they are not meant to pool anything. In such a case, there is no point in giving them a finalizer() method. Modification: The finalizer method is moved to a separate object, which is conditionally referenced from the PoolThreadCache. Non-pooling PoolThreadCache objects will not create their FreeOnFinalize objects. This is an alternative implementation to netty#13408, which is also reverted by this PR. Additionally, netty#13408 added a condition to create non-caching PoolThreadCache when the size of the small and normal caches were zero. However, it turned out that even when these were requested to be zero, a single-element cache would be created for them. This PR also reverts the logic to the old behaviour with the single-element cache for small and normal sizes. Result: We avoid creating finalizable objects when we don't pool.
Motivation: PoolThreadCache objects are created even when they are not meant to pool anything. In such a case, there is no point in giving them a finalizer() method. Modification: The finalizer method is moved to a separate object, which is conditionally referenced from the PoolThreadCache. Non-pooling PoolThreadCache objects will not create their FreeOnFinalize objects. This is an alternative implementation to #13408, which is also reverted by this PR. Additionally, #13408 added a condition to create non-caching PoolThreadCache when the size of the small and normal caches were zero. However, it turned out that even when these were requested to be zero, a single-element cache would be created for them. This PR also reverts the logic to the old behaviour with the single-element cache for small and normal sizes. Result: We avoid creating finalizable objects when we don't pool.
|
@franz1981 Apparently the extra condition check is not the culprit: grpc/grpc-java#10401 (comment) So the most convenient explanation is ruled out. |
Motivation:
PoolThreadCache doesn't need to be finalized if it doesn't contain small/normal caches
Modifications:
Create a base class for PoolThreadCache that won't be registered for finalization
Result:
Less GC activity and reduced footprint for short/medium living threads