OBPIH-7515 Improve performance of stock transfer candidates API#5612
OBPIH-7515 Improve performance of stock transfer candidates API#5612kchelstowski merged 1 commit intorelease/0.9.6from
Conversation
|
For now it targets release branch, but the decision on whether we get it into the release will be made on Wednesday. |
| createAlias("orderItems.shipmentItems", "shipmentItems", JoinType.LEFT_OUTER_JOIN) | ||
| eq("origin", location) | ||
| eq("orderType", OrderType.findByCode(OrderTypeCode.TRANSFER_ORDER.name())) | ||
| } |
There was a problem hiding this comment.
The issue was with N+1 queries. I could reduce them from about 19500 to 2500 (I couldn't get any lower due to the mapping issue that we know about, when having e.g. 1:1 association, it always gets fetched one by one, so e.g. for picklist I couldn't make that be pre-fetched - the same with parentOrderItem.
There was a problem hiding this comment.
can you explain this a bit more? Why the need to create these aliases for fields that we're not selecting by? Is it because we loop through the Orders in the lines below and so the optimization is to pre-fetch the relationship entities to avoid the N+1?
Is this different from doing fetchMode("fieldName", FetchMode.JOIN)?
There was a problem hiding this comment.
Also I don't know if you've had time to look into it at all but how hard do you think it'd be to switch the relationships to be managed manually by us instead of via GORM hasMany mapping? Is it simply a matter of defining our own getX/setX and addTo/removeFromX methods or is it more complex?
There was a problem hiding this comment.
I will keep this in mind if/when we refactor the picklist as part of our "allocation" refactoring. The picklist association on Requisition and Order is probably unnecessary and we might be able to remove the association or replace it with a transient getter in the short-term i.e. it becomes a lazy N+1 query but only when it's being accessed. But in general, Picklist should have a reference back to the Order or Requisition it came from but we might not need a hasOne to picklist. In fact, in the future we might need to support a hasMany association (we might want to support waving multiple items of an order at a time).
@awalkowiak @ewaterman Maybe we can talk about this on a tech huddle once we get a data model designed for the allocation work we're doing in the other project.
It might also be time to split out putaways and internal transfers into their own domains to avoid overusing the Order domain for everything (this is likely causing additional slowness around all of the Order queries). In fact, if we go that direction, I think we might as well migrate away from Order altogether in favor of specific domains for each order type i.e. PurchaseOrder, TransferOrder, ReturnOrder, InternalOrder, PutawayOrder, ProductionOrder, WorkOrder, etc. We can start by creating three separate classes for the different movements InboundOrder, OutboundOrder and InternalOrder and then cross the migration bridge as we need to.
There was a problem hiding this comment.
can you explain this a bit more? Why the need to create these aliases for fields that we're not selecting by? Is it because we loop through the Orders in the lines below and so the optimization is to pre-fetch the relationship entities to avoid the N+1?
Yes exactly, it's to avoid N+1 further. Basically, I joined all the properties (tables) that are further accessed and would cause SELECTs to be called for each association separately.
As for fetchMode I think this should also work, but I remember we had some problems with that back in OBGM, that in some cases it was probably not working. I don't remember exactly, but I know something forces me not to use it. If you are curious I could dig it up.
| item.originBinLocation?.id == it.originBinLocation?.id && | ||
| item.product?.id == it.product?.id | ||
| } | ||
| groupedPendingItems.containsKey([item.location?.id, item.inventoryItem?.id, item.originBinLocation?.id, item.product?.id]) |
There was a problem hiding this comment.
this one was also a game changer, as now we have O(1) find, vs O(n^2) we had when calling .find in a loop.
Actually, this change improved the performance more than the N+1 problem.
There was a problem hiding this comment.
Nice work @kchelstowski
I like this pattern (creating an index for lookups), and there are likely other places we could apply it.
That last statement can't possibly be true though. We're talking about iterating over a collection in-memory vs executing multiple queries per item. The only way that could be true is if the property accessors in lines 118 - 122 were causing database queries (N+1) themselves?
There was a problem hiding this comment.
I don't know if it would actually impact performance much so this probably not worth doing, but you don't actually use the map values so you could change this to just a set of keys:
Set<List<String>> keys = pendingStockTransferItems.collect { [...] }
stockTransferItems.removeAll { StockTransferItem item ->
keys.contains([item.location?.id, item.inventoryItem?.id, item.originBinLocation?.id, item.product?.id])
Or if you wanted to keep it as a map you can use a MultiKeyMap
MultiKeyMap<String> map = pendingStockTransferItems.collectEntries { new MultiKey(...), it }
stockTransferItems.removeAll { StockTransferItem item ->
map.containsKey(item.location?.id, item.inventoryItem?.id, item.originBinLocation?.id, item.product?.id)
but that's probably overkill
There was a problem hiding this comment.
That last statement can't possibly be true though. We're talking about iterating over a collection in-memory vs executing multiple queries per item. The only way that could be true is if the property accessors in lines 118 - 122 were causing database queries (N+1) themselves?
@jmiranda to your surprise, it's the in-memory operation that slowed down the whole feature. List has O(n) read access, so if you access List in a loop, you get O(n^2) whereas for Map you have O(1) read access.
Even with JOINs applied (so that this loop doesn't call the DB anymore at all), it's still very slow before you use Map for that.
I have already fixed that pattern (calling .find in the List in the loop) in a few places, like ProductAvailabilityService or in Putaway, where it improved the performance much.
If you are curious or you would like to remind yourself what that was, let me know, so I'll send the PRs/tickets.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## release/0.9.6 #5612 +/- ##
===============================================
Coverage ? 8.52%
Complexity ? 1127
===============================================
Files ? 712
Lines ? 45635
Branches ? 10911
===============================================
Hits ? 3890
Misses ? 41166
Partials ? 579 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| List<StockTransferItem> pendingStockTransferItems = getPendingItems(location) | ||
|
|
||
| Map<List<String>, List<StockTransferItem>> groupedPendingItems = pendingStockTransferItems.groupBy { StockTransferItem item -> | ||
| [item.location?.id, item.inventoryItem?.id, item.originBinLocation?.id, item.product?.id] |
There was a problem hiding this comment.
Does this work ok when originBinLocation is null? Or should we need to handle that with a special key i.e. NOLOC or DEFAULT?
There was a problem hiding this comment.
yeah, I checked that "manually" and it handles null as part of a key fine.
| item.originBinLocation?.id == it.originBinLocation?.id && | ||
| item.product?.id == it.product?.id | ||
| } | ||
| groupedPendingItems.containsKey([item.location?.id, item.inventoryItem?.id, item.originBinLocation?.id, item.product?.id]) |
There was a problem hiding this comment.
Nice work @kchelstowski
I like this pattern (creating an index for lookups), and there are likely other places we could apply it.
That last statement can't possibly be true though. We're talking about iterating over a collection in-memory vs executing multiple queries per item. The only way that could be true is if the property accessors in lines 118 - 122 were causing database queries (N+1) themselves?
| createAlias("orderItems.shipmentItems", "shipmentItems", JoinType.LEFT_OUTER_JOIN) | ||
| eq("origin", location) | ||
| eq("orderType", OrderType.findByCode(OrderTypeCode.TRANSFER_ORDER.name())) | ||
| } |
There was a problem hiding this comment.
I will keep this in mind if/when we refactor the picklist as part of our "allocation" refactoring. The picklist association on Requisition and Order is probably unnecessary and we might be able to remove the association or replace it with a transient getter in the short-term i.e. it becomes a lazy N+1 query but only when it's being accessed. But in general, Picklist should have a reference back to the Order or Requisition it came from but we might not need a hasOne to picklist. In fact, in the future we might need to support a hasMany association (we might want to support waving multiple items of an order at a time).
@awalkowiak @ewaterman Maybe we can talk about this on a tech huddle once we get a data model designed for the allocation work we're doing in the other project.
It might also be time to split out putaways and internal transfers into their own domains to avoid overusing the Order domain for everything (this is likely causing additional slowness around all of the Order queries). In fact, if we go that direction, I think we might as well migrate away from Order altogether in favor of specific domains for each order type i.e. PurchaseOrder, TransferOrder, ReturnOrder, InternalOrder, PutawayOrder, ProductionOrder, WorkOrder, etc. We can start by creating three separate classes for the different movements InboundOrder, OutboundOrder and InternalOrder and then cross the migration bridge as we need to.
| List<StockTransferItem> getPendingItems(Location location) { | ||
| List<Order> orders = Order.findAllByOriginAndOrderType(location, OrderType.findByCode(OrderTypeCode.TRANSFER_ORDER.name())) | ||
| List<Order> orders = Order.createCriteria().list { | ||
| setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY) |
There was a problem hiding this comment.
In the other class you do Criteria.DISTINCT_ROOT_ENTITY. Is there a difference?
| createAlias("orderItems.shipmentItems", "shipmentItems", JoinType.LEFT_OUTER_JOIN) | ||
| eq("origin", location) | ||
| eq("orderType", OrderType.findByCode(OrderTypeCode.TRANSFER_ORDER.name())) | ||
| } |
There was a problem hiding this comment.
can you explain this a bit more? Why the need to create these aliases for fields that we're not selecting by? Is it because we loop through the Orders in the lines below and so the optimization is to pre-fetch the relationship entities to avoid the N+1?
Is this different from doing fetchMode("fieldName", FetchMode.JOIN)?
| createAlias("orderItems.shipmentItems", "shipmentItems", JoinType.LEFT_OUTER_JOIN) | ||
| eq("origin", location) | ||
| eq("orderType", OrderType.findByCode(OrderTypeCode.TRANSFER_ORDER.name())) | ||
| } |
There was a problem hiding this comment.
Also I don't know if you've had time to look into it at all but how hard do you think it'd be to switch the relationships to be managed manually by us instead of via GORM hasMany mapping? Is it simply a matter of defining our own getX/setX and addTo/removeFromX methods or is it more complex?
| item.originBinLocation?.id == it.originBinLocation?.id && | ||
| item.product?.id == it.product?.id | ||
| } | ||
| groupedPendingItems.containsKey([item.location?.id, item.inventoryItem?.id, item.originBinLocation?.id, item.product?.id]) |
There was a problem hiding this comment.
I don't know if it would actually impact performance much so this probably not worth doing, but you don't actually use the map values so you could change this to just a set of keys:
Set<List<String>> keys = pendingStockTransferItems.collect { [...] }
stockTransferItems.removeAll { StockTransferItem item ->
keys.contains([item.location?.id, item.inventoryItem?.id, item.originBinLocation?.id, item.product?.id])
Or if you wanted to keep it as a map you can use a MultiKeyMap
MultiKeyMap<String> map = pendingStockTransferItems.collectEntries { new MultiKey(...), it }
stockTransferItems.removeAll { StockTransferItem item ->
map.containsKey(item.location?.id, item.inventoryItem?.id, item.originBinLocation?.id, item.product?.id)
but that's probably overkill

✨ Description of Change
Link to GitHub issue or Jira ticket:
Description:
📷 Screenshots & Recordings (optional)