Add delayed preloading for arrays of records#32136
Add delayed preloading for arrays of records#32136dinahshi wants to merge 11 commits intorails:mainfrom
Conversation
|
Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @eileencodes (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review. Please see the contribution instructions for more information. |
|
This is awesome. We have an implementation of this at Shopify. This is the implementation. Maybe we should also support passing an relation? Also can you add documentation? |
|
We are using similar technique in our project. Sometimes loaded associations.Preloader currently determines if the association is preloaded or not by the first record https://github.com/rails/rails/blob/master/activerecord/lib/active_record/associations/preloader.rb#L180 OfftopicYou may see the implementation that is efficient for us: def preload_into_association(host_association, *preload_associations)
# Preloading records in such a weird way
# Because we want to cache them inside an association
# but not have a separated scope as a result of #preload call
preload_associations.each do |preload_association|
collection = send(host_association)
unpreloaded = collection.reject {|l| l.association(preload_association).loaded?}
ActiveRecord::Associations::Preloader.new.preload(unpreloaded, preload_association)
end
endHere is how it is used: Site.has_many :campaigns
Campaign.has_many :assets
site.preload_into_association(:campaigns, :assets)
# which is better than
site.campaigns.preload(:assets)
# because it generates new relation and site.campaigns.map(&:assets) is not cached...EVER... You may see that we filter out loaded records with loaded association from collection which resolves the original problem I described. The counter problem of that is that it doesn't allow to pass the nested associations... only an array of associations. |
|
@rafaelfranca I like the Shopify implementation's straightforward way of handling relations. I'm a little concerned that the method can return different types. When passed a relation, it returns a relation so it can be chained with other query methods like |
|
Addressed feedback from @rafaelfranca and @kaspth. aab3b41f3d360c56bcd8e72295863676d4b3ed7a fixes the issue raised in #32140 where it is assumed that an entire list is loaded when the first record in the list is loaded. 8d6e1468b6c126f7cfb96171e417d892f257bbdf fixes preloading different object ids. Added documentation and updated changelog. |
There was a problem hiding this comment.
Hi @dinahshi
These changes look good, but I think that they will conflict with some of the latest changes made in #31250.
The method preload_associations was defined by this PR so I think that this will cause conflicting behaviour.
In #31250, this method is doing the preloading of associations as part of the loading of the relation, so doesn't take any arguments for the different associations, but does accept the records as the preloading can be done on a subset of the records.
I think that this change would still be good without this alias, so maybe it could be removed?
Otherwise, maybe these methods could be moved to be class methods on ActiveRecord::Base as I am not sure if there is use case for using them on a relation (all the examples in the tests are directly on the ActiveRecord::Base subclasses), but there might be other use cases I am not aware of.
Last option would be to rename the other preload_associations method, easiest thing to do would be to prefix with _ as it is not really part of the public API.
There was a problem hiding this comment.
Thanks for pointing this out! I think it'd be nice to make preload_associations part of the public API. It doesn't seem like there's a precedent for prefixing private methods with _ so I'd like to rename the current method to be more specific—maybe preload_associations_on_records?
In terms of the method being defined on Base vs Relation, I wanted to keep the logic close to the original preload query method which is defined on Relation.
|
I'm super excited about this feature, thank you so much @dinahshi and @eileencodes (and everyone else who contributed). Just wanted to confirm, is this is targeting Rails 6? |
|
After a few months on the back burner, I believe this is ready for review! 😅 As suggested by @matthewd, I've removed the preload_associations alias to follow the precedent of "load" implying that an action occurs immediately, and renamed |
There was a problem hiding this comment.
Still think a load_association (singular) alias would be nice for the single association case.
There was a problem hiding this comment.
Should preload be linked here? Or does RDoc do it automatically via # and I just don't remember? 😄
There was a problem hiding this comment.
Other methods seem to have it linked automatically through # but I'm not sure!
There was a problem hiding this comment.
Could self.klass here just be klass?
There was a problem hiding this comment.
I think we'd just say
raise ArgumentError, "`load_associations` must be given…"There was a problem hiding this comment.
How likely are users to pass in nested arrays? And ones where nil is part of the array?
Does the preloader do the same work on the records or does it support nested arrays and nil values?
Since we mutate records how can users properly loop through them if they're either nested or contain nil? E.g. should we punt the flatten.compact to user code instead of baking it in here?
Come to think of it, I'm curious if we even need this defensive coding method at all?
There was a problem hiding this comment.
Preloader handles nil with a compact here. The docs say it should flatten nested arrays but now that I try it, I'm not seeing that behaviour. Will investigate further and see if either docs or code need to be changed.
I wanted to add this layer so that errors don't originate in Preloader, which isn't part of the public API. If we keep flatten, perhaps an addition to documentation (similar to the Preloader doc) would help explain the mutation to records.
There was a problem hiding this comment.
I checked on master and Preloader also doesn't flatten arrays there (i.e. passing a nested array into the records parameter causes an undefined method for Array error). I'll leave the nil check in, remove the flatten, and update the Preloader docs.
There was a problem hiding this comment.
Nice to know that we support polymorphic association loading as well!
I'm curious what happens if I keep the variables separate:
authors, posts = Author.all.to_a, Post.all.to_a
ActiveRecord::Base.load_associations(authors + posts, :comments)
authors.map(&:comments) # Will SQL queries fire here? Or are the comments loaded?There was a problem hiding this comment.
Comments are loaded, no SQL queries will fire.
There was a problem hiding this comment.
👍, would it be clearer if we kept the variables separate in the tests and do authors + posts?
|
I just realized that I'm not sure how to rework this, would #load_associations have to be defined on the Core module as well as the Querying module? Is this effort worthwhile given that the method could be called on any subclass (i.e. do we want to support cc/ @matthewd |
Well that's an impressively-confusing error message 😅 Yes, let's move it out of Relation and into one of the model modules. At a glance it seems like a pretty equally awkward fit for both Querying and Associations, so Core looks to be the way to go. It's just a move, so there's no added complexity -- while I wouldn't encourage people to call it that way, all class methods are accessible via the relation automatically. My thinking here is that
|
activerecord/lib/active_record/associations/preloader/association.rb
Outdated
Show resolved
Hide resolved
f26c4c0 to
2ca88a3
Compare
|
This PR has a lot of problems right now. Here is a brief list (some with failing tests):
My vision on fixing that is pretty blurry, but I can share these thoughts: Also, through associations loading should never assign records to source and through reflection when the special scope is applied (in terms of code - Failing test cases (pluggable to def test_load_associations_ignore_loaded_records
post1 = posts(:welcome)
post2 = posts(:thinking)
posts = [post1, post2]
comment_ids = post2.comments.to_a.pluck(:id)
Post.load_associations(posts, :comments)
assert_equal comment_ids, post2.comments.to_a.pluck(:id)
end
def test_load_associations_into_dirty_association_merges_targets_properly
post = posts(:welcome)
ids = post.comment_ids
post.comments << post.comments.first
Post.load_associations(posts, :comments)
assert_equal ids, post.comments.ids
end
require "models/department"
require "models/hotel"
require "models/chef"
require "models/cake_designer"
require "models/drink_designer"
def test_load_associations_doesnt_reset_loaded_through_associations
cake_designer = CakeDesigner.create!(chef: Chef.new)
drink_designer = DrinkDesigner.create!(chef: Chef.new)
chefs = [cake_designer.chef, drink_designer.chef]
department = Department.create!(chefs: chefs)
hotel = Hotel.create!(departments: [department])
Hotel.load_associations([hotel], :cake_designers, :drink_designers)
assert_equal chefs, hotel.chefs
assert_equal [cake_designer], hotel.cake_designers
assert_equal [drink_designer], hotel.drink_designers
end |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
Part 1 of ? The `ActiveRecord::Associations::Preloader` class was passing around many of the same arguments to methods and was difficult to follow. This PR does a few things: 1) Move arguments for the `Preloader` to the initializer instead of `preload` 2) Merges the `preloaders_for_one` and `preloaders_for_hash` methods into one `build_preloaders` method that takes no arguments. This new method is responsible for looping through all the passed associations. 3) If there are nested associations passed, the class will build a new `Preloader` object for each `child` in `build_child_preloader`. 4) The above changes allowed for removing most of the arguments being passed down. 5) Implement a deprecated shim that preserves the original API. While this API is private and undocuented, we know that lots of applications are using the API. Generally we don't deprecate private APIs but due to it's wide usage we decided to maintain behavior here. Next steps: 1) Continue refactoring this class and shared classes as they make more sense. There's a lot of `flat_map`'s in here that feel overused. 2) Once the API is more reasonable revive older Preloader PRs (ie rails#32136) 3) Expose a usable public API so that apps can build their own preloaders when necessary (we do this a lot at GitHub) without interacting with private methods. Co-authored-by: Dinah Shi <[email protected]
|
Closing because the work is going to be done in effort to make preload public and we have a lot of conflicts. |
Summary
Adds method Relation#load_associations(arr, *args) which accepts an array
of records and args to preload. Useful for when a loaded list is filtered
in ruby to produce an Array and we want to preload the records in this Array.
Effectively exposes
Preloader.preload(records, associations, preload_scope = nil)without requiring people to know about a separate Preloader class.
Todo
cc/ @matthewd