Conversation
| actual.trim().to_string() | ||
| } | ||
|
|
||
| pub fn batches_to_sort_string(batches: &[RecordBatch]) -> String { |
There was a problem hiding this comment.
there's alternative way: use https://docs.rs/insta/latest/insta/struct.Settings.html#method.sort_maps
this, however, will require switching to serializable format - which we dont want - so leaving as is
There was a problem hiding this comment.
I think using strings for comparison works well - let's try this
| @@ -429,16 +434,16 @@ async fn drop_with_quotes() -> Result<()> { | |||
|
|
|||
| let df_results = df.collect().await?; | |||
|
|
|||
| assert_batches_sorted_eq!( | |||
There was a problem hiding this comment.
(here and everywhere else) batches_to_sort_string order has changed changed because rhs is order-sensitive (since it's a string). there's a way to overcome this (https://github.com/apache/datafusion/pull/15165/files#r1990156623) but IMO as-is is better
There was a problem hiding this comment.
Something @crepererum did in influxdb_iox was to create a new that wraps the results so the tests look like this:
Note sure if that is 100% relevant but it is a neat pattern
There was a problem hiding this comment.
Why don't we first sort the batches and then convert to strings? so the orders are preserved
There was a problem hiding this comment.
I think we do sort them before converting to strings. Or maybe I missed your point?
There was a problem hiding this comment.
We are sorting strings, not rows of the batches?
There was a problem hiding this comment.
ah, true. that can be better - but i think the order is preserved even now
There was a problem hiding this comment.
I realized now assert_batches_sorted_eq() does also the same. I'd just like to avoid the strange ordering like
+------+
| f"c2 |
+------+
| 11 |
| 2 |
+------+
even if we say batches_to_sort_string
There was a problem hiding this comment.
that is very very fair! initially i wanted to sort by all columns present in the df - but the issue is that not all types are sortable
I also considered sorting by something like https://docs.rs/human-sort/latest/human_sort/ - but then again, it's +1 extra dev dependency...
alamb
left a comment
There was a problem hiding this comment.
Thank you @blaginin -- I think this looks great! Though I think we should leave this open for a while to gather more feedback before we merge it
For anyone following along, to update these results what you do is run cargo insta review
@blaginin maybe we can write up / make a screen cast showing the update process?
FYI @jayzhan211 @berkaysynnada @Omega359 . What do you think?
| actual.trim().to_string() | ||
| } | ||
|
|
||
| pub fn batches_to_sort_string(batches: &[RecordBatch]) -> String { |
There was a problem hiding this comment.
I think using strings for comparison works well - let's try this
| @@ -429,16 +434,16 @@ async fn drop_with_quotes() -> Result<()> { | |||
|
|
|||
| let df_results = df.collect().await?; | |||
|
|
|||
| assert_batches_sorted_eq!( | |||
There was a problem hiding this comment.
Something @crepererum did in influxdb_iox was to create a new that wraps the results so the tests look like this:
Note sure if that is 100% relevant but it is a neat pattern
berkaysynnada
left a comment
There was a problem hiding this comment.
Tests seem much better now, thank you @blaginin. I left 2 suggestions.
BTW, do you have any script etc. to make auto-conversions? or we have to update all others by hand?
|
|
||
| // Get string representation of the plan | ||
| async fn assert_physical_plan(df: &DataFrame, expected: Vec<&str>) { | ||
| async fn physical_plan_to_string(df: &DataFrame) -> String { |
There was a problem hiding this comment.
This name sounds like (Arc<dyn ExecutionPlan>) -> (String). Perhaps we can make this a method of Dataframe, and call it something like "to_physical_plan_string". WDYT?
| @@ -429,16 +434,16 @@ async fn drop_with_quotes() -> Result<()> { | |||
|
|
|||
| let df_results = df.collect().await?; | |||
|
|
|||
| assert_batches_sorted_eq!( | |||
There was a problem hiding this comment.
Why don't we first sort the batches and then convert to strings? so the orders are preserved
|
Created an epic for subsequent changes, wrote my approach and automation ideas there |
|
I think we also need to update tests in |
|
Great, I saw #15178 |
|
I think this is a definite improvement! |
|
I updated this PR to say it closes this ticket. Let me know if that isn't right |
|
I merged this branch up from main to rerun the tests. Once CI passes I'll plan to merge this in |
|
🚀 |
|
Thanks again @blaginin Do you plan to organize some more migration as part of It will likely be a good exercise to learn how to get projects tee'd up for the community. I recommend making one or two tickets focused on a particular set of tests and see if others can follow the pattern. Once it is clear they can, then we can apply the copy/paste method to create a bunch of good first issue tickets for the other files that need porting This is going to be so good |
|
Thank you!! Yes, absolutely, i’m going to create a few good first issues now |



Which issue does this PR close?
instafor tests) #15178Rationale for this change
Switch to insta snapshots so review and update will be easier
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?
No