Skip to content

ARROW-14659: [R] Remove warning about factor conversion to string in if_else()#11794

Closed
stephhazlitt wants to merge 5 commits intoapache:masterfrom
stephhazlitt:ARROW-14659-rm-warning
Closed

ARROW-14659: [R] Remove warning about factor conversion to string in if_else()#11794
stephhazlitt wants to merge 5 commits intoapache:masterfrom
stephhazlitt:ARROW-14659-rm-warning

Conversation

@stephhazlitt
Copy link
Contributor

@stephhazlitt stephhazlitt commented Nov 29, 2021

This is my first PR contributing (or an attempt to contribute) to {arrow}.

This PR:

• removes warn_types warning that factors are converted to strings, which is no longer true https://github.com/apache/arrow/blob/master/r/R/dplyr-functions.R#L911-L920
• updates the test by removing the warning https://github.com/apache/arrow/blob/master/r/tests/testthat/test-dplyr-funcs-conditional.R#L130 however it does not remove the mutate() in the test as suggested in the TODO, if removed the test fails?
• [UPDATE] test includes a reset of the levels of all factor columns to pass, since Arrow if_else() kernel does not preserve unused factor levels (ARROW-14649)

@github-actions
Copy link

@nealrichardson
Copy link
Member

if removed the test fails?

Fails how?

@stephhazlitt
Copy link
Contributor Author

With the mutate line removed (line 128)

e.g.

 compare_dplyr_binding(
    .input %>%
      mutate(
        y = if_else(int > 5, fct, factor("a"))
      ) %>%
      collect(),
    tbl
  )
── Failure (test-dplyr-funcs-conditional.R:119:3): if_else and ifelse ───────────────────────────
`object` (`actual`) not equal to `expected` (`expected`).

`levels(actual$y)[1:4]`:   "a"             "g" "h" "i"
`levels(expected$y)[1:7]`: "a" "b" "c" "d" "g" "h" "i"

  `actual$y[4:10]`: NA 1 NA 2 3 4 5
`expected$y[4:10]`: NA 1 NA 5 6 7 8
Backtrace:
 1. compare_dplyr_binding(...) test-dplyr-funcs-conditional.R:119:2
 2. expect_equal(via_table, expected, ...) helper-expectation.R:129:4
 3. testthat::expect_equal(...) helper-expectation.R:42:4```

@ianmcook
Copy link
Member

@stephhazlitt
Copy link
Contributor Author

Thanks @ianmcook, I replaced the as.character mutate with your approach above and added a comment to reference ARROW-14649. And apologies, I should have added a co-author to that commit.

@ianmcook ianmcook closed this in b83e6b0 Nov 29, 2021
@ursabot
Copy link

ursabot commented Nov 29, 2021

Benchmark runs are scheduled for baseline = 4913352 and contender = b83e6b0. b83e6b0 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed] ursa-i9-9960x
[Finished ⬇️0.8% ⬆️0.13%] ursa-thinkcentre-m75q
Supported benchmarks:
ursa-i9-9960x: langs = Python, R, JavaScript
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants