GH-44062: [Dev][Archery][Integration] Reduce needless test matrix#44099
GH-44062: [Dev][Archery][Integration] Reduce needless test matrix#44099kou merged 8 commits intoapache:mainfrom
Conversation
If we enable C++, Java and Rust, we use the following patterns: | Producer | Consumer | |----------|----------| | C++ | C++ | | C++ | Java | | C++ | Rust | | Java | C++ | | Java | Java | | Java | Rust | | Rust | C++ | | Rust | Java | | Rust | Rust | In apache/arrow, the following patterns are redundant because they should be done in apache/arrow-rs: | Producer | Consumer | |----------|----------| | Rust | Rust | In apache/arror-rs, the following patterns are redundant because they should be done in apache/arrow: | Producer | Consumer | |----------|----------| | C++ | C++ | | C++ | Java | | Java | C++ | | Java | Java | Add `--target-language` option. We can specify target languages by this. (We can specify `--target-language` multiple times.) Here are expected usages: In apache/arrow: * `--target-language=cpp` * `--target-language=csharp` * `--target-language=go` * `--target-language=java` * `--target-language=js` In apache/arrow-rs * `--target-language=rust` Here is an example in apache/arrow-rs: T: Languages specified by `--target-language` * rust O: Languages not specified by `--target-language` * cpp * csharp * go * java * js * nanoarrow Used matrix: | Producer | Consumer | |----------|----------| | Rust | Rust | | Rust | C++ | | Rust | C# | | Rust | Go | | Rust | Java | | Rust | JS | | Rust | nanoarrow| | C++ | Rust | | C# | Rust | | Go | Rust | | Java | Rust | | JS | Rust | | nanoarrow| Rust |
|
|
|
Hmm. "Rust -> Rust" still exists...: |
ci/scripts/integration_arrow.sh
Outdated
| --with-java=$([ "$ARROW_INTEGRATION_JAVA" == "ON" ] && echo "1" || echo "0") \ | ||
| --target-language=java \ | ||
| --with-js=$([ "$ARROW_INTEGRATION_JS" == "ON" ] && echo "1" || echo "0") \ | ||
| --target-language=js \ |
There was a problem hiding this comment.
This file is used by apache/arrow-rs too.
So embedding --target-language here isn't a good approach.
|
+1 No "Rust -> Rust": |
### Rationale for this change #44099 included debug prints. ### What changes are included in this PR? Remove debug prints. ### Are these changes tested? Yes. ### Are there any user-facing changes? No. Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
|
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 38b0c79. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 5 possible false positives for unstable benchmarks that are known to sometimes produce them. |
| @click.option('--target-languages', default='', | ||
| help=('Target languages in this integration tests'), | ||
| envvar="ARCHERY_INTEGRATION_TARGET_LANGUAGES") |
There was a problem hiding this comment.
Uh, can the help string be more explanatory? It's really not possible to understand what this does from the current wording.
There was a problem hiding this comment.
Sure. But is the help string a suitable location to explain it more with click API?
In general, --help output uses a simple one-line (or a few lines) per option.
There was a problem hiding this comment.
Then it can be explained in the command's help: https://click.palletsprojects.com/en/8.1.x/api/#click.Command
| run_c_data=False, tempdir=None, target_languages="", | ||
| **kwargs): | ||
| tempdir = tempdir or tempfile.mkdtemp(prefix='arrow-integration-') | ||
| print(["before", target_languages]) |
There was a problem hiding this comment.
You should remove those print calls
| **kwargs): | ||
| tempdir = tempdir or tempfile.mkdtemp(prefix='arrow-integration-') | ||
| print(["before", target_languages]) | ||
| target_languages = list(filter(len, target_languages.split(","))) |
There was a problem hiding this comment.
Because "".split(",") returns [""] not [].
(I surprised the Python behavior because "".split(",") returns [] with Ruby.)
There was a problem hiding this comment.
Why would you pass ,? It's ok to make it an error IMHO.
There was a problem hiding this comment.
Because --target-languages=cpp,java is an input. We need to extract cpp and java from cpp,java.
I know that click supports --target-language=cpp --target-languages=java style. But reusing ci/scripts/integration_arrow.sh in apache/arrow, apache/arrow-rs, apache/arrow-nanoarrow and apache/arrow-go is difficult with the style. Entirely overwritable --target-languages=cpp,java style is easy to reusable.
There was a problem hiding this comment.
Ah, you're talking about the empty string, sorry, I had misread.
Then I would suggest the much more idiomatic
| target_languages = list(filter(len, target_languages.split(","))) | |
| target_languages = target_languages.split(",") if target_languages else [] |
| @click.option('--with-rust', type=bool, default=False, | ||
| help='Include Rust in integration tests', | ||
| envvar="ARCHERY_INTEGRATION_WITH_RUST") | ||
| @click.option('--target-languages', default='', |
There was a problem hiding this comment.
"Languages" is not really right here, as nanoarrow for example is not a language. "Implementations" perhaps?
Rationale for this change
If we enable C++, Java and Rust, we use the following patterns:
In apache/arrow, the following patterns are redundant because they should be done in apache/arrow-rs:
In apache/arror-rs, the following patterns are redundant because they should be done in apache/arrow:
What changes are included in this PR?
Add
--target-languagesoption. We can specify target languages by this. Here are expected usages:In apache/arrow:
--target-languages=cpp,csharp,go,java,jsIn apache/arrow-rs
--target-languages=rustHere is an example in apache/arrow-rs:
Used matrix:
If no
--target-languagesis specified, all enabled languages are the target languages. (The same as the current behavior.)Are these changes tested?
Yes.
Are there any user-facing changes?
No.