Update Arrow to apache-arrow-4.0.1#6

Merged
Avogar merged 1389 commits into ClickHouse:master from Avogar:update
Jun 1, 2021

Conversation

@Avogar Avogar commented Jun 1, 2021

No description provided.

alamb and others added 30 commits March 31, 2021 13:36
# Rationale
Accessing the list of tables via `select * from information_schema.tables` (introduced in apache#9818) is a lot to type

See the doc for background: https://docs.google.com/document/d/12cpZUSNPqVH9Z0BBx6O8REu7TFqL-NPPAYCUPpDls1k/edit#

# Proposal

Add support for `SHOW TABLES` command.

# Commentary

This is different from both Postgres (which uses `\d` in `psql` for this purpose) and MySQL (which uses `DESCRIBE`).

I could be convinced that we should not add `SHOW TABLES` at all (and just stay with `select * from information_schema.tables`), but I wanted to put up the proposal.

# Example Use

Setup:
```
echo "1,Foo,44.9" > /tmp/table.csv
echo "2,Bar,22.1" >> /tmp/table.csv
cargo run --bin datafusion-cli
```

Then run:

```
CREATE EXTERNAL TABLE t(a int, b varchar, c float)
STORED AS CSV
LOCATION '/tmp/table.csv';

> show tables;
+---------------+--------------------+------------+------------+
| table_catalog | table_schema       | table_name | table_type |
+---------------+--------------------+------------+------------+
| datafusion    | public             | t          | BASE TABLE |
| datafusion    | information_schema | tables     | VIEW       |
+---------------+--------------------+------------+------------+
2 rows in set. Query took 0 seconds.

```

Closes apache#9847 from alamb/alamb/really_show_tables

Authored-by: Andrew Lamb <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
…support to GROUP BY/hash aggregates

This PR adds support for TimestampMillisecondArray to `Scalar`, `GroupByScalar`, and hash_aggregate / GROUP BYs to Rust DataFusion, thus fixing 12028.   I believe it might also fix 11940, though this needs input from you all.

There are some formatting changes from running `cargo fmt`.

Closes apache#9773 from velvia/evan/pr-arrow-12028-groupby-tsmillis

Authored-by: Evan Chan <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
The `append` functions in the `Builder` structs are often used in "hot" code. This PR tags them with `#[inline]`, making it possible to inline the function calls across crate boundaries.

Closes apache#9860 from ritchie46/inline_builder_appends

Authored-by: Ritchie Vink <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
…ld and dump the output.

This adds a one-hour timeout to the R builds. It changes the R CI test script to use `reporter="location"` by default. It adds a "dump test logs" step to the end of the build that will always dump the test output regardless of success or failure.

These three changes combined will make it much easier to debug test failures in R tests.

Closes apache#9846 from westonpace/feature/arrow-12143

Lead-authored-by: Weston Pace <[email protected]>
Co-authored-by: Jonathan Keane <[email protected]>
Signed-off-by: David Li <[email protected]>
ConvertedType::NA corresponds to an invalid converted type that was once added to the Parquet spec:
apache/parquet-format#45
but then quickly removed in favour of the Null logical type:
apache/parquet-format#51

Unfortunately, Parquet C++ could still in some cases emit the unofficial converted type.

Also remove the confusingly-named LogicalType::Unknown; "UNKNOWN" in the Thrift specification maps to LogicalType::Null.

Closes apache#9863 from pitrou/PARQUET-1990-invalid-converted-type

Authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Micah Kornfield <[email protected]>
* Add ZSTD codec implementation.
* Removes dependency on netty by using ArrowBuf methods.
* Updates docs and archery test (will wait on CI to run archery; if that fails, will do more debugging).

Closes apache#9822 from emkornfield/zstd

Lead-authored-by: emkornfield <[email protected]>
Co-authored-by: Micah Kornfield <[email protected]>
Signed-off-by: Micah Kornfield <[email protected]>
Closes apache#9856 from kou/glib-gandiva-filter

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
Clean up some things that `clippy` was complaining to me locally about

Closes apache#9867 from alamb/alamb/rust-clippy

Authored-by: Andrew Lamb <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
…hema.columns`

Builds on the code in apache#9818

# Rationale

Provide schema metadata access (so a user can see what columns exist and their type).

See the doc for background: https://docs.google.com/document/d/12cpZUSNPqVH9Z0BBx6O8REu7TFqL-NPPAYCUPpDls1k/edit#

I plan to add support for `SHOW COLUMNS` possibly as a follow on PR (though I have found out that `SHOW COLUMNS` and `SHOW TABLES` are not supported by either MySQL or by Postgres 🤔 )

# Changes
I chose to add the first 15 columns from `information_schema.columns`. You can see the full list in Postgres [here](https://www.postgresql.org/docs/9.5/infoschema-columns.html) and SQL Server [here](https://docs.microsoft.com/en-us/sql/relational-databases/system-information-schema-views/columns-transact-sql?view=sql-server-ver15).

There are a bunch more columns that say "Applies to features not available in PostgreSQL" and that don't apply to DataFusion either -- since my use case is to get the basic schema information out, I chose not to add a bunch of columns that are always null at this time.

I feel the use of column builders here is somewhat awkward (as it requires many calls to `unwrap`). I am thinking of a follow-on PR to refactor this code to use `Vec<String>` and `Vec<u64>` and then create `StringArray` and `UInt64Array` directly from them, but for now I just want the functionality.

# Example use

Setup:
```
echo "1,Foo,44.9" > /tmp/table.csv
echo "2,Bar,22.1" >> /tmp/table.csv
cargo run --bin datafusion-cli
```

Then run:

```
> CREATE EXTERNAL TABLE t(a int, b varchar, c float)
STORED AS CSV
LOCATION '/tmp/table.csv';
0 rows in set. Query took 0 seconds.

>   select table_name, column_name, ordinal_position, is_nullable, data_type from information_schema.columns;
+------------+-------------+------------------+-------------+-----------+
| table_name | column_name | ordinal_position | is_nullable | data_type |
+------------+-------------+------------------+-------------+-----------+
| t          | a           | 0                | NO          | Int32     |
| t          | b           | 1                | NO          | Utf8      |
| t          | c           | 2                | NO          | Float32   |
+------------+-------------+------------------+-------------+-----------+
3 rows in set. Query took 0 seconds.
```

Closes apache#9840 from alamb/alamn/information_schema_columns

Authored-by: Andrew Lamb <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
Add an `into_inner()` method to `ipc::writer::StreamWriter`, allowing users to recover the underlying writer, consuming the StreamWriter. Essentially exposes `into_inner()` from the BufWriter contained in the StreamWriter. The StreamWriter will 'finish' itself if not already finished when returning the writer.

Also added `ArrowError::From<std::io::IntoInnerError>` conversion to allow for ergonomic return of the potential `IntoInnerError` returned by `BufWriter.into_inner()`.

Closes apache#9858 from ericwburden/into-inner-fn-for-stream-writer

Authored-by: Eric Burden <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
…ex groups from strings

Adds a regexp_extract compute kernel to select a substring based on a regular expression.

Some things I did that I may be doing wrong:

* I exposed `GenericStringBuilder`
* I build the resulting Array using a builder - this looks quite different from e.g. the substring kernel. Should I change it accordingly, e.g. because of performance considerations?
* In order to apply the new function in datafusion, I did not see a better solution than to handle the pattern string as `StringArray` and take the first record to compile the regex pattern from it and apply it to all values. Is there a way to define that an argument has to be a literal/scalar and cannot be filled by e.g. another column? I consider my current implementation quite error prone and would like to make this a bit more robust.

Closes apache#9428 from sweb/ARROW-10354/regexp_extract

Authored-by: Florian Müller <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
…eads

Closes apache#9808 from westonpace/feature/arrow-12097

Authored-by: Weston Pace <[email protected]>
Signed-off-by: David Li <[email protected]>
According to PEP 632, distutils will be deprecated in Python 3.10 and removed in 3.12.

* switch to `setuptools` for general packaging
* use the `Version` class from the `packaging` project instead of `distutils.LooseVersion`
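
The swap itself is mechanical; here is a minimal sketch of the `packaging` usage (illustrative only, not the actual pyarrow build code):

```python
# packaging.version.Version is the PEP 440-compliant replacement for
# the deprecated distutils.version.LooseVersion.
from packaging.version import Version

# Version implements proper ordering: pre-releases sort before finals,
# and trailing zeros are insignificant.
assert Version("4.0.1") > Version("4.0.0")
assert Version("4.0.0rc1") < Version("4.0.0")
assert Version("2.0") == Version("2.0.0")

print(Version("4.0.1").release)  # structured access to the components
```

Unlike `LooseVersion`, which compared versions as loosely-split strings, `Version` rejects malformed version strings up front and orders pre-releases correctly.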

Closes apache#9849 from pitrou/ARROW-12068-remove-distutils

Authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
Compressed files such as `.gz` can contain multiple concatenated "streams".
If the last stream in the file decompressed to empty data, we would erroneously raise an error.
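
The same multi-member scenario can be sketched with the Python stdlib `gzip` module (shown here only to illustrate the file layout; Arrow's codecs are implemented in C++):

```python
import gzip

# A .gz file may contain several concatenated members; the last one
# here decompresses to empty data, which a reader must tolerate.
data = gzip.compress(b"hello, ") + gzip.compress(b"world") + gzip.compress(b"")

# gzip.decompress walks all members, including the empty trailing one,
# without raising an error.
assert gzip.decompress(data) == b"hello, world"
```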

Closes apache#9864 from pitrou/ARROW-12169-compressed-empty-stream

Authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: David Li <[email protected]>
It's for GNU Autotools.

Closes apache#9869 from kou/glib-remove-config

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
This PR adds child data to Arrow's C FFI implementation and implements it for `List` and `LargeList` datatypes.

Closes apache#9778 from ritchie46/ffi_types

Authored-by: Ritchie Vink <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
…run synchronously from datasets

Calling the async streaming CSV reader from the synchronous Scanner::Scan was causing a form of nested parallelism and causing nested deadlocks.  This commit brings over some of the work in ARROW-7001 and allows the CSV scan task to be called in an async fashion.  In addition, an async path is put in the scanner and dataset write so that all internal uses of ScanTask()->Execute happen in an async-friendly way.  External uses of ScanTask()->Execute should already be outside the CPU thread pool and should not cause deadlock.

Some of this PR will be obsoleted by ARROW-7001 but the work in file_csv and the test cases should remain fairly intact.

Closes apache#9868 from westonpace/bugfix/arrow-12161

Lead-authored-by: Weston Pace <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: David Li <[email protected]>
… integration tests

# Rationale
Rust debug symbols are quite verbose, taking up memory during the final link as well as significant disk space. Turning off the creation of symbols should save us compile/test time in CI as well as space in the integration tests.

# Change
Do not produce debug symbols on Rust CI (keep enough to have line numbers in `panic!` tracebacks, but not enough to interpret a core file, which to my knowledge no one does anyway).

Note that the integration test passed: https://github.com/apache/arrow/pull/9879/checks?check_run_id=2256148363

Closes apache#9879 from alamb/less_symbols_in_integration

Authored-by: Andrew Lamb <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
This updates the zstd version used by the parquet crate to zstd = "0.7.0+zstd.1.4.9".

Closes apache#9881 from aldanor/feature/zstd-0.7

Authored-by: Ivan Smirnov <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
This just moves the tests to allow the feature-flag to be used and pass this kind of test (where previously it would fail)

```bash
cargo test --no-default-features --features cli
```

Closes apache#9874 from seddonm1/regexp_match_test

Authored-by: Mike Seddon <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
Closes apache#9763 from emkornfield/trivial_prs

Lead-authored-by: Micah Kornfield <[email protected]>
Co-authored-by: emkornfield <[email protected]>
Signed-off-by: Micah Kornfield <[email protected]>
This depends on ARROW-12192: apache/arrow-site#99

Closes apache#9885 from kou/release-post-website-download

Lead-authored-by: Sutou Kouhei <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Neal Richardson <[email protected]>
This throws an error if the user attempts to create a Table with columns of different lengths. We already had this for RecordBatches but not for Tables.
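
A hypothetical Python sketch of the kind of validation being added (`make_table` is illustrative, not the Arrow R/C++ API):

```python
def make_table(columns):
    """Refuse to build a table whose columns have differing lengths,
    mirroring the check that already existed for RecordBatches."""
    lengths = {name: len(col) for name, col in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns must all be the same length, got {lengths}")
    return columns

make_table({"a": [1, 2, 3], "b": ["x", "y", "z"]})  # ok
try:
    make_table({"a": [1, 2, 3], "b": ["x"]})  # mismatched lengths
except ValueError as e:
    print("rejected:", e)
```

Failing eagerly at construction time turns a class of confusing downstream errors into one clear message at the call site.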

Closes apache#9851 from ianmcook/ARROW-12155

Authored-by: Ian Cook <[email protected]>
Signed-off-by: Neal Richardson <[email protected]>
…r functions and aggregates

Broken out from apache#9600 by @wqc200. Note this does not contain the part of apache#9600 that controls the output display of functions.

# Rationale

Aggregate functions are checked using case-insensitive comparison (e.g. `select MAX(x)` and `select max(x)` both work -- the code is [here](https://github.com/apache/arrow/blob/356c300c5ee1e2b23a83652514af11e3a731d596/rust/datafusion/src/physical_plan/aggregates.rs#L75)).

However, scalar functions, user-defined aggregates, and user-defined functions are checked using case-sensitive comparisons (e.g. `select sqrt(x)` works while `select SQRT(x)` does not). Postgres always uses case-insensitive comparison:

```
alamb=# select sqrt(x) from foo;
 sqrt
------
(0 rows)

alamb=# select SQRT(x) from foo;
 sqrt
------
(0 rows)
```

# Changes

Always use case-insensitive comparison for unquoted identifiers, both for consistency within DataFusion and for consistency with Postgres (and the SQL standard)

Adds tests that demonstrate the behavior
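
A hypothetical sketch of the resolution rule (the names `resolve` and `registry` are illustrative, not DataFusion's actual API):

```python
def resolve(registry, name, quoted=False):
    """Unquoted identifiers are folded to lower case before lookup;
    quoted identifiers are matched exactly."""
    key = name if quoted else name.lower()
    if key not in registry:
        raise KeyError(f"no function named {name!r}")
    return registry[key]

registry = {"my_sqrt": "udf-1", "MY_SQRT": "udf-2"}

assert resolve(registry, "MY_SQRT") == "udf-1"               # unquoted: folds to my_sqrt
assert resolve(registry, "MY_SQRT", quoted=True) == "udf-2"  # quoted: exact match
```

This is why, after the change, the lower-case registration shadows the upper-case one for unquoted references.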

# Notes

This PR changes how user-defined functions are resolved in SQL queries. If a user registers two functions with the names
`"my_sqrt"` and `"MY_SQRT"`, previously they could both be called
individually. After this PR, `my_sqrt` will be called unless the user
specifically puts `"MY_SQRT"` (in quotes) in their query.

Closes apache#9827 from alamb/case_insensitive_functions

Authored-by: Andrew Lamb <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
There are two typos:

1. In comment: `*memory consumption.` -> `* memory consumption`.
2. The names reader_writer.h and reader-writer.cc are inconsistent; when you use vim to edit them, it's confusing why `:A` doesn't work.

Closes apache#9870 from Clcanny/fix-reader-writer-example-typo

Authored-by: Clcanny <[email protected]>
Signed-off-by: Yibo Cai <[email protected]>
Also `stringr::str_replace()` and `stringr::str_replace_all()`

Closes apache#9878 from ianmcook/ARROW-11513

Lead-authored-by: Ian Cook <[email protected]>
Co-authored-by: Neal Richardson <[email protected]>
Signed-off-by: Neal Richardson <[email protected]>
This adds `quantile()` and `median()` methods for `ArrowDatum`

Closes apache#9875 from ianmcook/ARROW-11338

Authored-by: Ian Cook <[email protected]>
Signed-off-by: Neal Richardson <[email protected]>
…etend version in the macOS wheel builds

Submitted crossbow job manually with target version 4.0.0: https://github.com/ursacomputing/crossbow/tree/build-124-github-wheel-osx-high-sierra-cp36m

Closes apache#9872 from kszucs/ARROW-12172

Authored-by: Krisztián Szűcs <[email protected]>
Signed-off-by: Krisztián Szűcs <[email protected]>
I was debugging another issue (not a bug in DataFusion I don't think) but noticed there wasn't any coverage for LIMIT in exec.rs, so I figured I would add some.

(well really I was writing a test to trigger what I thought was a bug in DataFusion -- lol)

Closes apache#9897 from alamb/limit_fix

Authored-by: Andrew Lamb <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
kszucs and others added 25 commits April 21, 2021 18:11
Closes apache#10143 from jonkeane/ARROW-12520

Authored-by: Jonathan Keane <[email protected]>
Signed-off-by: Neal Richardson <[email protected]>
Fixes false positives in a check that pkg-config is installed

Closes apache#10198 from ianmcook/ARROW-12601

Authored-by: Ian Cook <[email protected]>
Signed-off-by: Ian Cook <[email protected]>
Minimal changes to link to ucrt builds specifically once supported (and will be safely redundant until then).

Closes apache#10217 from jeroen/winucrt

Lead-authored-by: Jeroen Ooms <[email protected]>
Co-authored-by: Neal Richardson <[email protected]>
Signed-off-by: Jonathan Keane <[email protected]>
An uninitialized StopToken caused segfaults if you ever called read_csv with cancellation disabled or when not on the main thread (e.g. if used in a Flight server). If we have a 4.0.1 I think this qualifies as a regression.

Closes apache#10227 from lidavidm/arrow-12622

Authored-by: David Li <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
…t.write_table

Closes apache#10223 from jorisvandenbossche/ARROW-12617-orc-write_table-signature

Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
… no nulls

Closes apache#10184 from cyb70289/12568-cast-crash

Lead-authored-by: Yibo Cai <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
Closes apache#10237 from jonkeane/ARROW-12571-valgrindCI-patch

Authored-by: Jonathan Keane <[email protected]>
Signed-off-by: Jonathan Keane <[email protected]>
…rs should not be case-sensitive

This makes the environment variables `LIBARROW_MINIMAL`, `LIBARROW_DOWNLOAD`, and `NOT_CRAN` case-insensitive

Closes apache#10252 from ianmcook/ARROW-12642

Authored-by: Ian Cook <[email protected]>
Signed-off-by: Jonathan Keane <[email protected]>
With the nvcc 11.2 compiler we get a segfault when we have a copy and move assignment operator:
```
using Impl::operator=;
```
before a move-assignment operator:

```
Variant& operator=(Variant&& other) noexcept {
  this->destroy();
  other.move_to(this);
  return *this;
}
```

A minimal repro:

With a segfault: https://godbolt.org/z/h9eYv6zas

Without a segfault: https://godbolt.org/z/oWhK5qPd8

In this PR, as a workaround, we have essentially re-ordered the move-assignment of `Variant` before `using Impl::operator=;`.

Closes apache#10257 from galipremsagar/patch-2

Authored-by: GALI PREM SAGAR <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
Closes apache#10287 from pitrou/ARROW-12670-extract-regex-nulls

Authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Benjamin Kietzman <[email protected]>
…data

Closes apache#10297 from zeroshade/flight-client-metadata

Authored-by: Matthew Topol <[email protected]>
Signed-off-by: Micah Kornfield <[email protected]>
…d arrays => crash

fixing ARROW-12774

Closes apache#10320 from nirandaperera/ARROW-12774

Authored-by: niranda perera <[email protected]>
Signed-off-by: Yibo Cai <[email protected]>
…t bundlers such as Rollup

Bundlers such as Rollup do not recognize the `_Buffer` import, which breaks their builds. This change resolves the issue by removing `Buffer` in favor of `TextEncoder`. Note that this change incurs a performance penalty on Node, as `Buffer` is often faster.

Co-authored-by: Adam Lippai <[email protected]>
Co-authored-by: Paul Taylor <[email protected]>

Closes apache#10332 from domoritz/remove-buffer-js

Authored-by: Dominik Moritz <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
…ite_js_test_json

The integration build has started to fail on master: https://github.com/apache/arrow/runs/2575265526#step:9:4265

I don't entirely understand why we see this error; in order to call that function we would need to pass `--write_generated_json` to the archery command, but we don't.
The only occurrence of that option in the codebase is in the javascript [test runner](https://github.com/apache/arrow/blob/master/js/gulp/test-task.js#L97), but that seems to use the old `integration_test.py` script, which has been deleted since we ported it to archery (cc @trxcllnt @domoritz).

Additionally, I'm unable to reproduce it locally since `archery integration` doesn't call `write_js_test_json` by default.

The implementation is clearly wrong though.

Closes apache#10314 from kszucs/integration-decimal

Authored-by: Krisztián Szűcs <[email protected]>
Signed-off-by: Krisztián Szűcs <[email protected]>
… > stop)

When the normalized slice has a start > stop, we were creating invalid arrays with a negative length (which then errors on subsequent operations)
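
A minimal sketch of the clamping logic, with illustrative names (not the actual pyarrow internals):

```python
def slice_bounds(start, stop, n):
    """Normalize slice bounds against an array of length n and return
    (offset, length), clamping the length so start > stop yields an
    empty slice rather than a negative length."""
    start = max(0, min(start + n if start < 0 else start, n))
    stop = max(0, min(stop + n if stop < 0 else stop, n))
    return start, max(0, stop - start)

assert slice_bounds(3, 1, 5) == (3, 0)   # start > stop -> empty, not length -2
assert slice_bounds(1, 4, 5) == (1, 3)   # ordinary slice
assert slice_bounds(-2, 10, 5) == (3, 2) # negative start, over-long stop
```

The key line is the final `max(0, stop - start)`: without it, a reversed slice produces an array with negative length that only fails later.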

Closes apache#10341 from jorisvandenbossche/ARROW-12769

Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: David Li <[email protected]>
…set mark

Closes apache#10346 from jorisvandenbossche/ARROW-12806

Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: David Li <[email protected]>
There is a fallback_version configuration option for setuptools_scm which we don't use: https://github.com/pypa/setuptools_scm#configuration-parameters

This setting seems to have issues, though, according to pypa/setuptools-scm#549.
We already have a workaround in setup.py for the functionality of the fallback_version option, but it is disabled for the sdist case: https://github.com/apache/arrow/blob/master/python/setup.py#L529

Closes apache#10342 from kszucs/ARROW-12619

Authored-by: Krisztián Szűcs <[email protected]>
Signed-off-by: Krisztián Szűcs <[email protected]>
…pes (apache#10344)

This backports the relevant part of ARROW-12500 into the 4.0.1 branch.

While ARROW-12500 cherry-picks cleanly, it doesn't build since it depends on prior changes - this just includes the actual fix and not the larger refactoring that was the focus of the patch.
