GH-37597: [MATLAB] Add `toMATLAB` method to `arrow.array.ChunkedArray` class #37613
Merged
kevingurney merged 9 commits into apache:main on Sep 7, 2023
… StringType, and TimestampType
… all Numeric Types
kevingurney (Member) approved these changes on Sep 7, 2023

+1

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 65e2f22. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them.
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request on Nov 13, 2023
…dArray` class (apache#37613)

### Rationale for this change

Currently, there is no way to easily convert an `arrow.array.ChunkedArray` into a corresponding MATLAB array, other than (1) manually iterating chunk by chunk, (2) calling `toMATLAB` on each chunk, and then (3) concatenating all of the converted chunks together into one contiguous MATLAB array.

It would be helpful to add a `toMATLAB` method to `arrow.array.ChunkedArray` that abstracts away all of these steps.

### What changes are included in this PR?

1. Added `toMATLAB` method to `arrow.array.ChunkedArray` class.
2. Added `preallocateMATLABArray` abstract method to `arrow.type.Type` class. This method is used by the `ChunkedArray` `toMATLAB` method to pre-allocate a MATLAB array of the expected class type and shape. This is necessary to ensure `toMATLAB` returns the correct MATLAB array when the `ChunkedArray` has zero chunks. If `toMATLAB` stored the result of calling `toMATLAB` on each chunk in a `cell` array before concatenating the values, `toMATLAB` would return a 0x0 `double` array for zero-chunked arrays. The pre-allocation approach avoids this issue.
3. Implemented `preallocateMATLABArray` on all `arrow.type.Type` classes.
4. Added an abstract class `arrow.type.NumericType` that all classes representing numeric data types inherit from. `NumericType` implements `preallocateMATLABArray` for its subclasses.

### Are these changes tested?

Yes. Added unit tests to `tChunkedArray.m`.

### Are there any user-facing changes?

Yes. Users can now call `toMATLAB` on `ChunkedArray`s.

**Example**

```matlab
>> a = arrow.array([1 2 NaN 4 5]);
>> b = arrow.array([6 7 8 9 NaN 11]);
>> c = arrow.array.ChunkedArray.fromArrays(a, b);
>> data = toMATLAB(c)

data =

     1
     2
   NaN
     4
     5
     6
     7
     8
     9
   NaN
    11
```

* Closes: apache#37597

Authored-by: Sarah Gilmore
Signed-off-by: Kevin Gurney
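For comparison, the three-step manual workaround described in the rationale might look roughly like the sketch below. It only uses calls that appear in the commit message (`arrow.array` and `toMATLAB` on an individual array). The `chunks` and `converted` variables are illustrative names, and the use of `vertcat` assumes each converted chunk comes back as a column vector.

```matlab
% Two Arrow arrays standing in for the chunks (same values as the PR example).
a = arrow.array([1 2 NaN 4 5]);
b = arrow.array([6 7 8 9 NaN 11]);
chunks = {a, b};                      % illustrative container, not an Arrow type

% Steps (1) and (2): visit each chunk and convert it with toMATLAB.
converted = cell(numel(chunks), 1);
for ii = 1:numel(chunks)
    converted{ii} = toMATLAB(chunks{ii});
end

% Step (3): concatenate the converted pieces into one contiguous array.
% Assumption: each converted chunk is a column vector, so vertcat applies.
data = vertcat(converted{:});
```

With the method added in this PR, the same result is meant to come from a single `toMATLAB` call on the `ChunkedArray`.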
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request on Feb 19, 2024
### Rationale for this change

Currently, there is no way to easily convert an `arrow.array.ChunkedArray` into a corresponding MATLAB array, other than (1) manually iterating chunk by chunk, (2) calling `toMATLAB` on each chunk, and then (3) concatenating all of the converted chunks together into one contiguous MATLAB array.

It would be helpful to add a `toMATLAB` method to `arrow.array.ChunkedArray` that abstracts away all of these steps.

### What changes are included in this PR?

1. Added `toMATLAB` method to `arrow.array.ChunkedArray` class.
2. Added `preallocateMATLABArray` abstract method to `arrow.type.Type` class. This method is used by the `ChunkedArray` `toMATLAB` method to pre-allocate a MATLAB array of the expected class type and shape. This is necessary to ensure `toMATLAB` returns the correct MATLAB array when the `ChunkedArray` has zero chunks. If `toMATLAB` stored the result of calling `toMATLAB` on each chunk in a `cell` array before concatenating the values, `toMATLAB` would return a 0x0 `double` array for zero-chunked arrays. The pre-allocation approach avoids this issue.
3. Implemented `preallocateMATLABArray` on all `arrow.type.Type` classes.
4. Added an abstract class `arrow.type.NumericType` that all classes representing numeric data types inherit from. `NumericType` implements `preallocateMATLABArray` for its subclasses.

### Are these changes tested?

Yes. Added unit tests to `tChunkedArray.m`.

### Are there any user-facing changes?

Yes. Users can now call `toMATLAB` on `ChunkedArray`s.

* Closes: #37597
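The zero-chunk problem that motivates `preallocateMATLABArray` can be reproduced in plain MATLAB, without the Arrow interface. The sketch below is illustrative only: the variable names are hypothetical, plain MATLAB arrays stand in for already-converted chunks, and the N-by-1 shape is an assumption about what "expected shape" means, not the PR's actual implementation.

```matlab
% Cell-array approach with zero chunks: there is nothing to concatenate,
% so MATLAB falls back to the default empty array, [].
converted = {};
viaCell = vertcat(converted{:});
disp(class(viaCell));        % class is double, regardless of the intended type
disp(size(viaCell));         % size is 0-by-0

% Pre-allocation approach: create the output with the intended class and
% shape first, then fill it chunk by chunk. With zero chunks the loop body
% never runs and the typed empty array is returned unchanged.
pieces = {};                                   % stand-in for converted chunks
totalLength = sum(cellfun(@numel, pieces));    % 0 when there are no chunks
viaPrealloc = strings(totalLength, 1);         % assumed target type: string
startIdx = 1;
for ii = 1:numel(pieces)
    endIdx = startIdx + numel(pieces{ii}) - 1;
    viaPrealloc(startIdx:endIdx) = pieces{ii};
    startIdx = endIdx + 1;
end
disp(class(viaPrealloc));    % class is string
disp(size(viaPrealloc));     % size is 0-by-1
```

The same loop works unchanged when `pieces` is non-empty, for example `pieces = {["a"; "b"], "c"}`, which is why pre-allocating by type covers both the empty and the populated case with one code path.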