Skip to content

Comments

Add NumPy and Pandas TypeSystems#257

Merged
JacobHayes merged 4 commits intogoldenfrom
numpy-pandas-type-systems
Jul 27, 2022
Merged

Add NumPy and Pandas TypeSystems#257
JacobHayes merged 4 commits intogoldenfrom
numpy-pandas-type-systems

Conversation

@JacobHayes
Copy link
Member

Description

Adds TypeSystems for numpy and pandas. There are things missing (eg: pandas (multi)indexes), but this will be a good start from which we can refine.

The packages are marked as optional dependencies.

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

New tests were added to maintain 100% coverage.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in upstream modules

@JacobHayes JacobHayes self-assigned this Jul 25, 2022
@codecov-commenter
Copy link

codecov-commenter commented Jul 25, 2022

Codecov Report

Merging #257 (fef1e99) into golden (19c4967) will not change coverage.
The diff coverage is 100.00%.

❗ Current head fef1e99 differs from pull request most recent head 2487b72. Consider uploading reports for the commit 2487b72 to get more accurate results

@@            Coverage Diff            @@
##            golden      #257   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           39        41    +2     
  Lines         2448      2535   +87     
  Branches       513       523   +10     
=========================================
+ Hits          2448      2535   +87     
Impacted Files Coverage Δ
src/arti/types/numpy.py 100.00% <100.00%> (ø)
src/arti/types/pandas.py 100.00% <100.00%> (ø)

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more



@pandas_type_system.register_adapter
class DataFrameAdapter(TypeAdapter):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this handle a series of objects?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That part is handled in the SeriesAdapter; dtype("O") strings will be handled alright, but other types currently raise an error:

if dtype == np.dtype("O"):
# TODO: Should we handle empty series by defaulting to "String", but issuing
# a warning?
example_value = type_.iloc[0]
if isinstance(example_value, str):
return List(element=String())
# TODO: Handle dicts, lists, etc.
raise NotImplementedError(
f"Non-string {dtype} is not supported yet, got values of: {example_value}"
)

I think for complex cell values, we may want to try to convert into a python type annotation (eg: list[str] or list[dict[str, str]]) and pass off to the python_type_system. I'm not quite sure whether that "value -> type annotation" logic should live here or in the python_type_system though (probably leaning towards the latter with some sort of "infer_type" logic?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I added a (failing) test and #258 for this.

Signed-off-by: Jacob Hayes <[email protected]>
Signed-off-by: Jacob Hayes <[email protected]>
@JacobHayes JacobHayes merged commit 2573dc3 into golden Jul 27, 2022
@JacobHayes JacobHayes deleted the numpy-pandas-type-systems branch July 27, 2022 02:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants