Description
What happened:
The _repr_html_ method of large arrays seems very slow: 4.78 s for an array of 100 million values. The plain repr also seems fairly slow, at 1.87 s. Here's a quick example. I haven't yet investigated how much this depends on there being a MultiIndex.
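One way to check the MultiIndex dependence would be to time both reprs on the same data with and without a MultiIndex. The following is only a sketch: the helper names `time_call`, `build_arrays`, and `compare_repr` are hypothetical, and the flat-index array simply uses the default pandas index as an assumed baseline. It uses the standard-library `timeit` module so that it also runs outside IPython.

```python
import timeit


def time_call(func, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` single calls."""
    return min(timeit.repeat(func, number=1, repeat=repeat))


def build_arrays(n):
    """Build two DataArrays of n * n values: one with a MultiIndex, one without."""
    # Imported lazily so that time_call works even without these installed.
    import pandas as pd
    import xarray as xr

    idx = pd.MultiIndex.from_product([range(n), range(n)])
    with_multi = xr.DataArray(pd.DataFrame(range(n * n), index=idx))
    # Assumed baseline: same values, default (flat) pandas index.
    flat = xr.DataArray(pd.DataFrame(range(n * n)))
    return with_multi, flat


def compare_repr(n=1_000):
    """Time repr() and _repr_html_() for both arrays; returns seconds per label."""
    with_multi, flat = build_arrays(n)
    return {
        "repr, MultiIndex": time_call(lambda: repr(with_multi)),
        "repr, flat index": time_call(lambda: repr(flat)),
        "html, MultiIndex": time_call(with_multi._repr_html_),
        "html, flat index": time_call(flat._repr_html_),
    }
```

Calling `compare_repr(10_000)` would match the array size in the example below. A smaller `n` is quicker to run while exploring.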
What you expected to happen:
We should really focus on having good repr performance, given how essential it is to any REPL workflow.
Minimal Complete Verifiable Example:
In [10]: import xarray as xr
    ...: import numpy as np
    ...: import pandas as pd
In [11]: idx = pd.MultiIndex.from_product([range(10_000), range(10_000)])
In [12]: df = pd.DataFrame(range(100_000_000), index=idx)
In [13]: da = xr.DataArray(df)
In [14]: da
Out[14]:
<xarray.DataArray (dim_0: 100000000, dim_1: 1)>
array([[       0],
       [       1],
       [       2],
       ...,
       [99999997],
       [99999998],
       [99999999]])
Coordinates:
  * dim_0          (dim_0) MultiIndex
  - dim_0_level_0  (dim_0) int64 0 0 0 0 0 0 0 ... 9999 9999 9999 9999 9999 9999
  - dim_0_level_1  (dim_0) int64 0 1 2 3 4 5 6 ... 9994 9995 9996 9997 9998 9999
  * dim_1          (dim_1) int64 0
In [26]: %timeit repr(da)
1.87 s ± 7.33 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [27]: %timeit da._repr_html_()
4.78 s ± 1.8 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.8.7 (default, Dec 30 2020, 10:13:08)
[Clang 12.0.0 (clang-1200.0.32.28)]
python-bits: 64
OS: Darwin
OS-release: 19.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.16.3.dev48+gbf0fe2ca
pandas: 1.1.3
numpy: 1.19.2
scipy: 1.5.3
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.5.0
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.30.0
distributed: None
matplotlib: 3.3.2
cartopy: None
seaborn: 0.11.0
numbagg: installed
pint: 0.16.1
setuptools: 51.1.1
pip: 20.3.3
conda: None
pytest: 6.1.1
IPython: 7.19.0
sphinx: None