xr.concat concatenates along dimensions that it wasn't asked to #8231

@TomNicholas

What happened?

Here are two toy datasets designed to represent sections of a dataset that has variables living on a staggered grid. This type of dataset is common in fluid modelling (it's why xGCM exists).

import xarray as xr

ds1 = xr.Dataset(
    coords={
        'x_center': ('x_center', [1, 2, 3]),
        'x_outer':  ('x_outer',  [0.5, 1.5, 2.5, 3.5]),  
    },
)

ds2 = xr.Dataset(
    coords={
        'x_center': ('x_center', [4, 5, 6]),
        'x_outer':  ('x_outer',  [4.5, 5.5, 6.5]),  
    },
)

Calling xr.concat on these with dim='x_center' happily concatenates them:

xr.concat([ds1, ds2], dim='x_center')
<xarray.Dataset>
Dimensions:   (x_outer: 7, x_center: 6)
Coordinates:
  * x_outer   (x_outer) float64 0.5 1.5 2.5 3.5 4.5 5.5 6.5
  * x_center  (x_center) int64 1 2 3 4 5 6
Data variables:
    *empty*

but notice that the returned result has been concatenated along both x_center and x_outer.
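
The x_outer values in this result are exactly the sorted union of the two inputs' x_outer labels, which is what an outer join of the two indexes would produce. The following is a minimal sketch in plain Python (no xarray calls; the variable names are only illustrative). It assumes the default join='outer' argument of xr.concat is what aligns the dimension that was not asked for; that mechanism is an assumption, not something confirmed above.

```python
# x_outer labels copied from ds1 and ds2 above
x_outer_1 = [0.5, 1.5, 2.5, 3.5]
x_outer_2 = [4.5, 5.5, 6.5]

# An outer join keeps every label that appears in either index, sorted.
# Assumption: this mimics how the non-concatenated dimension gets aligned.
outer_joined = sorted(set(x_outer_1) | set(x_outer_2))

print(outer_joined)       # [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
print(len(outer_joined))  # 7, matching "x_outer: 7" in the repr above
```

If that is the mechanism, passing join='exact' to xr.concat would be expected to raise instead of silently merging the x_outer labels, though that was not tried here.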

What did you expect to happen?

I did not expect this to work. I definitely didn't expect the datasets to be concatenated along a dimension I didn't ask them to be concatenated along (i.e. x_outer).

Since coords='different' is the default, I expected xr.concat to attempt to concatenate both variables along the x_center dimension, which would succeed for the x_center variable but fail for the x_outer variable. Indeed, if I rename the variables so that they are data variables rather than coordinate variables, that is what happens:

import xarray as xr

ds1 = xr.Dataset(
    data_vars={
        'a': ('x_center', [1, 2, 3]),
        'b':  ('x_outer',  [0.5, 1.5, 2.5, 3.5]),  
    },
)

ds2 = xr.Dataset(
    data_vars={
        'a': ('x_center', [4, 5, 6]),
        'b':  ('x_outer',  [4.5, 5.5, 6.5]),  
    },
)
xr.concat([ds1, ds2], dim='x_center', data_vars='different') 
ValueError: cannot reindex or align along dimension 'x_outer' because of conflicting dimension sizes: {3, 4}
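
To get the same kind of failure for the coordinate version, a caller could compare dimension sizes before concatenating. The helper below is a hypothetical sketch and not part of xarray; the name mismatched_dims and its signature are made up for illustration. It works on plain mappings of dimension name to length, such as those returned by Dataset.sizes.

```python
def mismatched_dims(sizes_list, concat_dim):
    """Return {dim: sorted sizes} for dims other than concat_dim whose sizes differ.

    Hypothetical helper, not part of xarray. Each element of sizes_list
    is a mapping of dimension name -> length, e.g. dict(ds.sizes).
    """
    seen = {}
    for sizes in sizes_list:
        for dim, n in sizes.items():
            if dim != concat_dim:
                # Collect every distinct length seen for this dimension
                seen.setdefault(dim, set()).add(n)
    # Keep only the dimensions that appear with more than one length
    return {dim: sorted(ns) for dim, ns in seen.items() if len(ns) > 1}


# Sizes of ds1 and ds2 from the examples above, written out by hand
sizes1 = {"x_center": 3, "x_outer": 4}
sizes2 = {"x_center": 3, "x_outer": 3}

print(mismatched_dims([sizes1, sizes2], concat_dim="x_center"))
# {'x_outer': [3, 4]}
```

With real datasets this could be called as mismatched_dims([dict(ds.sizes) for ds in (ds1, ds2)], 'x_center'), raising an error whenever the result is non-empty.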

Minimal Complete Verifiable Example

No response

MCVE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

I was trying to create an example for which you would need the automatic combined concat/merge that happens within xr.combine_by_coords.

Environment

xarray 2023.8.0
