Should the (field) equality operation be noncommutative? #133

@sadielbartholomew


There are cases where a.equals(b) evaluates differently to b.equals(a), at least when a and b are fields. I noticed this recently during the work towards append mode (#69): it happens when the two fields are the same except that one is missing a component. In other words, our equals method appears not to be commutative (symmetric) for certain operands.

See below for details of the particular case I observed (confirmed on master), where the result depends on which field the method is called on and which field is passed as the parameter.

This raises some questions for me. I found the difference in output between a * b and b * a (with * standing for the equals operation) confusing, given that equality in a logical sense should, to me, imply commutative behaviour, as it certainly does in a mathematical sense. The relevant parts of the documentation did not seem to say whether the equality method should be symmetric or not, but I thought:

Equality is strict by default.

suggests it should, though perhaps (my emphasis):

Any type of object may be tested but, in general, equality is only possible with another object of the same type, or a subclass of one

could be relevant?

My questions are:

  1. Is a difference in result for a.equals(b) and b.equals(a) something that should be possible, particularly in cases such as that outlined below, or is it a bug we should fix?
  2. In either case, I think we should add a few lines to the documentation stating explicitly whether these cases are possible, for which inputs, and which constructs the equals method applies to, so there is no ambiguity.
  3. If it is a bug and we enforce symmetric behaviour, could we, and should we, make it configurable, so that a field and another that is the same but reduced can be treated as equal if users want that in some context, e.g. with a kwarg called something like accept_subset?
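To make question 3 concrete, here is a minimal sketch of what a symmetric equals with an opt-in accept_subset could look like. It is a toy model only: ToyField, its constructs dictionary, and _matches_into are hypothetical names invented for illustration, and this is not how the library's equals is implemented. It assumes the asymmetry comes from a one-directional check over the constructs of the field the method is called on.

```python
class ToyField:
    """Hypothetical stand-in for a field: a mapping of construct name to value."""

    def __init__(self, constructs):
        self.constructs = dict(constructs)

    def _matches_into(self, other):
        # One-directional check: every construct of self has an equal
        # counterpart in other. Extra constructs in other are ignored,
        # which is what makes this check asymmetric on its own.
        return all(
            name in other.constructs and other.constructs[name] == value
            for name, value in self.constructs.items()
        )

    def equals(self, other, accept_subset=False):
        # accept_subset is the hypothetical kwarg floated in question 3.
        forward = self._matches_into(other)
        backward = other._matches_into(self)
        if accept_subset:
            # Equal if either field's constructs are a subset of the other's.
            return forward or backward
        # Strict: both directions must match, so the result is symmetric.
        return forward and backward


# a has a time construct; b is the same but reduced.
a = ToyField({"time": 36, "latitude": 5, "longitude": 8, "air_pressure": 1})
b = ToyField({"latitude": 5, "longitude": 8, "air_pressure": 1})

print(b._matches_into(a), a._matches_into(b))  # the two one-way results differ
print(a.equals(b), b.equals(a))  # strict mode gives the same answer both ways
print(a.equals(b, accept_subset=True), b.equals(a, accept_subset=True))
```

Either mode is symmetric by construction because both directions are always evaluated; the kwarg only changes whether they are combined with "and" or "or".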

Example case

In this example, field b is the same as field a except that it is missing the time dimension coordinate:

...
>>> a.dump()
------------------------------------------------------------------
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Conventions = 'CF-1.8'
standard_name = 'air_potential_temperature'
units = 'K'

Data(time(36), latitude(5), longitude(8)) = [[[210.7, ..., 286.6]]] K

Cell Method: area: mean

Domain Axis: air_pressure(1)
Domain Axis: latitude(5)
Domain Axis: longitude(8)
Domain Axis: time(36)

Dimension coordinate: time
    standard_name = 'time'
    units = 'days since 1959-01-01'
    Data(time(36)) = [1959-12-16 12:00:00, ..., 1962-11-16 00:00:00]
    Bounds:Data(time(36), 2) = [[1959-12-01 00:00:00, ..., 1962-12-01 00:00:00]]

Dimension coordinate: latitude
    standard_name = 'latitude'
    units = 'degrees_north'
    Data(latitude(5)) = [-75.0, ..., 75.0] degrees_north
    Bounds:Data(latitude(5), 2) = [[-90.0, ..., 90.0]] degrees_north

Dimension coordinate: longitude
    standard_name = 'longitude'
    units = 'degrees_east'
    Data(longitude(8)) = [22.5, ..., 337.5] degrees_east
    Bounds:Data(longitude(8), 2) = [[0.0, ..., 360.0]] degrees_east

Dimension coordinate: air_pressure
    standard_name = 'air_pressure'
    units = 'hPa'
    Data(air_pressure(1)) = [850.0] hPa
>>> b.dump()
--------------------------------------------------------------------
Field: air_potential_temperature (ncvar%air_potential_temperature_1)
--------------------------------------------------------------------
Conventions = 'CF-1.8'
standard_name = 'air_potential_temperature'
units = 'K'

Data(ncdim%time_1(36), latitude(5), longitude(8)) = [[[210.7, ..., 286.6]]] K

Cell Method: area: mean

Domain Axis: air_pressure(1)
Domain Axis: latitude(5)
Domain Axis: longitude(8)
Domain Axis: ncdim%time_1(36)

Dimension coordinate: latitude
    standard_name = 'latitude'
    units = 'degrees_north'
    Data(latitude(5)) = [-75.0, ..., 75.0] degrees_north
    Bounds:Data(latitude(5), 2) = [[-90.0, ..., 90.0]] degrees_north

Dimension coordinate: longitude
    standard_name = 'longitude'
    units = 'degrees_east'
    Data(longitude(8)) = [22.5, ..., 337.5] degrees_east
    Bounds:Data(longitude(8), 2) = [[0.0, ..., 360.0]] degrees_east

Dimension coordinate: air_pressure
    standard_name = 'air_pressure'
    units = 'hPa'
    Data(air_pressure(1)) = [850.0] hPa
>>> a.equals(b)
False
>>> b.equals(a)
True
>>> a.equals(b, verbose=-1)
Constructs: Comparing <DimensionCoordinate: time(36) days since 1959-01-01 >, <DimensionCoordinate: latitude(5) degrees_north>: 
Constructs: Can't match constructs spanning axes ['time']
Constructs: Can't match <DimensionCoordinate: time(36) days since 1959-01-01 >
Constructs: Can't match <DimensionCoordinate: time(36) days since 1959-01-01 >
Constructs: Can't match <DimensionCoordinate: time(36) days since 1959-01-01 >
Field: Different metadata constructs
False
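If symmetry is the intended behaviour, a small helper like the one below could be used in a regression test to catch pairs for which the two call orders disagree. This is only a sketch: equals_is_symmetric and SubsetEquals are hypothetical names, and SubsetEquals is a deliberately one-directional toy class used to exercise the helper, not a model of the real field class. The helper itself relies only on both objects having an equals method.

```python
def equals_is_symmetric(x, y, **kwargs):
    """Return True if x.equals(y) and y.equals(x) give the same answer."""
    return bool(x.equals(y, **kwargs)) == bool(y.equals(x, **kwargs))


class SubsetEquals:
    """Hypothetical object whose equals() only checks one direction."""

    def __init__(self, items):
        self.items = frozenset(items)

    def equals(self, other):
        # True whenever self's items are all present in other,
        # even if other has extra items.
        return self.items <= other.items


full = SubsetEquals({"time", "latitude", "longitude"})
reduced = SubsetEquals({"latitude", "longitude"})

print(full.equals(reduced), reduced.equals(full))  # the call orders disagree
print(equals_is_symmetric(full, reduced))
print(equals_is_symmetric(full, full))
```

With real fields, the same helper could be called as equals_is_symmetric(a, b) on the pair shown above.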

Labels: bug, question