Question regarding precision of datetime arithmetic #109

@spencerkclark

We are working on a cftime-compatible version of pandas' resample functionality in xarray (pydata/xarray#2593). In order to accomplish this, we are making heavy use of the datetime arithmetic functionality in cftime. There are two contexts where this occurs:

  • Adding a datetime.timedelta object to a cftime.datetime object to produce another cftime.datetime object.
  • Taking the difference between two cftime.datetime objects to produce a datetime.timedelta object.

It is my understanding that in the first context integer arithmetic is used, so the result is microsecond-precise. In the second context, however, things follow a different code path (using date2num), and the arithmetic is not necessarily exact. In my tests on a Mac this issue seems to have mostly gone away in cftime version 1.0.3.4; however, @jwenfai has found that issues persist on Windows.
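To give a sense of why a float-based path can lose microseconds, here is a minimal sketch. It assumes, purely for illustration, that a date is encoded as a float64 count of days since some reference date. The helper name `float_days_resolution_us` is hypothetical, and this is not meant to describe cftime's actual implementation.

```python
import math

MICROSECONDS_PER_DAY = 86400 * 10**6


def float_days_resolution_us(days):
    """Spacing between adjacent float64 values near `days`, in microseconds.

    Hypothetical helper for illustration only; not part of cftime.
    Requires Python 3.9+ for math.ulp.
    """
    return math.ulp(float(days)) * MICROSECONDS_PER_DAY


# Roughly 2000 years' worth of days since the (assumed) reference date.
resolution = float_days_resolution_us(730000)
print(f"{resolution:.2f} microseconds between representable values")
```

Under that assumption, the gap between neighbouring representable values is on the order of 10 microseconds for dates about 2000 years from the reference date, so a difference computed from two such floats cannot in general be microsecond-exact.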

We are finding that having an exact result for the difference between two datetimes is important for our new use case. One potential solution is to write a function that computes exact datetime differences.

Here is a potential function I came up with. Do you think that this would be a safe workaround for this issue, or might there be issues I'm not anticipating? Thanks for your help.

from datetime import timedelta

def exact_cftime_datetime_difference(a, b):
    """Exact computation of b - a

    Assumes:

        a = a_0 + a_m
        b = b_0 + b_m

    Here a_0 and b_0 represent the input dates rounded
    down to the nearest second, and a_m and b_m represent
    the remaining microseconds associated with date a and
    date b.

    We can then express the value of b - a as:

        b - a = (b_0 + b_m) - (a_0 + a_m) = b_0 - a_0 + b_m - a_m
    
    By construction, we know that b_0 - a_0 must be a round number
    of seconds.  Therefore we can take the result of b_0 - a_0 using
    ordinary cftime.datetime arithmetic and round to the nearest
    second.  b_m - a_m is the remainder, in microseconds, and we
    can simply add this to the rounded timedelta.

    Parameters
    ----------
    a : cftime.datetime
        Input datetime
    b : cftime.datetime
        Input datetime

    Returns
    -------
    datetime.timedelta
    """
    seconds = b.replace(microsecond=0) - a.replace(microsecond=0)
    seconds = int(round(seconds.total_seconds()))
    microseconds = b.microsecond - a.microsecond
    return timedelta(seconds=seconds, microseconds=microseconds)
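One case worth checking is when b.microsecond - a.microsecond is negative. The sketch below is a quick sanity check of that case. It uses the standard library's datetime.datetime as a stand-in for cftime.datetime (an assumption, made because it offers the same replace and microsecond interface and has exact subtraction to compare against), and it repeats the function body under the hypothetical name `exact_difference` so that the snippet runs on its own.

```python
from datetime import datetime, timedelta


def exact_difference(a, b):
    # Same logic as exact_cftime_datetime_difference, repeated here
    # so that the snippet is self-contained.
    seconds = b.replace(microsecond=0) - a.replace(microsecond=0)
    seconds = int(round(seconds.total_seconds()))
    microseconds = b.microsecond - a.microsecond
    return timedelta(seconds=seconds, microseconds=microseconds)


# Stand-in inputs: datetime.datetime rather than cftime.datetime.
a = datetime(2000, 1, 1, 0, 0, 0, 750000)
b = datetime(2000, 1, 1, 0, 0, 1, 250000)

# b.microsecond - a.microsecond is -500000 here; timedelta normalizes
# negative microseconds, so the result is still half a second.
result = exact_difference(a, b)
assert result == b - a
assert result == timedelta(microseconds=500000)

# Swapping the arguments should flip the sign.
assert exact_difference(b, a) == a - b
```

This only confirms that the decomposition agrees with exact arithmetic on the stand-in type; whether the rounding step is always safe for cftime.datetime inputs is the open question.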
