texodus (Member) approved these changes on Jan 15, 2020 and left a comment:
Thanks for the PR! Awesome work and great write up!
This PR implements and tests perspective-python's time zone semantics, and explains how Perspective handles time zones.

The Problem
In the browser use case, time zones do not pose a problem, as all times are localized to the browser's time zone. When running perspective-python on the server, however, the server may not be in the same time zone as the client, so time zone handling must be defined.

Currently, perspective-python assumes that all datetime and Timestamp objects are defined in local time, and the tzinfo attribute is ignored. Internally, the C++ engine stores datetime values as Unix timestamps in milliseconds since epoch, and the conversion from datetime to Unix timestamps is not time zone aware.

When datetime values are serialized through to_dict, to_records, etc., PyBind treats the Unix timestamp as local time, and the resulting datetime object has no tzinfo attribute and is in local time.

When the data from Python is consumed by a perspective-viewer in the browser, it is serialized back to a Unix timestamp before being sent over the network, and the browser then calls new Date() on the timestamp value to create the final representation inside the browser.

Because new Date() treats the timestamp as local time on the browser, which could differ from local time on the server, the values passed into perspective-python can differ from the values a user sees in perspective-viewer. The unclear and unspecified semantics around datetime and time zone handling make this harder to diagnose. This PR attempts to codify time zone semantics and explain the behavior.

The Solution
This PR modifies the date validator in Python by converting all time zone aware datetime and Timestamp objects (or any object passed in that has the tzinfo attribute set) to UTC before storing the timestamp into Perspective. Naive datetimes are assumed to be in local time, and will be processed as-is without conversion.

When the timestamp is serialized, it is converted (via PyBind) to local time as determined by the Python runtime. perspective-viewer will continue to treat new Date() in local time as determined by the browser.

This implementation does not change the behavior of perspective-python for naive datetimes, but it allows us to use aware datetimes and view them in local time:

The same datetime, but as a naive datetime:
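
As a rough illustration of the rule described in this section, the sketch below uses only the Python standard library. The helpers `to_engine_millis` and `from_engine_millis` are hypothetical names invented for this example and are not part of perspective-python. They only mimic the described behavior: aware values are normalized to UTC before the timestamp is taken, naive values are interpreted as local time, and serialization yields a naive datetime in local time. The fixed UTC-5 offset is likewise an assumption chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_engine_millis(value: datetime) -> int:
    """Hypothetical stand-in for the date validator: milliseconds since epoch."""
    if value.tzinfo is not None and value.utcoffset() is not None:
        # Aware datetime: normalize to UTC before taking the timestamp.
        value = value.astimezone(timezone.utc)
    # Naive datetime: datetime.timestamp() interprets it as local time.
    return round(value.timestamp() * 1000)


def from_engine_millis(millis: int) -> datetime:
    """Hypothetical stand-in for serialization: naive datetime in local time."""
    return datetime.fromtimestamp(millis / 1000)


# An aware datetime at UTC-5 (fixed offset chosen only for illustration).
aware = datetime(2020, 1, 15, 12, 0, tzinfo=timezone(timedelta(hours=-5)))
# The same wall-clock value as a naive datetime.
naive = datetime(2020, 1, 15, 12, 0)

aware_millis = to_engine_millis(aware)  # the instant 2020-01-15 17:00 UTC
naive_millis = to_engine_millis(naive)  # depends on the machine's time zone

print(aware_millis)
print(from_engine_millis(aware_millis))  # the aware value, shown in local time
print(from_engine_millis(naive_millis))  # the naive value round-trips unchanged
```

The aware value always maps to the same instant regardless of where the code runs, while the naive value maps to whatever instant corresponds to that wall-clock time on the machine running it. That is why the naive case round-trips exactly and the aware case is displayed shifted into local time.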
Caveat: Pandas DataFrames
For Pandas DataFrames, any datetime64 columns will always be treated as UTC and always serialized from Perspective in local time. For timestamps to be serialized exactly as they were entered, use tz_localize or tz_convert from Pandas in order to localize the Timestamps into your local timezone:

Changelog
datetime and pandas.Timestamp objects to UTC
pandas.Timestamps for all of the above