Skip to content

Ensure time_to_convert bins stay uniform#5316

Merged
EDsCODE merged 2 commits intomasterfrom
timetoconvert_bins
Jul 23, 2021
Merged

Ensure time_to_convert bins stay uniform#5316
EDsCODE merged 2 commits intomasterfrom
timetoconvert_bins

Conversation

@neilkakkar
Copy link
Copy Markdown
Contributor

Changes

https://posthog.slack.com/archives/C01G8S5T08Z/p1626980526060300?thread_ts=1626980124.058400&cid=C01G8S5T08Z - sometimes, the bin sizes can stretch out unreasonably if the time to convert values are incorrect (similar to #5116 ).

This ensures we discard those values, and stick to the "nice" buckets, which ensures the frontend doesn't look like: https://posthog.slack.com/files/U01SAS8J92Q/F029M7CN7LY/screen_cast_2021-07-22_at_2.43.10_pm.gif

Checklist

  • All querysets/queries filter by Organization, by Team, and by User
  • Django backend tests
  • Jest frontend tests
  • Cypress end-to-end tests
  • Migrations are safe to run at scale (e.g. PostHog Cloud) – present proof if not obvious
  • New/changed UI is decent on smartphones (viewport width around 360px)

@neilkakkar neilkakkar requested review from EDsCODE and macobo July 23, 2021 12:20
@macobo
Copy link
Copy Markdown
Contributor

macobo commented Jul 23, 2021

Me, with 0% context on the query:

Full outer joins and right outer joins are semantically really different. What is on the LEFT, what is the historic context for FULL OUTER JOIN, why is it incorrect and could it be captured in a test?

@neilkakkar
Copy link
Copy Markdown
Contributor Author

The way this query is constructed, right & full outer join should be equivalent: On the left, you have bin start times & counts, on the right you have all possible bin start times. (counts are filled to 0).

(left outer join is not equivalent because some bins might be empty on the left, since 0 count, which aren't captured, and we aim to return a uniform bucket list)

With full outer join, since we union left and right, some discrepancies in left might come, like the conversion times being outside of our calculated range (because of the sporadic weirdness in conversion times - step_runs CTE being run again here). This isn't possible on the right, since it's counting all the bins from to with the same pre-calculated range.

Hard to capture in a test (intermittent. I haven't seen it happen except in the slack example), but since all 6 existing tests pass, I can sortof convince you it's equivalent.

@EDsCODE EDsCODE merged commit 259d12d into master Jul 23, 2021
@EDsCODE EDsCODE deleted the timetoconvert_bins branch July 23, 2021 14:40
neilkakkar added a commit that referenced this pull request Jul 29, 2021
macobo pushed a commit that referenced this pull request Jul 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants