-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Sample by 1M align to calendar with negative offset groups data incorrectly #6301
Copy link
Copy link
Closed
Closed
Copy link
Description
To reproduce
It seems the OFFSET option is interacting incorrectly with the alignment to calendar for Month timeframe.
See the two minimal replication examples
-- Query 2: POSITIVE OFFSET -> will return the expected 6 rows of data
SELECT
date,
COUNT(*) AS row_count,
SUM(value) AS total_value
FROM (
SELECT cast('2024-01-15T12:00:00.000000Z' as timestamp) as date, 100.0 as value FROM long_sequence(1)
UNION ALL
SELECT cast('2024-02-15T12:00:00.000000Z' as timestamp), 200.0 FROM long_sequence(1)
UNION ALL
SELECT cast('2024-03-15T12:00:00.000000Z' as timestamp), 300.0 FROM long_sequence(1)
UNION ALL
SELECT cast('2024-04-15T12:00:00.000000Z' as timestamp), 400.0 FROM long_sequence(1)
UNION ALL
SELECT cast('2024-05-15T12:00:00.000000Z' as timestamp), 500.0 FROM long_sequence(1)
UNION ALL
SELECT cast('2024-06-15T12:00:00.000000Z' as timestamp), 600.0 FROM long_sequence(1)
ORDER BY date
)
SAMPLE BY 1M ALIGN TO CALENDAR WITH OFFSET '00:05';
-- Returns
-- date row_count total_value
-- 2024-01-01T00:05:00.000000Z 1 100.0
-- 2024-02-01T00:05:00.000000Z 1 200.0
-- 2024-03-01T00:05:00.000000Z 1 300.0
-- 2024-04-01T00:05:00.000000Z 1 400.0
-- 2024-05-01T00:05:00.000000Z 1 500.0
-- 2024-06-01T00:05:00.000000Z 1 600.0
Query 1: NEGATIVE OFFSET -> this will return only 3 rows of data (1 every 2 months) where it groups the data of two months together
SELECT
date,
COUNT(*) AS row_count,
SUM(value) AS total_value
FROM (
SELECT cast('2024-01-15T12:00:00.000000Z' as timestamp) as date, 100.0 as value FROM long_sequence(1)
UNION ALL
SELECT cast('2024-02-15T12:00:00.000000Z' as timestamp), 200.0 FROM long_sequence(1)
UNION ALL
SELECT cast('2024-03-15T12:00:00.000000Z' as timestamp), 300.0 FROM long_sequence(1)
UNION ALL
SELECT cast('2024-04-15T12:00:00.000000Z' as timestamp), 400.0 FROM long_sequence(1)
UNION ALL
SELECT cast('2024-05-15T12:00:00.000000Z' as timestamp), 500.0 FROM long_sequence(1)
UNION ALL
SELECT cast('2024-06-15T12:00:00.000000Z' as timestamp), 600.0 FROM long_sequence(1)
ORDER BY date
)
SAMPLE BY 1M ALIGN TO CALENDAR WITH OFFSET '-00:05';
-- Returns
-- date row_count total_value
-- 2024-01-31T23:55:00.000000Z 2 300.0
-- 2024-03-31T23:55:00.000000Z 2 700.0
-- 2024-05-31T23:55:00.000000Z 2 1100.0
I've not seen any mention of this bug being fixed in any more recent versions in the release notes, so figured worth posting here.
QuestDB version:
8.1.4
UPDATE also true in version 9.1.2
OS, in case of Docker specify Docker and the Host OS:
Docker version 20.10.27, Container-Optimized OS from Google version 101
File System, in case of Docker specify Host File System:
unsure
Full Name:
Jonas De Beukelaer
Affiliation:
Prediko
Have you followed Linux, MacOs kernel configuration steps to increase Maximum open files and Maximum virtual memory areas limit?
- Yes, I have
Additional context
No response
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels