cql, schema: Extend keyspace, table, views, indexes name length limit from 48 to 192 bytes#24500
Conversation
fd6d7d0 to
d953be7
Compare
|
Congrats on your 1st PR! 👏 A couple of general comments from me. It's not customary to put issue number into the title of the PR, but rather a module name or scope name. Also, following my suggestion in the issue thread, let's rename the PR accordingly. Furthermore, you write in the cover letter, that
Could you name a few such scenarios, just for clarity? Or should it be changed into some usage scenarios could benefit from relaxing this constraint? Otherwise, congrats on the cover letter, it's very communicative! |
|
Also, a PR requires a backport label. This current one IMO fits into |
d953be7 to
c6df135
Compare
6212bdc to
4fe29ff
Compare
Fixed
Done |
Set backport/2025.2 as @swasik did in #24295
Fixed
Done in pull request description only. |
ScyllaPiotr
left a comment
There was a problem hiding this comment.
Very nice suite of tests and corrections, congrats!
Below you'll find some questions and remarks.
@swasik @ewienik do you intend to backport VS to 2025.2? The current justification for backport is: needed to make ScyllaDB work with some of the AI workflows. |
Vector Search is a new feature, should not be backported. This issue is not a part of Vector Search, I assume there are other AI worflows which needs to be supported in previous versions. |
ec85f67 to
5caebf0
Compare
094892c to
43721df
Compare
@nyh PR prepared https://github.com/scylladb/scylla-dtest/pull/5975. Is it possible to rerun CI on this PR using dtest from https://github.com/scylladb/scylla-dtest/pull/5975 branch? |
Thanks. I merged your dtest PR. I propose that we just wait until that change promotes in the dtest repository, and then you can rerun this CI without any special trickery. It's possible to tell CI to use a different branch of dtest, but I don't want to do that, because if I do that and merge this PR - then the "next promotion" stage, which also runs dtests, will fail. Let's just do it the slow and sure way. |
15 whole bytes! Ok. |
🔴 CI State: FAILURE✅ - Framework test Failed Tests (3/41995):Build Details:
|
🟢 CI State: SUCCESS❌ - Framework test Build Details:
|
So how the framework test failed. This and this show only |
🟢 CI State: SUCCESS✅ - Framework test Build Details:
|
|
After manually triggered rebuild on Jenkins dtest related jobs with |
@avikivity can you please be more explicit if you think this number 192 is ok, or not really ok? Note that the real filesystem limit is 222, so we actually left 30 bytes "spare". CDC uses 15 out of these spare, but it's not additive - if some other feature, e.g., Paxos or Feature X needs to create its own table, it can also use the whole extra 30 bytes in the new table's name and it doesn't care that CDC used a 15 byte suffix. |
It's ok.
Right. |
…h limit from 48 to 192 bytes' from Karol Nowacki
cql, schema: Extend name length limit from 48 to 192 bytes
This commit increases the maximum length of names for keyspaces, tables, materialized views, and indexes from 48 to 192 bytes.
The previous 48-bytes limit was inherited from Cassandra 3 for compatibility. However, this validation was removed in Cassandra 4 and 5 (see CASSANDRA-20389)
and some usage scenarios (such as some feature store workflows generating long table names) now depend on this relaxed constraint.
This change brings ScyllaDB's behavior in line with modern Cassandra versions and better supports these use cases.
The new limit of 192 bytes is derived from underlying filesystem limitations to prevent runtime errors when creating directories for table data.
When a new table is created, ScyllaDB generates a directory for its SSTables. The directory name is constructed from the table name, a dash, and a 32-character UUID.
For a CDC-enabled table, an associated log table is also created, which has the suffix `_scylla_cdc_log` appended to its name.
The directory name for this log table becomes the longest possible representation.
Additionally we reserve 15 bytes for future use, allowing for potential future extensions without breaking existing schemas.
To guarantee that directory creation never fails due to exceeding filesystem name limits, the maximum name length is calculated as follows:
255 bytes (common filesystem limit for a path component)
- 32 bytes (for the 32-character UUID string)
- 1 byte (for the '-' separator)
- 15 bytes (for the '_scylla_cdc_log' suffix)
- 15 bytes (reserved for future use)
----------
= 192 bytes (Maximum allowed name length)
This calculation is similar in principle to the one proposed for Cassandra to fix related directory creation failures (see apache/cassandra/pull/4038).
This patch also updates/adds all associated tests to validate the new 192-byte limit.
The documentation has been updated accordingly.
Fixes #4480
Backport 2025.2: The significantly shorter maximum table name length in Scylla compared to Cassandra is becoming a more common issue for users in the latest release.
Closes #24500
* github.com:scylladb/scylladb:
cql, schema: Extend name length limit from 48 to 192 bytes
replica: Remove unused keyspace::init_storage()
|
|
|
|
Yes, it was for external framework that allows implementing feature stores using various DBs: https://github.com/featureform/featureform - we want to have working integration ASAP and that is why the backport is needed. |
…indexes name length limit from 48 to 192 bytes' from Scylladb[bot]
cql, schema: Extend name length limit from 48 to 192 bytes
This commit increases the maximum length of names for keyspaces, tables, materialized views, and indexes from 48 to 192 bytes.
The previous 48-bytes limit was inherited from Cassandra 3 for compatibility. However, this validation was removed in Cassandra 4 and 5 (see CASSANDRA-20389)
and some usage scenarios (such as some feature store workflows generating long table names) now depend on this relaxed constraint.
This change brings ScyllaDB's behavior in line with modern Cassandra versions and better supports these use cases.
The new limit of 192 bytes is derived from underlying filesystem limitations to prevent runtime errors when creating directories for table data.
When a new table is created, ScyllaDB generates a directory for its SSTables. The directory name is constructed from the table name, a dash, and a 32-character UUID.
For a CDC-enabled table, an associated log table is also created, which has the suffix `_scylla_cdc_log` appended to its name.
The directory name for this log table becomes the longest possible representation.
Additionally we reserve 15 bytes for future use, allowing for potential future extensions without breaking existing schemas.
To guarantee that directory creation never fails due to exceeding filesystem name limits, the maximum name length is calculated as follows:
255 bytes (common filesystem limit for a path component)
- 32 bytes (for the 32-character UUID string)
- 1 byte (for the '-' separator)
- 15 bytes (for the '_scylla_cdc_log' suffix)
- 15 bytes (reserved for future use)
----------
= 192 bytes (Maximum allowed name length)
This calculation is similar in principle to the one proposed for Cassandra to fix related directory creation failures (see apache/cassandra/pull/4038).
This patch also updates/adds all associated tests to validate the new 192-byte limit.
The documentation has been updated accordingly.
Fixes #4480
Backport 2025.2: The significantly shorter maximum table name length in Scylla compared to Cassandra is becoming a more common issue for users in the latest release.
- (cherry picked from commit a41c12c)
- (cherry picked from commit 4577c66)
Parent PR: #24500
Closes #24603
* github.com:scylladb/scylladb:
cql, schema: Extend name length limit from 48 to 192 bytes
replica: Remove unused keyspace::init_storage()
Fixes #4480
Backport 2025.2: The significantly shorter maximum table name length in Scylla compared to Cassandra is becoming a more common issue for users in the latest release.