alternator: improve, document and test table/index name lengths#24597
alternator: improve, document and test table/index name lengths#24597scylladb-promoter merged 2 commits intoscylladb:masterfrom
Conversation
Docs Preview 📖Docs Preview for this pull request is available here Changed Files: Note: This preview will be available for 30 days and will be automatically deleted after that period. You can manually trigger a new build by committing changes. |
🟢 CI State: SUCCESS✅ - Framework test Build Details:
|
|
|
||
| from test.alternator.util import unique_table_name, create_test_table, new_test_table, random_string, freeze, list_tables | ||
| from .util import unique_table_name, create_test_table, new_test_table, random_string, freeze, list_tables, unique_table_name | ||
|
|
There was a problem hiding this comment.
unique_table_name is imported twice
|
Pushed new version just rebased and fixed one silly double-import of the same function |
|
@scylladb/releng 15 hours later, the CI didn't complete here. I guess there's some problem. |
Yes, basically the CI passed all the relevant steps - https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/17845/ |
ScyllaPiotr
left a comment
There was a problem hiding this comment.
Congrats on the bug with enabling streams!
I'm requesting changes because there is a tiny bug here (or I'm wrong).
Other than that, only small suggestions from my side.
| } | ||
| std::string ret = std::string(table_name) + delim + std::string(index_name); | ||
| if (ret.length() > max_table_name_length) { | ||
| if (ret.length() > max_table_name_length && validate) { |
There was a problem hiding this comment.
I would reorder the operands, but honestly I'm being picky here.
There was a problem hiding this comment.
I sort-of deliberately chose this order, because validate is almost always true (previously validate=true) and the first condition is almost always false, so it's a tiny tiny tiny bit faster to notice immediately that the first condition is false instead of checking first that validate==true and only then checking the condition.
| // The view_name() function assumes the table_name has already been validated | ||
| // but validates the legality of index_name and the combination of both. | ||
| static std::string view_name(std::string_view table_name, std::string_view index_name, const std::string& delim = ":") { | ||
| static std::string view_name(std::string_view table_name, std::string_view index_name, const std::string& delim = ":", bool validate = true) { |
There was a problem hiding this comment.
The function validates the name chars and name length. Since validate pertains only to length, let's call it validate_view_name_len.
There was a problem hiding this comment.
I'll just call this one (and below) validate_len. I'm not a fan of super-long names, and the "view_name" (and below, "lsi_name") is redundant because that's the name of this function
There was a problem hiding this comment.
With table_name and index_name also here it's still not immediately clear, but I'm close to being unreasonable on this. So let's go your way.
|
|
||
| static std::string lsi_name(std::string_view table_name, std::string_view index_name) { | ||
| return view_name(table_name, index_name, "!:"); | ||
| static std::string lsi_name(std::string_view table_name, std::string_view index_name, bool validate = true) { |
There was a problem hiding this comment.
Let's rename validate into validate_lsi_name_len.
There was a problem hiding this comment.
Renamed to validate_len ("lsi_name" is already the function's name, it doesn't help to repeat it).
| "GSI {} already exists in table {}", index_name, table_name)); | ||
| } | ||
| if (p.local().data_dictionary().has_schema(keyspace_name, lsi_name(table_name, index_name))) { | ||
| if (p.local().data_dictionary().has_schema(keyspace_name, lsi_name(table_name, index_name, false))) { |
There was a problem hiding this comment.
I don't get it, why do you make executor::update_table stop checking the resulting lsi name length altogether?
There was a problem hiding this comment.
The code adds a GSI (you can only add a GSI in UpdateTable) so it already validates the validity of the GSI name. It was never the intention of this code to check if it's a valid LSI name and report an error about a bad LSI name. The only intention was to try to check if an LSI of this name exists (and if it does, report an error). So we want to look up what an LSI with that name have as its name, without validation.
Note that whether or not the lsi_name() function validates valid characters is not important - we already validated them in gsi_name() earlier. The only place this change ever matter is when the length of the table name plus index name plus 1 is exactly 222 - because this is a valid GSI name but not a valid LSI name (because of the silly reason that we used two characters instead of one as a separator for LSIs).
There was a problem hiding this comment.
That's clear, thanks.
Although it would be more prudent to design the lsi_name() with a parameter bool validate (the name you had there before) that would enable/disable all validations, not only the length - ultimately that's what is aimed here - just gimme name, no validation!
| # If we subtract from 222 the table's name length (we assume that | ||
| # unique_table_name() always returns the same length) and an extra 1, | ||
| # this is how long the GSI's name may be: | ||
| max = 222 - len(unique_table_name()) - 1 |
There was a problem hiding this comment.
Let's call max max_gsi_name_len instead.
There was a problem hiding this comment.
I think "max_gsi_name_len" sounds too general - it sounds like it might be the maximum allowed gsi name length for all GSIs, which isn't true - this is only the maximum for GSIs of this specific table (the formula subtract this table's length from it).
In any case, I think that after a 3-line comment explaining what "max" is, a shorter name is fine. The comment explains much better than a name of any length could have explained.
There was a problem hiding this comment.
"max_gsi_name_len" sounds too general - it sounds like it might be the maximum allowed gsi name length for all GSIs
I don't agree, the variable is local, so is its context. And it's better than the general max.
There was a problem hiding this comment.
This test has 4 lines of code, and 3 lines of comment explaining what "max" is. I honestly don't understand why its name matters.
But moreover, I argue that "max_gsi_name_len" is a wrong name, because it suggests there is some "maximum gsi name length" while there isn't such a thing, there is only some maximum size related to this test. I prefer the reader doesn't know exactly which "max" this is an reads the comment, over the user thinking the variable name is self explanatory and coming to the wrong conclusion!
|
|
||
| def test_gsi_updatetable_very_long_name_256(table1): | ||
| with pytest.raises(ClientError, match='ValidationException'): | ||
| add_and_delete_gsi(table1, 'n' * 255) |
There was a problem hiding this comment.
Yikes, you're right. Turns out I forgot to run this specific test on AWS, so I missed this copy-pasto :-( The test as written with 255 fails on AWS, because GSI name length 255 is allowed on DynamoDB, not forbidden. (on Scylla it's forbidden - even shorter names are forbidden - so this test passes.
There was a problem hiding this comment.
I suspected that in the end you did not run this against AWS, otherwise you would have caught that yourself.
| def test_gsi_updatetable_very_long_name_222(table1, scylla_only): | ||
| # If we subtract from 222 the table's name length and an extra 1, | ||
| # this is how long the GSI's name may be: | ||
| max = 222 - len(table1.name) - 1 |
There was a problem hiding this comment.
Let's call max max_gsi_name_len instead.
There was a problem hiding this comment.
See my answer above about "max"
| # If we subtract from 222 the table's name length (we assume that | ||
| # unique_table_name() always returns the same length) and an extra 2, | ||
| # this is how long the LSI's name may be: | ||
| max = 222 - len(unique_table_name()) - 2 |
There was a problem hiding this comment.
Let's call max max_lsi_name_len instead.
| create_and_delete_table(dynamodb, 'n' * 255) | ||
| def test_create_and_delete_table_256(dynamodb): | ||
| with pytest.raises(ClientError, match='ValidationException'): | ||
| create_table(dynamodb, 'n' * 256) |
There was a problem hiding this comment.
Why not create_and_delete_table here too, especially that the top-level function name promises it? I know this will raise, but suppose it does not raise for some reason.
There was a problem hiding this comment.
Yes, makes sense. I don't remember why I did it this way.
|
Pushed a new version addressing some of the review comments (and answering others in the review). Also I hope that the new version will finally get a working CI. |
annastuchlik
left a comment
There was a problem hiding this comment.
The doc update looks good.
ScyllaPiotr
left a comment
There was a problem hiding this comment.
With no bug anymore, no reason to hold this back. The open comments are subjective, so I leave it to @nyh to agree with them or not.
|
I'd really like this patch to be merged but I see the CI never completed. @scylladb/releng is CI still non-functional, or is it fixed and I need to manually restart the CI? |
I restarted the CI. I hope it will work. |
Whereas DynamoDB limits the names of tables, LSIs and GSIs to 255
characters each, Alternator currently has different (and lower)
limitations:
1. A table name must be up to 222 characters.
2. For a GSI, the sum of the table's and GSI's name length, plus 1,
must be up to 222 characters.
3. For an LSI, the sum of the table's and LSI's name length, plus 2,
must be up to 222 characters.
These specific limitations were never documented, so in this patch we
add this information to docs/alternator/compatibility.md.
Moreover, these limitations where only partially tested, so in this patch
we add testing for more cases that we forgot to check - such as length
of LSI names (only GSI were checked before this patch), or adding a
GSI to an existing table. It is important to check all these corner
cases because there is a risk that if we attempt to create a table
without checking its length, we can end up with an I/O error that brings
down Scylla.
In one case - UpdateTable adding a GSI to an existing table - the new
test exposed a trivial bug: Because UpdateTable wants to verify the new
GSI doesn't have the same name as an existing LSI, it mistakenly applied
the LSI's length name limit instead of the GSI's name length limit,
which is one byte less than it should be. So this patch fixes this
trivial bug as well.
Signed-off-by: Nadav Har'El <[email protected]>
The two tests in this patch reproduce issue scylladb#24598: When enabling Alternator streams on an Alternator table with a very long name, such as the maximum allowed name length 222, the result is an I/O error and a Scylla shutdown. The two tests are currently marked "skip", otherwise they would crash the Scylla being tested. Refs scylladb#24598 Signed-off-by: Nadav Har'El <[email protected]>
|
CI was hung so rebased to get it running again. |
🟢 CI State: SUCCESS
Build Details:
|
|
@scylladb/scylla-maint please consider merging. This PR has 3 approving reviews, and apparently (I don't know if to believe it any more) it passed CI successfully an hour ago. |
Whereas DynamoDB limits the names of tables, LSIs and GSIs to 255 characters each, Alternator currently has different (and lower) limitations:
The first patch documents these existing limitations, improves their testing, and fixes a tiny bug found by one of the tests (where UpdateTable adding a GSI's limit testing is off by one).
The second patch unfortunately shows with a reproducer (issue #24598) this limit of 222 is problematic and we may need to lower it: If a user creates a table of length 222 and then enables Alternator streams, Scylla shuts down on an IO error. This will need to be fixed later, but at least this patch properly documents the existing behavior.
No need to backport this patch - it is a very minor improvement that it is unlikely users care about and there is no potential for harm.