Skip to content

test/cqlpy: add test for long table names#23229

Closed
nyh wants to merge 1 commit intoscylladb:masterfrom
nyh:test-4480
Closed

test/cqlpy: add test for long table names#23229
nyh wants to merge 1 commit intoscylladb:masterfrom
nyh:test-4480

Conversation

@nyh
Copy link
Copy Markdown
Contributor

@nyh nyh commented Mar 10, 2025

Scylla inherited a 48-character limit on the length of table (and keyspace) names from Cassandra 3. It turns out that Cassandra 4 and 5 unintentionally dropped this limit (see history lesson in CASSANDRA-20425), and now Cassandra accepts longer table names. Some Cassandra users are using such longer names and disappointed that Scylla doesn't allow them.

This patch includes tests for this feature. One test tries a 48-character table name - it passes on Scylla and all versions of Cassandra. A second test tries a 100-character table name - this one passes on Cassandra version 4 and above (but not on 3), and fails on Scylla so marked "xfail". A third test tries a 500-character table name. This one fails badly on Cassandra (see CASSANDRA-20389), but passes on Scylla today. This test is important because we need to be sure that it continues to pass on Scylla even after the Scylla is fixed to allow the 100-character test.

Refs #4480 - an issue we already have about supporting longer names

Note on the test implementation:
Ideally, the test for a particular table-name length shouldn't just
create the table - it should also make sure we can write table to it
and flush it, i.e., that sstables can get written correctly. But in
practice, these complications are not needed, because in modern Scylla
it is the directory name which contains the table's name, and the
individual sstable files do not contain the table's name. Just creating
the table already creates the long directory name, so that is the part
that needs to be tested. If we created this directory successfully,
later creating the short-named sstables inside it can't fail.

@nyh nyh assigned swasik and guy9 Mar 10, 2025
Comment thread test/cqlpy/test_name.py Outdated
yield
# The user's "with" code is running during the yield. If it didn't
# throw we return from the function - the raises_or_not() passed as
# the "or not" case.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is rises_or_not?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, wrong reference in the comment... I saw it and thought I fixed it... I'll fix it.

@nyh
Copy link
Copy Markdown
Contributor Author

nyh commented Mar 10, 2025

A note to reviewers (which I should probably have mentioned in the commit message):

Ideally, the test for a particular table-name length shouldn't just create the table - it should also make sure we can write table to it and flush it, so sstables get written correctly. But in practice, these complications are not needed, because in modern Scylla it is the directory name which contains the table's name, and the individual sstable files do not contain the table's name. Just creating the table already creates the long directory name, so that is the part that needs to be tested. If we created this directory successfully, later creating the short-named sstables inside it can't fail.

@nyh nyh added the backport/none Backport is not required label Mar 10, 2025
Scylla inherited a 48-character limit on the length of table (and
keyspace) names from Cassandra 3. It turns out that Cassandra 4 and
5 unintentionally dropped this limit (see history lesson in
CASSANDRA-20425), and now Cassandra accepts longer table names.
Some Cassandra users are using such longer names and disappointed
that Scylla doesn't allow them.

This patch includes tests for this feature. One test tries a
48-character table name - it passes on Scylla and all versions
of Cassandra. A second test tries a 100-character table name - this
one passes on Cassandra version 4 and above (but not on 3), and
fails on Scylla so marked "xfail". A third test tries a 500-character
table name. This one fails badly on Cassandra (see CASSANDRA-20389),
but passes on Scylla today. This test is important because we need to
be sure that it continues to pass on Scylla even after the Scylla is
fixed to allow the 100-character test.

Refs scylladb#4480 - an issue we already have about supporting longer names

Note on the test implementation:
Ideally, the test for a particular table-name length shouldn't just
create the table - it should also make sure we can write table to it
and flush it, i.e., that sstables can get written correctly. But in
practice, these complications are not needed, because in modern Scylla
it is the directory name which contains the table's name, and the
individual sstable files do not contain the table's name. Just creating
the table already creates the long directory name, so that is the part
that needs to be tested. If we created this directory successfully,
later creating the short-named sstables inside it can't fail.

Signed-off-by: Nadav Har'El <[email protected]>
@nyh
Copy link
Copy Markdown
Contributor Author

nyh commented Mar 10, 2025

Pushed a new version with fixed comment and an implementation note copied from the "note to reviewers" above.

@scylladb-promoter
Copy link
Copy Markdown
Contributor

🟢 CI State: SUCCESS

✅ - Build
✅ - Unit Tests

Build Details:

  • Duration: 8 hr 3 min
  • Builder: spider4.cloudius-systems.com

Comment thread test/cqlpy/test_name.py
from cassandra.protocol import InvalidRequest
from .util import unique_name

# passes_or_raises() is similar to pytest.raises(), except that while raises()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we have some separate file to store our own extensions to pytest? So that we have all of them in a single place.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have test/cqlpy/util.py.
My philosophy is that we should only move things there if they have multiple users. We should not pretend that every "nice" used-once function should be a library function. If we do, the result will be a library, not a test suite. And it will be a bad library. What often happens (see dtest as an example) is that:

  1. One person creates a "convenience function" everyone will want and puts it in a central file.
  2. The second person can't use this function because it doesn't do exactly what he needs, so creates another function and puts it in a central file.
  3. A third person wants to use these functions, but neither does exactly what he needs, so he adds an options to the first function.
  4. ...
  5. A few years later, you have 3 different functions doing similar functions, each of them have 7 different options to configure them in just the right way.
  6. When someone wants to read tests, it's impossible to understand what you're seeing. Instead of seeing 5 lines of understandable CQL or DynamoDB calls, you see a single function call with 7 different parameters and have no idea what is actually happening.

So I want to move this into util.py is we see we need exactly the same code in multiple places, and only then.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, passes_or_raises() is definitely a good candidate to being promoted to a library. I also used the same code in test/alternator, so it's already two different users (although in two separate test suites), and I tried to write it in a pretty-general way (I think) that isn't very specific to one particular use case.

But it's a slippery slope - should new_named_table() also be moved into a library? What about padded_name()? I vote no, until additional tests would like to use such a utility. At this point we have 1700 test functions in cqlpy, and this was the first time I needed new_named_table() :-) Actually, we already had new_named_table() in test/alternator, but the implementation is completely different (it creates an Alternator table, vs. a CQL table) so it can't be shared anyway.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this some more, an interesting feature of passes_or_raises(), which is different from the other functions I mentioned in my above rant, is that it is a "pure" pytest function - it does not use CQL, Alternator API, async io, or any of the specifics of any of our test suites. So we could have something like pylib/pytest.py which will contain pytest-only code that both test/cqlpy and test/alternator (and all other test suites) can share.

While I can do that, I'm worried this pylib/pytest.py will become a kitchen sink (in the good case) or garbage can (in the worst case) of dozens of random utility functions, and am really hesitant about getting the ball rolling in that direction.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, if it does not have maintainer who makes sure that it does not start to be messy then probably it is fine to keep this function local.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport/none Backport is not required promoted-to-master

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants