test/alternator: clean up write isolation default and add more tests for the different modes #24493
Conversation
🔴 CI State: FAILURE
✅ Framework test
Failed Tests (1/97479)
Build Details:
ScyllaPiotr left a comment
Wow, very nice suite of tests!
I only raised a couple of concerns; I wonder if you will share them.
cmd += [
    '--alternator-address', ip,
    '--alternator-enforce-authorization', '1',
    '--alternator-write-isolation', 'always_use_lwt',
We ultimately never finished the discussion in the issue thread. With this change, you decide to move the default for all the tests from fully consistent to eventually consistent (right?), and therefore make them susceptible to data inconsistencies.
It's not exactly "fully consistent vs eventually consistent". I tried to explain the differences between these modes in the documentation, but to make a long story short: "always_use_lwt" uses LWT for every write and guarantees that any concurrent writes are isolated - even if you mix RMW and pure writes. For example, consider i=3 and doing "i=7" and "i++" in parallel: with correct isolation you end up with either i=7 or i=8; with incorrect isolation you can end up with i=4. The mode "only_rmw_uses_lwt" only isolates RMW, and assumes you know you don't mix pure writes and RMW to the same item - an assumption that DynamoDB doesn't make, but some users know is true.
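To make that example concrete, here is a minimal sketch (Python with boto3) of the two writers; the endpoint, credentials, table and attribute names are all illustrative assumptions, not taken from the test suite:

```python
import boto3

# Illustrative local Alternator endpoint and dummy credentials (assumptions).
dynamodb = boto3.resource('dynamodb', endpoint_url='http://localhost:8000',
                          region_name='us-east-1',
                          aws_access_key_id='alternator',
                          aws_secret_access_key='secret_pass')
table = dynamodb.Table('tab')

# Initial state: i = 3
table.put_item(Item={'p': 'x', 'i': 3})

# Writer A: a pure (blind) write, "i = 7". It does not read the old value.
table.put_item(Item={'p': 'x', 'i': 7})

# Writer B: "i++", a read-modify-write (RMW) - the new value depends on the old one.
table.update_item(Key={'p': 'x'},
                  UpdateExpression='SET #i = #i + :one',
                  ExpressionAttributeNames={'#i': 'i'},
                  ExpressionAttributeValues={':one': 1})

# Run sequentially as above, the result is i = 8. If A and B run concurrently,
# correct isolation (always_use_lwt) allows only i = 7 or i = 8; a mode that does
# not isolate the pure write against the RMW can lose A's write and leave i = 4.
```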
Anyway, for the tests it doesn't really matter what mode to use. As I noted as an example, test_condition_expression.py has almost 2,000 lines of code and 53 test functions, to test a large number of syntax features and corner cases of condition expressions. None of these tests care how we do write isolation. We just need one test (which I cleaned up in this patch) to check how condition expressions - any condition expressions - work with the different write isolation modes, and all the 53 other tests can run on any mode we choose. We don't need to choose the mode after very deliberate thinking.
> It's not exactly "fully consistent vs eventually consistent".
That's clear now, thanks for the good example.
> Anyway, for the tests it doesn't really matter what mode to use.
Well, the issue cover letter asks which isolation mode we should choose, so it must be relevant, at least between the test suites (test.py vs test/alternator/run).
#
# SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
...
# Tests for the operations allowed under the four different write isolation
...
# are marked scylla_only, so all tests that use one of them are skipped
# when not running against Scylla.
@pytest.fixture(scope='module')
def test_table_s_forbid(dynamodb, scylla_only):
I am generally a fan of elaborate names. I propose to use the following suffixes for the fixtures:
- forbid_rmw
- always_lwt
- only_rmw_lwt
- unsafe_rmw
I find the current names ambiguous.
Perhaps. I originally decided on these shorter names because:
1. These names are used dozens of times throughout the test code, e.g., test_table_s_forbid.put_item(Item={'p': p, 'a': 2}), and the longer they are, the harder it is to read those lines - I think.
2. Alternator already supports any subset of these names, so instead of "forbid_rmw" you can use "forbid" or even "f", and this is even documented.
Ad 1. Well, new devs obviously have less context to base upon, and for them longer (more verbose) names are easier to read (understand).
Ad 2. Well, I'm not sure it was a good idea to allow these abbreviations, but it's pointless to exchange arguments about it now. However, the way one uses them now can be argued: a lone "forbid" is ambiguous, while "f" is for me blatantly misleading - it associates with "force".
But it's obviously only an opinion.
Ok, I'll change in this test to the long names (I won't change the API). I want this PR to go in :-)
While at it, I'll remove the word "test" from the table fixtures' names. While it made semantic sense when I called the original table in these tests "test_table" (it's a test table, after all!), it's really confusing for seasoned pytest developers, who are used to test_something being the name of a test, not a fixture. I don't want to overhaul the entire test suite to rename test_table, but here we create four new tables, so I can give them a better name like table_s_forbid_rmw.
Or even shorter - just table_forbid_rmw. It's not like this file has tables with a different schema.
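For illustration, a module-scoped fixture along those lines might look like the sketch below. It assumes the suite's new_test_table() helper and the 'system:write_isolation' table tag for selecting a per-table mode; treat the helper, the tag key, and the schema as assumptions rather than the exact code in this PR:

```python
import pytest
from util import new_test_table  # helper from the Alternator test suite (assumed import path)

@pytest.fixture(scope='module')
def table_forbid_rmw(dynamodb, scylla_only):
    # Create a table whose write isolation mode is forced to forbid_rmw via a
    # table tag, and drop it when this module's tests are done.
    with new_test_table(dynamodb,
            KeySchema=[{'AttributeName': 'p', 'KeyType': 'HASH'}],
            AttributeDefinitions=[{'AttributeName': 'p', 'AttributeType': 'S'}],
            Tags=[{'Key': 'system:write_isolation', 'Value': 'forbid_rmw'}]) as table:
        yield table
```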
# the forbid mode.

# PutItem & ConditionExpression:
def test_isolation_putitem_conditionexpression(test_table_s_forbid, test_table_s_permit):
After this test, the data stays in the tables? Even more incentive to merge this test with the deleteitem one, that I write about further down.
Yes, throughout the Alternator test suite, tests share the same tables and add data under a unique key, and often this data just stays in the table, which doesn't matter at all - and is probably even faster than bothering to delete the items with additional commands. Instead, the entire table is deleted in one fell swoop when the whole test run ends (for tables like test_table_s shared by all tests) or, for the tables used only in this test file, in a "module"-scoped fixture, so the table is deleted at the end of the file.
So there is no penalty for leaving a live item.
You're right that there's a small penalty when two tests need to create the same item (and each test creates the item under a different key), but this penalty is absolutely tiny - less than a millisecond - and not worth optimizing for.
I had thought about the influence on the correctness of the tests, assuming the tests are run (at least statistically) in a specific order.
That's why the Alternator tests are all written to assume they do not work on a brand-new table, so they should work on a unique key (currently "random"; if you're super-super-super unlucky it might not be unique, see #9988) and not a fixed key like "x".
A minority of the tests that really do need a unique or brand-new table can create a new table in the test, but most don't. Depending on your disk setup (/tmp or current directory), this testing philosophy can save 0.01 to 0.1 seconds on each test, which adds up when you have a thousand tests.
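For readers unfamiliar with that convention, here is a short sketch of what it looks like in practice; random_string() is the suite's usual unique-key helper, and its exact import path is assumed here:

```python
from util import random_string  # unique-key helper used across the suite (assumed import path)

def test_example(test_table_s):
    # Work under a fresh, statistically-unique partition key, so the test does not
    # care what data earlier tests left behind in the shared table, and there is
    # no need to create a brand-new table or clean up afterwards.
    p = random_string()
    test_table_s.put_item(Item={'p': p, 'a': 1})
    item = test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']
    assert item == {'p': p, 'a': 1}
```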
    ExpressionAttributeValues={':oldval': 1})

# DeleteItem & ConditionExpression:
def test_isolation_deleteitem_conditionexpression(test_table_s_forbid, test_table_s_permit):
Perhaps it would be beneficial (execution time-wise) to merge this test with the putitem one. There are many similar steps in the two tests.
As I noted above perhaps we can save a couple of milliseconds this way, but I don't think it's worth it.
assert ret['Attributes'] == {'p': p, 'a': 1}
assert 'Item' not in table.get_item(Key={'p': p}, ConsistentRead=True)
with pytest.raises(ClientError, match=rmw_forbidden):
    test_table_s_forbid.delete_item(Key={'p': p}, ReturnValues='ALL_OLD')
You test with different data in the tables inside test_table_s_permit vs the test_table_s_forbid. It's not a problem, but an actual difference in the test conditions.
    AttributeUpdates={'a': {'Value': 1, 'Action': 'ADD'}})
assert table.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 2}
with pytest.raises(ClientError, match=rmw_forbidden):
    test_table_s_forbid.update_item(Key={'p': p},
You test with different data in the tables inside test_table_s_permit vs the test_table_s_forbid. It's not a problem, but an actual difference in the test conditions.
    Expected={'a': {'ComparisonOperator': 'EQ', 'AttributeValueList': [1]}})
assert 'Item' not in table.get_item(Key={'p': p}, ConsistentRead=True)
with pytest.raises(ClientError, match=rmw_forbidden):
    test_table_s_forbid.delete_item(Key={'p': p},
You test delete_item with different data in the tables inside test_table_s_permit vs the test_table_s_forbid. It's not a problem, but an actual difference in the test conditions.
Yes, I answered this above as well. If you feel this is a problem I can change both^H^H^H^Hall four places.
    Expected={'a': {'ComparisonOperator': 'EQ', 'AttributeValueList': [1]}})
assert table.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 2}
with pytest.raises(ClientError, match=rmw_forbidden):
    test_table_s_forbid.update_item(Key={'p': p},
You test update_item with different data in the tables inside test_table_s_permit vs the test_table_s_forbid. It's not a problem, but an actual difference in the test conditions.
    Expected={'a': {'ComparisonOperator': 'EQ', 'AttributeValueList': [1]}})
assert table.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 2}
with pytest.raises(ClientError, match=rmw_forbidden):
    test_table_s_forbid.put_item(Item={'p': p, 'a': 2},
You test put_item with different data in the tables inside test_table_s_permit vs the test_table_s_forbid. It's not a problem, but an actual difference in the test conditions.
🔴 CI State: FAILURE
✅ Framework test
Failed Tests (1/97657)
Build Details:
Known flakiness - this is #24513 - a recent regression that Avi had to revert. I'll rebase to get the newer master with the revert.
🟢 CI State: SUCCESS
✅ Framework test
Build Details:
Before this patch, we had in test_condition_expression.py and test_update_expression.py some rudimentary tests that the different write isolation modes behave as expected. Basically, we wanted to test that read-modify-write (RMW) operations are recognized and forbidden in forbid_rmw mode, but work correctly in the three other modes. We only check non-concurrent writes, so the actual write isolation is NOT checked, just the correctness of non-concurrent writes.

However, since these tests were split across several files, and many of the tests just ran other existing tests in different write isolation modes, it was hard to see what exactly was being tested, and what was missed. And indeed we missed checking some RMW operations, such as requests with ReturnValues, requests with the older Expected or AttributeUpdates (only the newer ConditionExpression and UpdateExpression were tested), and ADD and DELETE operations in UpdateExpression.

So this patch replaces the existing partial tests with a new test file test_write_isolation.py dedicated to testing all kinds of RMW operations in one place, and how they don't work in forbid_rmw and do work in the other modes. Writing all these tests in one place made it easier to create a really exhaustive test of all the different operations and optional parameters, and conversely - to make sure that we don't test *unnecessary* things such as different ConditionExpression expressions (we already have 1800 lines of tests for ConditionExpression, and the actual content of the condition is unrelated to write isolation modes).

Signed-off-by: Nadav Har'El <[email protected]>
Originally (since commit c3da9f2), Alternator's functional test suite (test/alternator) ran with the "always_use_lwt" write isolation mode. The original thinking was that we need to exercise this more difficult mode and that it's the most important mode. This mode was originally chosen in test/alternator/run.

However, starting with commit 76a766c (a year ago), test.py no longer runs test/alternator/run. Instead, it runs Scylla itself, and the options for running Scylla appear in test/alternator/suite.yaml, where the write isolation mode only_rmw_uses_lwt was accidentally chosen.

The purpose of this patch is to reconcile this difference and use the same mode in test.py (which CI is using) and test/alternator/run (which is only used by some developers, during development). I decided to have this patch change test/alternator/run to use only_rmw_uses_lwt. As noted above, this is anyway how all Alternator tests have been running in CI in the past year (through test.py). Also, the mode only_rmw_uses_lwt makes running the Alternator test suite slightly faster (52 seconds instead of 58 seconds, on my laptop), which is always nice for developers.

This patch changes nothing for testing in CI - only manual runs through test/alternator/run are affected.

Signed-off-by: Nadav Har'El <[email protected]>
Pushed a new version implementing most of @ScyllaPiotr's suggestions about renaming the test tables etc., but didn't act on a bunch of places where you told me the tests for "forbid" and "permit" modes are not identical (I know, and I think it's fine; the intention was to test all four modes - it's not necessary that the tests be 100% identical). @ScyllaPiotr you left "request changes", so please re-review.
🟢 CI State: SUCCESS
Build Details:
@scylladb/scylla-maint please consider merging: there is one reviewer approval and the CI passed.
ping
In #24442 it was noticed that accidentally, for a year now, test.py and CI were running the Alternator functional tests (test/alternator) using one write isolation mode (only_rmw_uses_lwt) while the manual test/alternator/run used a different write isolation mode (always_use_lwt). There is no good reason for this discrepancy, so in the second patch of this 2-patch series we change test/alternator/run to use the write isolation mode that we've had in CI for the last year.

But then, a discussion on #24442 started: instead of picking one mode or the other, don't we need to test both modes? In fact, all four modes?
The honest answer is that running all tests with all combinations of options is not practical - we'll find ourselves with an exponentially growing number of tests. What we really need to do is to run most tests that have nothing to do with write isolation modes on just one arbitrary write isolation mode like we're doing today. For example, numerous tests for the finer details of the ConditionExpression syntax will run on one mode. But then, have a separate test that verifies that one representative example of ConditionExpression (for example) works correctly on all four write isolation modes - rejected in forbid_rmw mode, allowed and behaves as expected on the other three. We had some tests like that in our test suite already, but the first patch in this series adds many more, making the test much more exhaustive and making it easier to review that we're really testing all four write isolation modes in every scenario that matters.
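As a rough sketch of that idea, one representative RMW operation can be checked against tables configured with each of the four modes. The fixture names below follow the renaming discussed in the review, and the rmw_forbidden error pattern and random_string import are assumptions, not the exact code in this PR:

```python
import pytest
from botocore.exceptions import ClientError
from util import random_string  # assumed helper for unique partition keys

rmw_forbidden = 'ValidationException'  # assumed error pattern for a forbidden RMW

def test_conditional_put_all_modes(table_forbid_rmw, table_always_lwt,
                                   table_only_rmw_lwt, table_unsafe_rmw):
    # One representative RMW (a conditional PutItem) works in the three modes
    # that allow read-modify-write...
    for table in [table_always_lwt, table_only_rmw_lwt, table_unsafe_rmw]:
        p = random_string()
        table.put_item(Item={'p': p, 'a': 1})
        table.put_item(Item={'p': p, 'a': 2},
                       ConditionExpression='a = :old',
                       ExpressionAttributeValues={':old': 1})
        assert table.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 2}
    # ...and is rejected outright by a table in forbid_rmw mode.
    p = random_string()
    with pytest.raises(ClientError, match=rmw_forbidden):
        table_forbid_rmw.put_item(Item={'p': p, 'a': 2},
                                  ConditionExpression='a = :old',
                                  ExpressionAttributeValues={':old': 1})
```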
Fixes #24442
No need to backport this patch - it's just adding more tests and changing developer-only test behavior.