Cleanup key tracking documentation and table management#8039

Merged
madolson merged 4 commits into redis:unstable from
madolson:unstable-csc-cleanup
Dec 24, 2020
Conversation

@madolson
Contributor

@madolson madolson commented Nov 10, 2020

This change updates how we manage CSC tracking tables to improve memory efficiency and reduce unnecessary messages to clients.

The main change is that we now always clear the tracking table when flushdb is called, instead of only doing so for flushall. The original motivation was that back when there was a 16 million slot table, it was expensive to clean up, so we didn't want to free it on every flush; flushall served as the one path that reclaimed the table's memory. Now that clearing the table is much cheaper, we should always do it, since we are already notifying all the clients to free their local caches.
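A minimal sketch of the behavior change described above (all names here are illustrative, not the actual Redis internals):

```c
#include <stddef.h>

/* Hypothetical tracking-table state for illustration only. */
static size_t tracking_table_keys = 100;   /* keys currently tracked */
static int invalidation_messages_sent = 0; /* NULL invalidations broadcast */

/* Before this change, only FLUSHALL cleared the tracking table; after
 * it, any flush (FLUSHDB or FLUSHALL) clears it, since clients are
 * told to drop their local caches in either case. */
static void on_flush(int flushall) {
    (void)flushall; /* the distinction no longer matters for tracking */
    invalidation_messages_sent++;   /* broadcast a NULL invalidation */
    tracking_table_keys = 0;        /* always reclaim the table now */
}
```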

The secondary change is that we now free client tracking tables asynchronously when requested. The tracking table can get very large and could otherwise block the main thread for several seconds while it is freed.

@madolson madolson requested a review from soloestoy November 10, 2020 19:36
@oranagra
Member

@madolson LGTM, but I don't trust myself in this area of the code base.
Are you sure the TrackingTable is no longer needed when just one db was flushed (and the others are kept)?
I suppose that's right since we send a NULL invalidation message to all clients, but maybe it has other purposes (I just don't know that area well and am too busy to dig in now).

I find it odd that back then we didn't want to pay the price of flushing the whole 16m slots on flushdb, yet agreed to pay it for flushall.
I could understand that if you had to scan the data structure on flushdb you'd want to avoid it, but on flushall you can just release it without scanning.
Can you clear that up for me?

@madolson
Contributor Author

AFAIK there is no other logic there; @soloestoy was the original contributor of this, so maybe he has a better idea.

The old implementation was a 16 million item array of radix trees that pointed to clients. You couldn't simply free the array; you needed to loop through it looking for radix trees and free all of those too. So even if there was just one key being tracked by one client, you had to pay a huge cost to do flushall.
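The cost described above can be sketched as follows (a simplified stand-in, scaled down from the real 16M slots; the real code stores rax trees of client IDs per slot):

```c
#include <stdlib.h>
#include <stddef.h>

/* Stand-in for the per-slot radix tree of tracking clients. */
typedef struct { int dummy; } slot_tree;

enum { TABLE_SIZE = 1 << 16 };  /* scaled down from the real 1<<24 */

static size_t slots_visited = 0;

/* Why the old FLUSHALL path was costly: even if only one slot held a
 * tree, every slot had to be visited to find and free the trees, so
 * the cost was O(table size), not O(tracked keys). */
void free_old_tracking_table(slot_tree **table) {
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        slots_visited++;
        if (table[i] != NULL) free(table[i]);
    }
    free(table);  /* then the array itself */
}
```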

A side note: we probably should free the radix tree async when flushall async is used, and I intend to do some testing on that to see if it's really that important.

@madolson
Contributor Author

I guess another issue is that someone might have been ignoring the flush message and relying on the invalidations to eventually get sent.

@madolson madolson linked an issue Nov 10, 2020 that may be closed by this pull request
@hwware
Contributor

hwware commented Nov 11, 2020

Hello @madolson @oranagra, I used to do some benchmark testing on this part, and hopefully I can provide some hints for optimizing this tracking cleanup performance issue.
What I did was the following:

  1. start the redis server with an empty db
  2. use ./redis-benchmark -t get -r 100000000 -n 10000000000000000 --enable-tracking to populate the tracking table
  3. stop the benchmark and verify the tracking table is populated through the info command (via tracking_total_keys and tracking_total_items)
  4. run flushall
    I noticed a significant delay in the 4th step if the tracking table is big. On my Mac, with 1 million tracking keys, I see a 1.5s delay when doing flushall even though the keyspace is empty; but if we run the same steps without the --enable-tracking flag when starting the benchmark, the call replies immediately and there are hardly any noticeable latencies.

Therefore, as @madolson mentioned, we may need to think of a way to free the tracking table asynchronously when it grows big. I was thinking about the implementation as well: maybe we should provide a hardcoded threshold for now, and if the tracking table grows bigger than that, we can unlink the table and free it in a background thread. But we would need to provide a bio interface for freeing the radix tree. Or do we encourage users to limit the tracking-table-max-keys configuration? What do you think about this? Or am I missing something here? Thanks!
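The "unlink, then free in the background" idea suggested above can be sketched with plain pthreads (Redis uses its bio threads instead; all names here are hypothetical):

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_t bg_thread;
static int bg_freed = 0;

/* Background-thread side: pays the expensive freeing cost. */
static void *background_free(void *table) {
    free(table);  /* in Redis this would be the radix-tree free */
    bg_freed = 1;
    return NULL;
}

/* Main-thread side: unlink the table pointer in O(1) so commands see
 * an empty table, and hand the old pointer to a background thread
 * (the bio thread, in Redis terms) to actually free. */
void detach_and_free_async(void **table_slot) {
    void *old = *table_slot;
    *table_slot = NULL;  /* O(1) on the main thread */
    pthread_create(&bg_thread, NULL, background_free, old);
}
```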

@hwware
Contributor

hwware commented Nov 11, 2020

Also, I guess cleaning up the tracking table and sending invalidations could be used when the configured maxmemory is reached?

@madolson
Contributor Author

madolson commented Nov 11, 2020

Isn't it kind of weird that we are tracking key misses? I didn't realize we were doing that.

@madolson madolson force-pushed the unstable-csc-cleanup branch from 22c6d73 to fd1bc78 Compare November 24, 2020 04:43
@madolson
Contributor Author

@hwware I updated the code to also free the tracking table async. I also did some refactoring of bio, since it might be my least favorite code in all of Redis.

@soloestoy Would still love your input about changing the behavior of flushdb also freeing the table.

@soloestoy
Contributor

Hi @madolson, I reread the history of tracking on flushdb and flushall. I'm sure freeing the TrackingTable when flushdb is called is right and safe, and we do need an async way to free it.

About the bio refactoring, I think it's better to split this PR into two different PRs: one for the async freeing of the TrackingTable (with the lazyfree refactoring), the other for the whole bio refactoring; that would be clearer, I think.

@madolson madolson force-pushed the unstable-csc-cleanup branch from d964531 to cd2019f Compare December 7, 2020 20:56
@madolson
Contributor Author

madolson commented Dec 7, 2020

@soloestoy Thanks! I was originally going to split it, but I also thought it's easier to review the BIO refactor when it's motivated by a new callback. The total change is pretty small, so I think it's probably easier to keep them together? If you strongly disagree, I'll split them.

@madolson madolson added the state:major-decision Requires core team consensus label Dec 10, 2020
@madolson
Contributor Author

@redis/core-team Going to take that roughly as the code is okay, but it's still a major decision.

@oranagra
Member

@madolson what changed since my last review of the code? Just a rebase?
I see that back then the only thing that bothered me was that I didn't understand why the old version flushed the TrackingTable only when all databases were emptied and not just one.
I don't know that code well enough, so all I can say is that given that you and Zhao think it's a safe change, I'm ok with it (aka LGTM!)

@yossigo yossigo added the approval-needed Waiting for core team approval to be merged label Dec 10, 2020
@madolson
Contributor Author

@oranagra The table is flushed async now; the behavior-altering change is the same as before.

Collaborator

@yossigo yossigo left a comment


@madolson LGTM, with one minor comment.

src/bio.c Outdated

struct lazy_free_job {
    lazy_free_fn *fn; /* Function that will free the provided arguments */
    void *args[1];    /* List of arguments to be passed to the free function */
};
Collaborator


Consider args[0] to make it clearer that it's dynamically allocated; it would also make the sizeof calculation more correct when allocating.

Contributor Author


It would be cleaner, but it throws a warning:
"warning: zero size arrays are an extension [-Wzero-length-array]". I suppose it's more portable this way. I think it's okay because we need at least one argument.

Member


I think you can use an empty array then, like we use here:

typedef struct clientReplyBlock {
    size_t size, used;
    char buf[];
} clientReplyBlock;

That seems like the portable way to do it (C99).

Contributor Author

@madolson madolson Dec 15, 2020


Apparently that can't exist in nested structures? I moved it out of the struct, so now we'll use a couple extra bytes, but it's cleaner.
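The two allocation patterns weighed in this thread can be compared side by side (struct and field names mirror the snippet above; this is a sketch, not the merged code). Note the nesting limitation mentioned: per C99 6.7.2.1, a struct with a flexible array member cannot itself be a member of another struct or an element of an array.

```c
#include <stdlib.h>
#include <stddef.h>

typedef void (lazy_free_fn)(void *args[]);

/* Variant A: the args[1] trick. The struct is over-allocated, and the
 * sizeof arithmetic must account for the one declared element. */
struct lazy_free_job_a {
    lazy_free_fn *fn;
    void *args[1];
};

/* Variant B: a C99 flexible array member, as in clientReplyBlock.
 * sizeof(struct) excludes the array, so the allocation is simply
 * sizeof(struct) + count * sizeof(void *). */
struct lazy_free_job_b {
    lazy_free_fn *fn;
    void *args[];
};

struct lazy_free_job_b *make_job(lazy_free_fn *fn, size_t count) {
    struct lazy_free_job_b *job =
        malloc(sizeof(*job) + count * sizeof(void *));
    job->fn = fn;
    return job;
}
```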

Member


For current bio use it's ok, but I think it should be more generic; what should we do if we add a new bio job, such as dump and restore keys?

@madolson madolson added release-notes indication that this issue needs to be mentioned in the release notes and removed state:major-decision Requires core team consensus labels Dec 23, 2020
@madolson madolson merged commit 59ff42c into redis:unstable Dec 24, 2020
linxiang-dev pushed a commit to linxiang-dev/redis that referenced this pull request Dec 27, 2020
Cleanup key tracking documentation, always cleanup the tracking table, and free the tracking table in an async manner when applicable.
@oranagra oranagra mentioned this pull request Jan 10, 2021
JackieXie168 pushed a commit to JackieXie168/redis that referenced this pull request Mar 2, 2021
Cleanup key tracking documentation, always cleanup the tracking table, and free the tracking table in an async manner when applicable.

Labels

approval-needed Waiting for core team approval to be merged release-notes indication that this issue needs to be mentioned in the release notes

Development

Successfully merging this pull request may close these issues.

[QUESTION]Question regarding Client Side Caching

7 participants