Skip to content

Redis cli resp3 push support#7609

Merged
yossigo merged 4 commits intoredis:unstablefrom
michael-grunder:redis-cli-resp3-push
Aug 9, 2020
Merged

Redis cli resp3 push support#7609
yossigo merged 4 commits intoredis:unstablefrom
michael-grunder:redis-cli-resp3-push

Conversation

@michael-grunder
Copy link
Contributor

Bullet points:

  • Bump hiredis to v1.0.0
  • Adds support for RESP3 push messages in redis-cli (e.g. SUBSCRIBE/client tracking messages)
  • Adds an option --print-push-msgs to display the messages (maybe this should be the default for invalidations?)

I'm happy to rework how invalidations are displayed but it seemed reasonable to use a special format (like redirections) for that when in normal mode, but output in csv/raw when requested.

Here's how it looks:

127.0.0.1:6379> client tracking on
OK
127.0.0.1:6379> get k1
"v123"
127.0.0.1:6379> get k2
"v999"
127.0.0.1:6379> set k1 newvalue
-> invalidate: 'k1'
OK
127.0.0.1:6379> set k2 newvalue
-> invalidate: 'k2'
OK

@michael-grunder
Copy link
Contributor Author

@yossigo Even if we hold off on this PR you may want to commit this tiny fix

@oranagra
Copy link
Member

oranagra commented Aug 5, 2020

@michael-grunder first of all, thanks!

i didn't look at the code yet, but i have one concern about the update of hiredis in deps.
i don't know what was the process in the past to update it, and i see you did create a new bash script for the job, but i wonder if it is missing anything.
maybe one way to tell is attempt to run it on the existing version that's already in redis and look at the differences it creates.
did you happen to do that, or anything else to be sure that's all there is to it?

@oranagra
Copy link
Member

oranagra commented Aug 5, 2020

redis-cli code looks good to me. certainly enough as a starting point for future improvements.

i think push messages should be displayed by default. considering this tool is mainly for human eyes, not scripts. and considering they'll most likely be absent for any existing scripts that use redis-cli but are probably not using pubsub / invalidations.
anyone sees any reason not to make it default?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michael-grunder @oranagra perhaps we should consider using git subtree to manage this dependency?

src/redis-cli.c Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michael-grunder I think printing this by default makes sense for TTY mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll make the change. I suppose what I'll do is change the option to something like --push-replies and allow for always, tty, and never (with the default being tty). That way there's a fair amount of control with a sane default.

@michael-grunder
Copy link
Contributor Author

michael-grunder commented Aug 6, 2020

OK reworked to display PUSH messages by default if we detect STDOUT is a tty and changed the configuration argument to a boolean --show-pushes <yn> if a user wants to override.

I'll switch to using a subtree for the hiredis dependency if everyone is good with that.

Edit: I rebased the changes with hiredis as a subtree in a second branch on my fork. Should make it easy to pick the subtree or non-subtree upgrade 😄

@yossigo
Copy link
Collaborator

yossigo commented Aug 7, 2020

@michael-grunder I see the subtree changes are already on this PR. I have done some reading and it looks like subtrees are really a good fit to handle the deps and deps updates, but I haven't had the chance to experience with them. Do you have any take on this?

@oranagra any thoughts? I lean towards adopting this for hiredis, and if we're happy with it expand to other deps.

@michael-grunder
Copy link
Contributor Author

Ahh, I was going to hold off pushing the subtree commits in this PR branch until we decided which was preferable but then I noticed the fix I needed to do in 890ff61.

I'm happy either way. I haven't really used subtree much but it does seem like a nice way to manage deps

@oranagra
Copy link
Member

oranagra commented Aug 7, 2020

I don't mind using git subtree, IIUC with the exception of the commit comments, the result is the same as just coping the files. (am I wrong?)
what worries me is that maybe we had some local changes in our previous hiredis folder.
either some bugfixes or adjustments that didn't make it to the upstream for some reason, or maybe something as simple as a gitignore file.
I would like someone to try this procedure on the previous version we used, to make sure it doesn't bring any changes / deletions that we want to retain.

@michael-grunder
Copy link
Contributor Author

michael-grunder commented Aug 7, 2020

It looks like hiredis was last updated in Redis in b087dd1, which appears to be hiredis 01535274

As far as I can tell, they're identical:

Super quick and dirty bash

git clone https://github.com/redis/redis.git
cd redis && git checkout b087dd1db60ed23d9e59304deb0b1599437f6e23 && cd deps

git clone https://github.com/redis/hiredis hiredis-upstream && \
    cd hiredis-upstream && git checkout 01535274441bf235dc17ace1343515755a6049ed

cd .. && diff hiredis hiredis-upstream

My diff just shows .git, etc

mike@cthuhlu:/tmp/hrcompare/redis/deps/hiredis-upstream$ cd .. && diff hiredis hiredis-upstream
Common subdirectories: hiredis/adapters and hiredis-upstream/adapters
Common subdirectories: hiredis/examples and hiredis-upstream/examples
Only in hiredis-upstream: .git

Edit: I suppose we could choose to use the new allocator replacement feature of hireds v1.0.0 but I don't think it's really needed and I always prefer caution.

Edit2: There is another relevant change that actually went from Redis 2647084, 5e399d5 to hiredis 5c9f49e2

TL;DR: As far as I can tell there isn't any specific hiredis logic that differs between the two repos. That being said, I'm not in any rush; Better to be careful with such a big merge. Going forward coordination should be simpler.

@oranagra
Copy link
Member

oranagra commented Aug 7, 2020

looks to me like hiredis was updated after the commit you said (b087dd1) in: c389972.
and after that there was the change in 5e399d5, which you say is already present in 1.0, right?

didn't understand what 2647084 has to do with it?

@michael-grunder
Copy link
Contributor Author

Sorry I was running out the door.

I'll go through the whole diff between hiredis/1.0.0 and Redis/unstable and see if there's anything there that I don't recognize.

@yossigo
Copy link
Collaborator

yossigo commented Aug 8, 2020

Had a quick look at this, seems good to me. Because commits were not done consistently to redis/hiredis, the current redis unstable version doesn't correspond to any single hiredis version. Comparing the tree, it turns it resembles d952ed877ab1e515fcf5d16ec69576b87be61796 most and everything in it was later merged.

@michael-grunder I will be happy to get your OK as well.

yossigo
yossigo previously approved these changes Aug 8, 2020
@michael-grunder
Copy link
Contributor Author

michael-grunder commented Aug 8, 2020

@yossigo That looks right to me, thanks for taking a look.

It looks like we're still using unsigned int to represent redisReply->elements in a couple places in redis-cli. I can update that to use size_t if you guys want but I'm not sure it's really needed.

Add logic to redis-cli to display RESP3 PUSH messages when we detect
STDOUT is a tty, with an optional command-line argument to override
the default behavior.

The new argument:  --show-pushes <yn>

Examples:

$ redis-cli -3 --show-pushes no
$ echo "client tracking on\nget k1\nset k1 v1"| redis-cli -3 --show-pushes y
@yossigo yossigo merged commit 3f073b1 into redis:unstable Aug 9, 2020
@michael-grunder michael-grunder deleted the redis-cli-resp3-push branch August 9, 2020 19:07
@oranagra
Copy link
Member

looks like the merge broke the sentinel tests (although not consistently):
when i attempted to reproduce it (to prove which commit it was) i got :

./runtest-sentinel
...
Testing unit: 00-base.tcl
...
09:11:02> ODOWN is not possible without N (quorum) Sentinels reports: OK
09:11:02> Failover is not possible without majority agreement: redis-sentinel: async.c:380: __redisAsyncDisconnect: Assertion `ret == REDIS_ERR' failed.
WARNING: Aborting the test.

@michael-grunder can you look into it?

p.s. while bisecting i noticed that the Makefile fails to detect the change.
i.e. if we build the commit before the upgrade, and then checkout the latest and try to build, it doesn't rebuild hiredis, and we get: undefined reference to redisSetPushCallback
not sure if we wanna solve that, or have people do a full rebuild.

@michael-grunder
Copy link
Contributor Author

Sure, I'll take a look.

@michael-grunder
Copy link
Contributor Author

michael-grunder commented Aug 10, 2020

This was a fun one to track down.

The offending commit appears to be this sdsrange overflow fix.

Specifically, it's the sdsrange update to return an int:

-void sdsrange(sds s, int start, int end) {
+int sdsrange(sds s, ssize_t start, ssize_t end) {

Redis' sdsrange has this prototype:

void sdsrange(sds s, ssize_t start, ssize_t end);

What's happening is that hiredis ends up executing Redis' copy of sdsrange and because there is an ABI mismatch, chaos ensues.

I can fix the problem by wrapping my calls to sdsrange in hiredis (we only call it in two places), but I wonder if it's maybe smart for me to rework Hiredis' copy of sds to use unique symbol names so this doesn't bite us in the future.

Link to my branch with a quick and dirty fix. I'm happy to implement and formalize a proper fix once we determine how best to tackle the issue.

@oranagra
Copy link
Member

We had this in the past (when i changed the sds header to support various length classes).
i suppose that back then, the solution i took was to update the sds library inside hiredis to match the one that's inside redis.

So in this case, maybe the quick fix is actually to mirror this change to redis (seems that it should be safe), but that still leaves the possibility of this biting us again in the future.

Taking the other alternative you suggested (which is to rename all the sds interfaces would mess up tons of source lines and the blame log, so i wish we can find another safe way.

p.s. we also faced similar problem in RL with other tools that use both hiredis and (lib)redis, the solution there is to build a dynamic library using -Bsymbolic and -fvisibility=hidden or something of that sort.

@yossigo do you have any other ideas or thoughts?

@yossigo
Copy link
Collaborator

yossigo commented Aug 11, 2020

Because hiredis has really minimal use of sds, I think wrapping the symbols makes most sense to me. @oranagra why do you think it will be that much of a change?

@oranagra
Copy link
Member

@yossigo what do you mean by wrapping the symbols? same thing that's already one in the quick and dirty fix?

I don't know the hiredis code that well, i was referring to replacing all symbol names from sdsrange to hr_sdsrange or alike. so i imagine it would mess up hundreds of lines (besides the hundreds of lines inside sds.c/h)

@yossigo
Copy link
Collaborator

yossigo commented Aug 11, 2020

@oranagra yes. As @michael-grunder indicated it's not used that extensively.

Being a static library (and IMHO it should remain so), we don't really have other options. Even if we pull the specific prototype change into Redis, fundamentally the problem is still there and may arise again in the future with other unexpected changes.

@michael-grunder
Copy link
Contributor Author

I don't mind doing it but just so everyone is on the same page.

Are we settled on updating all sds* functions in hiredis to something like hi_sds*? This would be many changed lines but it wouldn't take very long or be very complex.

Another possibility might be to detect when hiredis is being built within Redis' build tree and use/include Redis' copy of sds. That option would take more doing though.

@yossigo
Copy link
Collaborator

yossigo commented Aug 11, 2020

@michael-grunder I think trying to detect the fact that hiredis is part of Redis may not be good enough, because such collisions could surface indirectly (e.g. a module that uses hiredis and gets loaded into Redis).

The simplest and safest option would be to use private symbols for the embedded sds to avoid any risk of collisions. It's a ugly hack but the safest. Another option is to create an "amalgamated" static-inlined version of sds, which will eliminate the impact on the rest of the hiredis code base.

@yossigo
Copy link
Collaborator

yossigo commented Aug 11, 2020

A quick attempt at the amalgamation option:
yossigo/hiredis@379910b

Used static inline for everything just to be absolutely sure no symbols are created and everything is resolved locally. Not sure what side effects this may introduce, but I do see it's not perfect and variadic functions do end up having their symbols:

$ nm libhiredis.a |grep sds
0000000000000a50 t sdscatfmt
0000000000000000 t sdscatfmt.cold
0000000000000570 t sdscatvprintf

@michael-grunder
Copy link
Contributor Author

I implemented the changes we discussed above and pushed them to a branch on my fork of Redis

As you can see I renamed sds to hisds and every function from sds* to hi_sds*. Additionally I created a compatibility header, that I include in redis-cli.c and redis-benchmark.c so I don't have to pollute the commit log with hundreds of lines of name changes.

I ran tests both with and without valgrind and everything seems good. I don't always see success running the sentinel tests locally but they fail the same way in this branch and from a commit before any of these changes were merged.

I'll open a PR if this is the way we want to solve the issue.

JackieXie168 pushed a commit to JackieXie168/redis that referenced this pull request Aug 13, 2020
@oranagra
Copy link
Member

Ohh, seeing the comment in redis-cli:

+#include <sds.h> /* use sds.h from hiredis, so that only one set of sds functions will be present in the binary */

i now realize that i attempted to resolve this problem permanently in f15df8b (rather than copy the changes i made to sds into hiredis as i said a 2 days ago).

So the problem is only with Sentinel.
But maybe now that we solved it there we should change that include mentioned above to use " rather than < (and drop the comment). Add sds.o to the Makefile for these two binaries and drop the use of the new sdscompat.h file?
Seems better and less confusing that all source files in redis/src are using the sds implementation of redis.

p.s. maybe you wanna use it in hiredis to avoid the mass renaming you did there. (though that would mess up readability and also the 'go to definition' of IDEs.

@michael-grunder
Copy link
Contributor Author

Works for me.

I need to take a closer look but I should be able to just upgrade hiredis to use Redis' sds and then wrap sdsrange as I did in the first fix.

I'm not sure whether I can release that in Hiredis before a major version, but it should be acceptable to use it for the one Redis bundles.

I'll put it together in the morning.

@oranagra
Copy link
Member

@michael-grunder i lost you. why do you need to update hiredis or wrap sdsrange?
my suggestion is similar to what you already did on your branch, and it doesn't change anything in Sentinel.
the only difference is that it makes redis-cli and redis-benchmark use the same approach that Sentinel now uses, i.e. use the sds library from redis. and it allows you to drop the sdscompat.h file.

or where you referring to my note that said:

p.s. maybe you wanna use it in hiredis to avoid the mass renaming you did there. (though that would mess up readability and also the 'go to definition' of IDEs.

@michael-grunder
Copy link
Contributor Author

michael-grunder commented Aug 13, 2020

No, I misunderstood what you were suggesting.

I thought you were suggesting simply changing #include <sds.h> to #include "sds.h" in redis-cli and redis-benchmark would solve the problem, which it wouldn't without renaming Hiredis' sds symbols.

Here's a branch with your suggestion

It's not a solution though, because because redis-cli.c directly interacts with redisContext->reader->obuf which is just begging for a segfault (and in fact does segfault when running the redis-cli tests in that branch 😄).

Edit: I may be able to fix this by overriding Hiredis' allocators to use zmalloc, etc.

Edit2: Overriding hiredis' allocators in redis-cli gets all the tests passing. This is a bit dangerous because Redis is directly interacting with sds objects that Hiredis uses, but there isn't currently any mismatch in memory layout so it works fine.

@oranagra
Copy link
Member

I didn't consider that redis-cli actually calls it's sds functions on pointers allocated by hiredis's sds. that's too dangerous (even if currently working). i guess we should revert to your previous solution, the one in this branch.

So since redis-cli actually messes with the sds pointers, it'll have just one sds implementation in it (the hiredis one).
Sentinel has two sds implementations in it with different symbols, hopefully it doesn't mix them up.
And redis-benchmark uses the redis-cli approach, although i'm not sure if it interacts with the pointers in a similar way.

Is there anything stopping us from using the redis-cli approach in sentinel too? (would possibly be safer).

Since we probably explored all the options, it's probably time to make a PR.

@michael-grunder
Copy link
Contributor Author

I feel bad this was such a painful update.

I'm happy to open a PR with this solution as you suggested, but I wonder if there's actually a cleaner option.

If you look at the modifications to the SDS library that we've done in Hiredis you'll see that most of them have to do with checking for NULL returns from allocations. This is so we can gracefully handle OOM. That's not required in Redis since (unless this has changed) Redis just SIGABORTs on OOM.

The sdsrange overflow fix seems low priority since it's not currently in Redis and has never caused a problem. Even if it did occur, it's only likey to cause issues for redis-cli or redis-benchmark (non critical).

TL;DR: What about just patching hiredis to use Redis' current SDS for the one bundled with Redis?

@oranagra
Copy link
Member

I don't mind improving the sds code in Redis to handle failed allocation (although as you said, it's dead code in Redis), but doing that, same as forcing the sds code in hiredis to be identical to redis', is just hiding the problem.
I rather resolve it once and for all.
Till your last discovery, I thought that just avoiding symbol clash is enough, but now that we see that they might also work on each other's pointers, I feel that it's better to just make sure there's just one sds implementation in each process. I already did that years ago for redis-cli and redis-benchmark, I hope doing the same in Sentinel would resolve it for good.

@oranagra
Copy link
Member

Btw, i suppose that if we take that path (only one set of symbols in each process), maybe we can abandon the mass rename and the compatibility header you created?

@michael-grunder
Copy link
Contributor Author

I can avoid the mass renaming, and use Hiredis' SDS library in sentinel but that will require me to match versions (e.g. hiredis v1.0.0 doesn't have SDS_NOINIT).

It does work though.

Also since I maintain hiredis I can make sure this mismatch isn't a problem in the future.

Let me know what you prefer and I'll open a PR (and then we can forget about this whole ordeal 😄)

@oranagra
Copy link
Member

Ohh, silly me. For a moment I forgot that sentinel is actually redis.
Sorry to have wasted your time, I don't think it's wise to let that code base which is usually tested with redis' sds, to run with hiredis' sds instead.
So that leaves us with just hoping that unlike redis-cli, sentinel won't mix pointers between these two implementations.
So I'd vote that we go with your initial fix, the the cli and benchmark will have just one sds implementation, using the compact header, and sentinel will have two with different symbol names.

@oranagra oranagra added the release-notes indication that this issue needs to be mentioned in the release notes label Dec 9, 2020
@oranagra oranagra mentioned this pull request Jan 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes indication that this issue needs to be mentioned in the release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants