Skip to content

Conversation

@fisx
Copy link
Contributor

@fisx fisx commented Jun 26, 2025

@zebot zebot added the ok-to-test Approved for running tests in CI, overrides not-ok-to-test if both labels exist label Jun 26, 2025
@fisx fisx force-pushed the WPB-18443-robustness-against-rabbitmq-outage branch 4 times, most recently from a47dc6c to 449ea6e Compare July 1, 2025 08:35
@fisx fisx marked this pull request as ready for review July 1, 2025 09:38
@fisx fisx requested review from a team as code owners July 1, 2025 09:38
Copy link
Contributor Author

@fisx fisx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks ready to merge to me!

@fisx fisx force-pushed the WPB-18443-robustness-against-rabbitmq-outage branch from 19f1884 to 6e6f888 Compare July 4, 2025 09:12
@fisx fisx force-pushed the WPB-18443-robustness-against-rabbitmq-outage branch from 0e7a062 to 573ace3 Compare July 7, 2025 15:39
-- Some times RabbitMQ still remembers connections from previous uses of the
-- dynamic backend. So we wait to ensure that we kill connection only for our
-- current.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this empty line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not :)

statuses = undefined
rabbitmqAdminClient = Just $ mockRabbitMqAdminClient mockAdmin
rabbitmqVHost = "test-vhost"
amqpEP = Q.AmqpEndpoint undefined undefined "test-vhost" undefined
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't the whole amqpEP be undefined (instead of having almost al fields of AmqpEndpoint being undefined)? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't know, we did set rabbitmqVHost before, and i didn't see any reason to change that and deal with whatever happens?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, it's not really important. 😄

One could also ask if AppT isn't too big if we only need the vhost. Or, if getRemoteDomains :: RabbitMqAdmin.AdminAPI (Servant.AsClientT IO) -> Text {- vhosy -} -> AppT IO [Domain] wouldn't be sufficient. Hower, not our construction place right now 😉

-- Notification pusher thread
Log.info $ Log.msg (Log.val "Cancelling the notification pusher thread")
readIORef notifChanRef >>= traverse_ \chan -> do
Log.info $ Log.msg (Log.val "Got channel")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The channel's name might be interesting here as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can only find channelID, and that is only exported from Network.API.Internal, which is not exposed in the cabal file. any hints where to find the name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that mess. Yeah, I remember we've tried that before ... 🙄
Well, then, we may leave this for now.

async $
liftIO $
openConnectionWithRetries env.logger rabbitmqOpts $
openConnectionWithRetries env.logger "BackendNotificationPusher" rabbitmqOpts $ -- can we find a bug here, too?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this? Shall we search a potential bug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but not in this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, maybe transform this into a FUTUREWORK: ... or a ticket? 🤔

Copy link
Contributor

@supersven supersven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, it's pretty hard be ensure that all this is free of deadlocks, because there are locks in many places.

I haven't found any so far. So, I'm approving. ✔️

@fisx fisx changed the title [WPB-18443] Robustness against rabbitmq outage [WPB-18631] Robustness against rabbitmq outage Jul 8, 2025
Copy link
Member

@akshaymankar akshaymankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's see if #4652 works.

@akshaymankar
Copy link
Member

Superseded by #4652

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ok-to-test Approved for running tests in CI, overrides not-ok-to-test if both labels exist

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants