-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Add support for listening to Azure Maintenance Events #1865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hey there! Thanks for getting this in :) I'll try and get eyes on this week, but wanted to scan quick and add some top level thoughts:
|
|
@NickCraver Thanks so much for the quick reply! I have a couple questions/comments:
Working on addressing the other feedback now :) |
|
Oooh gotcha - alrighty maybe a mismatch on comms here between me and @philon-msft - let me clarify and see if we agree. On 1: I'm of the opinion we can go ahead and opt everyone into this in the current package if we think we're connecting to Azure. We could make the method for doing so explicitly public so that any non-detectable/custom endpoints are easily opted in for anyone in that situation. I don't think we have to wait for a consumer package move. It won't create another connection than it does today because we already light up the subscriber connection unconditionally on the ServerEndPoint - ultimately you're looking here - called when we first connect. On 2: I took a stab at illustrating this and...meh, it's kind of a wash. Here's what I ended up with: /// <summary>
/// Azure node maintenance event details.
/// </summary>
public class AzureMaintenanceEvent
{
public AzureMaintenanceEvent(ReadOnlySpan<char> message)
{
try
{
while (message.Length > 0)
{
if (message[0] == '|')
{
message = message.Slice(1);
continue;
}
// Grab the next pair
var key = message.Slice(0, message.IndexOf('|'));
message = message.Slice(key.Length + 1);
var valueEnd = message.IndexOf('|');
var value = valueEnd > -1 ? message.Slice(0, valueEnd) : message;
message = message.Slice(value.Length);
if (key.Length > 0 && value.Length > 0)
{
switch (key)
{
case var _ when key.SequenceEqual(nameof(NotificationType)):
NotificationType = value.ToString();
break;
case var _ when key.SequenceEqual(nameof(StartTimeUtc)) && DateTime.TryParse(value, out DateTime startTime):
StartTimeUtc = DateTime.SpecifyKind(startTime, DateTimeKind.Utc);
break;
case var _ when key.SequenceEqual(nameof(IsReplica)) && bool.TryParse(value, out var isReplica):
IsReplica = isReplica;
break;
case var _ when key.SequenceEqual(nameof(IPAddress)) && IPAddress.TryParse(value, out var ipAddress):
IpAddress = ipAddress;
break;
case var _ when key.SequenceEqual(nameof(SSLPort)) && Int32.TryParse(value, out var port):
SSLPort = port;
break;
case var _ when key.SequenceEqual(nameof(NonSSLPort)) && Int32.TryParse(value, out var nonsslport):
NonSSLPort = nonsslport;
break;
default:
break;
}
}
}
}
catch { }
}
/// <summary>
/// The type of event.
/// </summary>
public readonly string NotificationType;
/// <summary>
/// The start time of the event.
/// </summary>
public readonly DateTime? StartTimeUtc;
/// <summary>
/// Whether the event is for a replica node.
/// </summary>
public readonly bool IsReplica;
/// <summary>
/// IPAddress of the node event is intended for
/// </summary>
public readonly IPAddress IpAddress;
/// <summary>
/// SSL port
/// </summary>
public readonly int SSLPort;
/// <summary>
/// Non-SSL Port
/// </summary>
public readonly int NonSSLPort;
/// <summary>
/// Returns a string representing the maintenance event with all of its properties.
/// </summary>
public override string ToString() =>
$"{nameof(NotificationType)}|{NotificationType}|" +
$"{nameof(StartTimeUtc)}|{StartTimeUtc:s}|" +
$"{nameof(IsReplica)}|{IsReplica}|" +
$"{nameof(IpAddress)}|{IpAddress}|" +
$"{nameof(SSLPort)}|{SSLPort}|" +
$"{nameof(NonSSLPort)}|{NonSSLPort}";
}However, after doing that I don't really know that there's any reason to even parse it for our use case, so maybe we just record the string itself (for logs/events/whatever) and move along? If the only practical use of the event parsing is to have specifics we're turning back into a string for debugging...maybe we don't do any work on it at all. If we can log it as-is and that's equally readable as any What are y'all thinking on these fronts? |
|
@NickCraver thanks for explaining! I removed the configuration option since we're not sparing any connections by having it around, as you pointed out. I also cleaned up the tests and condensed the classes into an AzureMaintenanceEvent.AddListenerAsync() method as suggested, and it's so much cleaner now! 😃 The only thing I had some concerns about is the parsing -- I'm not sure if I understood your response correctly (please do correct me if I'm wrong) -- I think we do want to parse the message into a class, since the user might want to do some additional logic based on which event comes back and when the start time is. As for whether we keep the pipe-based iteration or not, I'm happy to switch to using the span method if it's more intuitive! |
NickCraver
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes! I left some more comments for refinement, but I'm a little fuzzy on if/why we need this to be public and may be missing info there, so take comments with a grain of salt. I see the mismatch on "do we need to parse" after the latest changes - if we expose, I agree we'd need to parse.
Tried to address everything in comments but happy to chat further - looking great!
| /// <summary> | ||
| /// Azure node maintenance event details | ||
| /// </summary> | ||
| public class AzureMaintenanceEvent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, if we do remove the events, let's make this internal to start with so we can change the surface area if needed in a non-breaking way :)
|
@NickCraver thanks for the excellent review! I've replied to most of the comments, and updated this PR's description with more context (specifically re: parsing and exposing events). Let me know if it makes sense, and hopefully we can sync up if there's anything else we need to align on! :) |
| case var _ when key.SequenceEqual(nameof(NotificationType).ToCharArray()): | ||
| NotificationType = value.ToString(); | ||
| break; | ||
| case var _ when key.SequenceEqual(nameof(StartTimeInUTC).ToCharArray()) && DateTime.TryParse(value.ToString(), out DateTime startTime): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These .ToString() are avoidable; on modern TFMs, there are span overloads on the types directly, but before that: consider System.Buffers.Text.Utf8Parser.TryParse, which has a range of APIs; also - and this is a very important one: think i18n/l10n: I would say "explicitly specify invariant", but I believe if using Utf8Parser: it is assumed - you should probably check on that, though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried applying the Utf8Parser, but it deals with ReadOnlySpan instead of ReadOnlySpan, which is what we're working with here. Adding the extra conversion between the two seems like the opposite of what we're trying to do here? Also the documentation doesn't mention anything about cultures, so I think it makes sense to opt for the DateTime.TryParseExact() option for StartTimeUtc so we can specify it explicitly, as per your other comment.
I ended up using the preprocessor directive you mentioned in another comment to get rid of all the .ToString() conversions for the newer frameworks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DateTime parse is of particular concern to me; I wonder if the format should be explicit/"exact" here (i.e.
DateTime.TryParseExact) - or is it sufficient to use the'u'pattern (which is not the default) onUtf8Parser? Does'u'act the same as"u"? (open question; I don't know the answer)
Regarding this comment -- on the Azure side we're sending these events with the "s" (sortable) format, but according to VS 'u' looks like this:
And 'U' looks like:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough; whatever works :) My key point: we need to make sure it is culture-insensitive, and we should probably validate that in a test by changing (and restoring in a finally) the thread's culture and ui-culture to something random and suitably awkward
| case var _ when key.SequenceEqual(nameof(IsReplica).ToCharArray()) && bool.TryParse(value.ToString(), out var isReplica): | ||
| IsReplica = isReplica; | ||
| break; | ||
| case var _ when key.SequenceEqual(nameof(IPAddress).ToCharArray()) && IPAddress.TryParse(value.ToString(), out var ipAddress): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: no overload in Utf8Parser here, so this might be a good candidate for:
#if NETSTANDARD2_1_OR_GREATER || NETCOREAPP3_0_OR_GREATER
&& IPAddress.TryParse(value, out var ipAddress):
#else
&& IPAddress.TryParse(value.ToString(), out var ipAddress):
#endif| await sub.SubscribeAsync(PubSubChannelName, (channel, message) => | ||
| { | ||
| var newMessage = new AzureMaintenanceEvent(message); | ||
| multiplexer.InvokeServerMaintenanceEvent(newMessage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can avoid the AzureMaintenanceEvent alloc when no subscriber, if we just pass down message, i.e.
multiplexer.InvokeServerMaintenanceEvent(message);
// ...
internal void InvokeServerMaintenanceEvent(string message)
=> AzureServerMaintenanceEvent?.Invoke(this, new AzureMaintenanceEvent(message));the args on the right here are only evaluated if the ?. check passes, i.e. we have listeners
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even when there's no subscriber, we currently want the AzureMaintenanceEvent handy since we access the properties to see what type of event it is and trigger reconfigure in certain cases. Let me know if you disagree!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unless I'm missing it, all we do is AzureServerMaintenanceEvent?.Invoke with the event; I'm just trying to avoid the overhead if AzureServerMaintenanceEvent turns out to be null, which we can reasonably presume is the default and most common scenario; do we do anything else special in the case when no external consumer has touched the event?
mgravell
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did a review pass; heading in the right direction, but quite a few suggestions; please please feel free to discuss any of them (not trying to be a dictator here)
|
oh, can we update releasenotes.md too please, in the unreleased section? thanks |
|
The DateTime parse is of particular concern to me; I wonder if the format should be explicit/"exact" here (i.e. |
|
Hey @amsoedal! I figured we'd see you on call today but missed out on syncing up - figured it was easier to talk through some tweaks to make this more extensible to other providers later and not lock ourselves in on the new public types with a few changes. I have the proposals in craver/1865-tweaks but definitely don't want to stomp on PR and credit here. I've granted you write access to this repo - could you please move your branch here and open a new PR (I think all current comments are good to lose). Then we can open a PR from |


Adding an automatic subscription to the AzureRedisEvents pubsub channel for Azure caches. This channel notifies clients of upcoming maintenance and failover events.
By exposing these events, users will be able to know about maintenance ahead of time, and can implement their own logic (e.g. diverting traffic from the cache to another database for the duration of the maintenance event) in response, with the goal of minimizing downtime and disrupted connections. We also automatically refresh our view of the topology of the cluster in response to certain events.
Here are some of the possible notifications: