Skip to content

Add AUTH third element#841

Closed
arthurfranca wants to merge 2 commits intonostr-protocol:masterfrom
arthurfranca:nip-42-auth-id
Closed

Add AUTH third element#841
arthurfranca wants to merge 2 commits intonostr-protocol:masterfrom
arthurfranca:nip-42-auth-id

Conversation

@arthurfranca
Copy link
Copy Markdown
Contributor

Currently the relay can send an AUTH message at any moment and client can't know what (or if a) client message triggered it.

This PR tries to bring order by telling exactly when and how the relay can send an AUTH message. It also introduces an optional third array element that can reference a previous client message. This referencing and timing assists the client retry logic.

This is a new attempt to fix NIP-42 instead of adding NIP-43.

@staab
Copy link
Copy Markdown
Member

staab commented Oct 26, 2023

Why does the client need to re-try after auth if they already sent the message? Adding a reason for auth challenge is not a terrible idea, but relays should buffer pending messages per connection until auth is taken care of.

I feel like we need a better problem statement, this feels like a band-aid. Maybe auth needs to be re-worked from first principles.

Some user stories off the top of my head:

  1. I know in advance that this relay is closed and want to quick-auth to reduce latency
  2. I didn't know this relay is closed, sent some messages, and now want to auth and continue
  3. I don't know what this relay is closed to and want to ask in advance what I'm allowed to do
  4. The relay worked for a while without auth, and now it's asking me to auth. Why?
  5. As a relay, the user is authenticated, but doesn't have permission to do X. I want to let them authenticate with an additional pubkey, which I want to specify.

We've talked about #1, it's functionally the same as #2 if relays buffer client messages until auth is done. #3 is probably not a real use case, and can be solved by just trying to do stuff, reading notices, and using the same workflow as #2. #4 and #5 are pretty advanced and could profit from the idea of this PR, but aren't really in the same spirit.

@arthurfranca
Copy link
Copy Markdown
Contributor Author

Buffering is great for the client.

  1. send REQ -> receive some permitted events then relay-AUTH -> send client-AUTH -> receive remaining events then EOSE
  2. send EVENT -> receive relay-AUTH before OK -> send client-AUTH -> receive successful OK

The only problem I see is: for how long should the relay wait for a client-AUTH response before timing out and sending a premature EOSE (or failure OK message)? What if the NIP-07 extension asks user's permission to sign an auth event and it takes 2 min to choose?

Example: Client sends a broad filter { author: ["x"], limit: 2 }. Relay runs the query and it streams 1 event from DB to the backend. It is a kind 4 DM that needs auth. Relay sends AUTH and waits 3 min for client AUTH that never came. The relay kept the DB connection open for 3 min. Other clients may do the same within the 3min window and consume all DB connections?

That's why i thought it could be better for the relay to release the DB connection, send an AUTH referencing the subscription id, which the client could consider a failed REQ it should retry after replying with AUTH.

@mikedilger
Copy link
Copy Markdown
Contributor

That's why i thought it could be better for the relay to release the DB connection, send an AUTH referencing the subscription id, which the client could consider a failed REQ it should retry after replying with AUTH.

Why can't it release the DB connection and keep a queue of requests not yet handled that it references once AUTH comes in?

@arthurfranca
Copy link
Copy Markdown
Contributor Author

If client asked for 1000 events, relay sent 10 then it found out the 11th event needs authentication, what the relay is supposed to do? If it releases the DB connection its gonna have to re-run the query from the start after client authenticates and the query result may have changed a bit in the meantime.

Well it could kinda work if the relay stores the event ids it had already sent to the client to skip them.

@arthurfranca
Copy link
Copy Markdown
Contributor Author

My previous example takes into account the fact that not always the relay will be able to tell if a filter would fetch an event that needs authentication (e.g.: a kind 4 event may appear even for filters without kinds: [4] set). The relay may find out the client should authenticate just in the middle of serving a REQ.

If the relay don't use a type of DB streaming, it would be holding all the 990 events (from above example) in memory until client authenticates. Or it would have to re-run the full query.

If the relay is gonna have to re-run the query anyway, using this PR makes it so the relay code doesn't change much and the responsibility to ask to re-run the query is shifted to the client.

But if you guys think buffering is better I can change the text.

42.md Outdated
Comment on lines +64 to +67
- `["AUTH", <challenge-string>, "sub:<subscription_id>"]`: It references a previous client `REQ` message with a specific subscription id as third array element.
It means client subscription requested atleast one event that requires prior authentication to be accessed.
MUST be sent before the `EOSE` message with same subscription id. The relay SHOULD send no events before `EOSE` because the client is expected to retry the `REQ` after authentication;
- `["AUTH", <challenge-string>, "evt:<event_id>"]`: It references a previous client `EVENT` message with a specific event id as third array element.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we separate the third argument into command and identifier? That way the AUTH message payload more explicitly points to a specific command that triggered auth. This could be extended cleanly to other things, like HASH or COUNT for example. So something like:

["AUTH", <challenge-string>, "REQ", <subscription_id>]
["AUTH", <challenge-string>, "EVENT", <event_id>]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better indeed

Comment on lines +71 to +72
*Note: The challenge string is the same during the whole WebSocket connection. Multiple relay `AUTH` messages on the same connection
carry the same challenge string.*
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit this is theoretical, but I think it could be useful to authenticate multiple times during a session without closing the connection and re-opening it (although that's probably a fine approach too). What about this:

The challenge string SHOULD be the same for the duration of the client session. Relays MAY change the challenge string if they wish to indicate that a new session with different authentication requirements has been initiated (for example when changing identities or making a claim with a separate identity).

For now, it's probably best to just keep this open-ended and say:

The challenge string SHOULD be the same for the duration of the client session.

Separately, it would be good to clarify that the relay can send multiple AUTH messages specifying different messages the client attempted to send:

Relays SHOULD issue the same challenge multiple times, each time indicating a different message the client has sent that was not accepted due to AUTH.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...for example when changing identities...

I think this was ditched in the past in favor of simply reconnecting. For example, #891 shouldn't be used as a way of switching accounts.

I prefer lowering complexity but I will change it to your version as a middle ground.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer lowering complexity but I will change it to your version as a middle ground.

No, I think we should leave this out for now. I think we'll want it eventually, but it's better to hew to YAGNI when possible. Re-connecting is a fine way to solve this in the near term.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok didn't change it

@staab
Copy link
Copy Markdown
Member

staab commented Nov 21, 2023

See my comment on #846, I think there's complexity in either direction, and this is probably the better approach.

42.md Outdated
If applicable, it MUST be sent right after WebSocket `connection`.
Client may have already sent a message before receiving this one, in that case relay MUST also send an `AUTH` message in one of the other forms below;
- `["AUTH", <challenge-string>, "sub:<subscription_id>"]`: It references a previous client `REQ` message with a specific subscription id as third array element.
It means client subscription requested atleast one event that requires prior authentication to be accessed.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we change this to "It means client subscription potentially requested at least one event"?

relays shouldn't be expected to run queries which could require auth. if clients are writing properly restricted filters, this should be easy for relays to immediately discern whether authentication is needed. if client's filters are too broad (ie don't specify kind) then that should be the client's problem, not the relay's problem.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea

42.md Outdated
MUST be sent before the `EOSE` message with same subscription id. The relay SHOULD send no events before `EOSE` because the client is expected to retry the `REQ` after authentication;
- `["AUTH", <challenge-string>, "evt:<event_id>"]`: It references a previous client `EVENT` message with a specific event id as third array element.
It means relay requires authentication before publishing any or atleast this kind of event.
MUST be sent before `OK` message with same event id.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the behavior here should be consistent with REQ auth flow. can we add "Client is expected to retry the EVENT after authentication"?

also maybe specify that OK message should have value of false.

@monlovesmango
Copy link
Copy Markdown
Member

agree with @staab this is the better approach of the two PRs. thanks for your persistence in trying to improve nip42 @arthurfranca !

@arthurfranca arthurfranca reopened this Nov 21, 2023
@fiatjaf
Copy link
Copy Markdown
Member

fiatjaf commented Nov 27, 2023

@arthurfranca @mikedilger @staab instead of changing AUTH, what do you think of coming up with a new message for rejecting REQs similar to OK, with a human-machine readable "reason", and it could include some auth requirements in it?

It could be something like

["REQ", "xyz", {}]
["NO", "xyz", "unsupported: query is too broad, can't handle it"]

or

["REQ", "xyz", {"kinds": [4]}]
["NO", "xyz", "auth-required: private messages require authentication", "<challenge>"]

(The fourth parameter would depend on the reason prefix of the third or something like that).

@arthurfranca
Copy link
Copy Markdown
Contributor Author

Not bad. But what if a paid relay requires auth for everything, including EVENT and COUNT?

@fiatjaf
Copy link
Copy Markdown
Member

fiatjaf commented Nov 27, 2023

OK, so we can forget about the fourth parameter, this makes things even easier:

query:

["REQ", "xyz", {"kinds": [4]}]
["AUTH", "<challenge>"]
["NO", "xyz", "auth-required: private messages require authentication"]

publishing an event:

["EVENT", {"id": "abcdef...", ...}]
["AUTH", "<challenge>"]
["OK", "abcdef...", false, "auth-required: publishing requires authentication"]

Clients can just store a single AUTH challenge for each relay. Whenever they get an OK-false or a NO message with the "auth-required: " prefix they can just check if they have an auth challenge stored for that relay and decide if they will continue with the auth flow.

Receiving a NO effectively terminates the REQ, so clients can perform the AUTH and then send the REQ again.

@monlovesmango
Copy link
Copy Markdown
Member

whats the advantage of this strategy over adding third auth element?

if you think we need some kind of "NO" message anyway and this is a good place to utilize it I would tend to agree. rather than "NO" I think "ERR", "FAIL", "REJ", or "DENIED" would be more descriptive message labels. might also be beneficial to have standardized error codes so clients can handle different errors precisely? otherwise people will likely start writing whatever messages they want when rejecting a REQ (as some NOTICEs are actually more appropriate as a REQ rejection).

@staab
Copy link
Copy Markdown
Member

staab commented Nov 27, 2023

instead of changing AUTH, what do you think of coming up with a new message

I wouldn't mind a more completely specified machine-readable version of NOTICE, which is currently only machine-readable because of relay implementation details. But AUTH is the way to send a challenge. Correlating NO messages and AUTH challenges to create a good UX seems like unnecessary busy work. If AUTH with a challenge wasn't already in use I could see doing something like:

["DENIED", "<command-name>", "<command-id>", "<machine-readable-reason>", "<human-readable error message>"]

But this is essentially the same as this proposed extension to AUTH, except that it includes more information. I can't think of an example of when a relay would want to grant entry using any other mechanism than auth.

@monlovesmango
Copy link
Copy Markdown
Member

monlovesmango commented Nov 27, 2023

I can't think of an example of when a relay would want to grant entry using any other mechanism than auth.

true, but I can think of examples for when a relay would deny a req for another reason other than AUTH:

  • duplicate subscription
  • too many subscriptions
  • overly broad search
  • too many pubkeys/ids or filter size is too large
  • rate limiting

@fiatjaf
Copy link
Copy Markdown
Member

fiatjaf commented Nov 27, 2023

whats the advantage of this strategy over adding third auth element?

The advantage is what you pointed yourself in your next comment: that it supports other kinds of request failures that today go completely unaccounted for.

rather than "NO" I think "ERR", "FAIL", "REJ", or "DENIED" would be more descriptive message labels.

doesn't matter. I think NO looks prettier but I'm fine with these. Except "ERR", because that's too generic.

Correlating NO messages and AUTH challenges to create a good UX seems like unnecessary busy work.

I just described how to do that in pseudocode above and it wasn't hard at all as long as the AUTH message comes before the NO. Maybe I missed something, but it would be nice if you pointed what I missed.

@fiatjaf
Copy link
Copy Markdown
Member

fiatjaf commented Nov 27, 2023

Basically we have two distinct problems:

  1. knowing why a relay is asking for AUTH
  2. knowing why a subscription has failed

adding a third element to AUTH only solves them in the occasion that the two happen together.

creating a new message solves 2 and making AUTH be subservient to that (and also to OK) solves 1.

@arthurfranca
Copy link
Copy Markdown
Contributor Author

It does work with ["AUTH", "<challenge>"] meaning "here's the challenge just in case you need it" instead of "authenticate now!" and then later a NO/DENIED or a failure OK message with auth-required code linked to a specific client message by subid/eventid.

["DENIED", "<command-name>", ...

Now that we are using regular OK msg for client EVENT msg response, I guess maybe we don't really need to specify the command name anymore and just rely on REQ/COUNT/ANY_FUTURE_CLIENT_MESSAGE's (subscription) id.

@monlovesmango
Copy link
Copy Markdown
Member

I do like @staab 's suggestion of having a machine-readable-reason.

@staab
Copy link
Copy Markdown
Member

staab commented Nov 27, 2023

true, but I can think of examples for when a relay would deny a req for another reason other than AUTH:

  • duplicate subscription
  • too many subscriptions
  • overly broad search
  • too many pubkeys/ids or filter size is too large
  • rate limiting

Those are all arguably programmer errors, and should be covered by NOTICE. If a client dev is going to the trouble of parsing those errors and handling them, they should just cover those edge cases instead.

I just described how to do that in pseudocode above and it wasn't hard at all as long as the AUTH message comes before the NO.

Not hard, just unnecessary busy work.

Ironically I think I'm defending fiatjaf's usual position of avoiding premature abstraction. I don't see a need for a generic machine-readable error mechanism, because AUTH covers all the use cases where a user can actually do something. Even if we admit some of the use cases above (rate limiting maybe), now we're specifying something extremely abstract, and should be careful not to create another overly-broad overloaded solution.

@monlovesmango
Copy link
Copy Markdown
Member

Those are all arguably programmer errors, and should be covered by NOTICE. If a client dev is going to the trouble of parsing those errors and handling them, they should just cover those edge cases instead

its impossible to cover all edge cases when the edge cases vary from relay to relay. and right now I doubt any client dev is parsing those errors and handling them because its impossible to maintain due to arbitrary notice messages. having machine readable reason would at least allow this as a possibility.

@fiatjaf
Copy link
Copy Markdown
Member

fiatjaf commented Nov 27, 2023

Ironically I think I'm defending fiatjaf's usual position of avoiding premature abstraction. I don't see a need for a generic machine-readable error mechanism, because AUTH covers all the use cases where a user can actually do something. Even if we admit some of the use cases above (rate limiting maybe), now we're specifying something extremely abstract, and should be careful not to create another overly-broad overloaded solution.

I appreciate that, but I think it won't hurt at all to have a message akin to OK but specifically for REQs.

Before the invention of OKs my relay implementation used NOTICE for everything that went wrong when receiving an EVENT from a client. Now it uses OK. However, for all failures related to REQ my relay implementations return a NOTICE, which unfortunately or fortunately all clients have decided to ignore.

Anyway, I don't think the introduction of OK has hurt anyone, it improved everybody's code an gave everybody's a little more security and sleep cycles. For example: before OK, my client library implementations would open a REQ to the same relay they had just published an event and see if they got the same event back in order to confirm if the event was published. If the publication failed for some reason then it wouldn't know and was unable to notify anyone, then I added a timeout, but deciding how long the timeout should be was a problem: if it was too small then maybe it would miss relays that were just slow, but if it was too big then the UX would suffer when users could just be notified their event wasn't stored right away.

I think having a generic message for failures in REQs will have the same good effects, and the fact that it ties nicely into AUTH also solves the problem of AUTH not being used by anyone.

@fiatjaf
Copy link
Copy Markdown
Member

fiatjaf commented Nov 27, 2023

I've published #902, which I think solves everything.

It also solves another problem that I have faced myself: what to do when your relay doesn't support ongoing subscriptions, but just immediate sync queries? The answer is: return all the events from the sync queries, then an EOSE, then a CLOSED. This allows clients to disconnect if they want and not keep that connection open with the hopes of ever receiving an event from a subscription, less energy consumption, less global warming.

@arthurfranca
Copy link
Copy Markdown
Contributor Author

Closing in favor of #902

@jb55
Copy link
Copy Markdown
Contributor

jb55 commented Nov 27, 2023 via email

@arthurfranca arthurfranca deleted the nip-42-auth-id branch May 9, 2024 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants