Rewrites Site definition#14356

Merged
ddbeck merged 6 commits into mdn:main from NiedziolkaMichal:patch-6
Mar 30, 2022

@NiedziolkaMichal
Member

Summary

I have added a lot more information about how "site" is determined, and noted that the scheme may or may not be part of the site.

Motivation

Supporting details

web.dev same-site page

Related issues

#13612

Metadata

  • Adds a new document
  • Rewrites (or significantly expands) a document
  • Fixes a typo, bug, or other error

@NiedziolkaMichal NiedziolkaMichal requested a review from a team as a code owner March 27, 2022 00:11
@NiedziolkaMichal NiedziolkaMichal requested review from ddbeck and removed request for a team March 27, 2022 00:11
@github-actions github-actions bot added the Content:Glossary Glossary entries label Mar 27, 2022
@github-actions
Contributor

github-actions bot commented Mar 27, 2022

Preview URLs

Flaws

None! 🎉

External URLs

URL: /en-US/docs/Glossary/Site
Title: Site
on GitHub

(this comment was updated 2022-03-30 16:45:52.127913)

Collaborator

@wbamberg wbamberg left a comment

Thanks for your PR! I agree that this is a good change to make but I think in some ways this entry is going to be hard for people to understand. It might be helpful to:

  • start with an informal definition of "site"
  • explain why we need a more formal one
  • explain what the more formal one is.

Something like:


Informally, a site is a website, which is a collection of web pages, served from the same domain, and maintained by a single organization.

Browsers sometimes need to distinguish precisely between different sites. For example, the browser must only send SameSite cookies to the same site that set them.

For this more precise definition a site is determined by the registrable domain portion of the domain name. The registrable domain consists of an entry in the Public Suffix List plus the portion of the domain name just before it. This means that, for example, "theguardian.co.uk", "sussex.ac.uk", and "bookshop.org" are all registrable domains.

According to this definition, "support.mozilla.org" and "developer.mozilla.org" are part of the same site, because "mozilla.org" is a registrable domain.

In some contexts the scheme is also considered when differentiating sites. This would make "http://vpl.ca" and "https://vpl.ca" different sites. Including the scheme is useful in particular because it prevents an insecure (http) site from being treated as the same site as a secure (https) site. This is sometimes called a "schemeful same-site". This stricter definition is applied in the rules for handling SameSite cookies.


...or something with this rough organization anyway.
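
For reviewers who want to check the examples in the proposed wording, here is a minimal sketch of the registrable-domain rule. It is not how any browser implements it: `PUBLIC_SUFFIXES` is a hand-picked, hypothetical subset of the Public Suffix List, the function names are made up for illustration, and the real list's wildcard and exception rules are ignored.

```python
from typing import Optional

# ASSUMPTION: a tiny illustrative subset of the Public Suffix List.
# The real list is far larger and has wildcard/exception rules not handled here.
PUBLIC_SUFFIXES = {"com", "org", "ca", "co.uk", "ac.uk"}


def registrable_domain(host: str) -> Optional[str]:
    """Return the longest matching public suffix plus the label just before it."""
    labels = host.lower().rstrip(".").split(".")
    # Scanning from the left tries the longest candidate suffix first.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            # i == 0 means the host itself is a public suffix: no registrable domain.
            return ".".join(labels[i - 1:]) if i > 0 else None
    return None  # no known public suffix in this subset


def same_site(host_a: str, host_b: str) -> bool:
    """Scheme-less comparison: two hosts share a site if their registrable domains match."""
    domain_a = registrable_domain(host_a)
    return domain_a is not None and domain_a == registrable_domain(host_b)
```

With this subset, `same_site("support.mozilla.org", "developer.mozilla.org")` is true because both reduce to `mozilla.org`, while `mozilla.org` and `mozilla.com` do not match.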

- WebMechanics
---
The _site_ of a piece of web content is determined by the _registrable domain_ of the host within the origin. This is computed by consulting a _Public Suffix List_ to find the portion of the host which is counted as the _public suffix_ (e.g. `com`, `org` or `co.uk`).
The _site_ is part of a domain name, that identifies a single entity, by considering only public suffix and part of the domain just before it. Typically a public suffix consists only of {{Glossary("TLD")}}, which makes `mozilla.org` along with all its subdomains the `same-site`, while `google.org` or `mozilla.com` are considered `cross-sites`. However a public suffix might consist of multiple levels (e.g. `co.uk`, `qld.edu.au`). That's why _site_ is determined by a query to the [Public Suffix List](https://publicsuffix.org/list/) maintained by Mozilla volunteers.
Collaborator

I think it's going to be confusing to many people to start off by saying "The site is part of a domain name". It might be more accurate to say that the site is determined by part of the domain name.

The _site_ of a piece of web content is determined by the _registrable domain_ of the host within the origin. This is computed by consulting a _Public Suffix List_ to find the portion of the host which is counted as the _public suffix_ (e.g. `com`, `org` or `co.uk`).
The _site_ is part of a domain name, that identifies a single entity, by considering only public suffix and part of the domain just before it. Typically a public suffix consists only of {{Glossary("TLD")}}, which makes `mozilla.org` along with all its subdomains the `same-site`, while `google.org` or `mozilla.com` are considered `cross-sites`. However a public suffix might consist of multiple levels (e.g. `co.uk`, `qld.edu.au`). That's why _site_ is determined by a query to the [Public Suffix List](https://publicsuffix.org/list/) maintained by Mozilla volunteers.

Depending on the document, a used scheme may also be part of a _site_. According to this definition, `https://mozilla.org` and `http://mozilla.org` are `cross-site` just because protocol differs. To avoid confusion it is a good practice to include information about how the scheme is treated, by stating either `schemeful site` or `scheme-less site`.
Collaborator

Is it depending on the document? I would have thought it depended on the application (i.e. the purpose for which the browser is deciding whether 2 URLs are same-site).

Member Author

By document I meant the WHATWG standard. Application is surely a better word to describe it.
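
To make the "depends on the application" point concrete, here is a rough sketch in which the caller chooses whether the scheme counts. The `schemeful` flag and both function names are hypothetical, and `naive_registrable_domain` is a loud simplification that only holds for single-label public suffixes such as `org` or `ca`; a real check would consult the Public Suffix List.

```python
from urllib.parse import urlsplit


def naive_registrable_domain(host: str) -> str:
    # ASSUMPTION: the public suffix is a single label (e.g. "org", "ca").
    # This is wrong for suffixes like "co.uk"; it stands in for a real
    # Public Suffix List lookup.
    return ".".join(host.lower().split(".")[-2:])


def same_site(url_a: str, url_b: str, *, schemeful: bool) -> bool:
    """Compare two URLs; the caller (the application) decides whether the scheme counts."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    if not a.hostname or not b.hostname:
        return False  # no host to compare
    if schemeful and a.scheme != b.scheme:
        return False  # schemeful: http and https are different sites
    return naive_registrable_domain(a.hostname) == naive_registrable_domain(b.hostname)
```

Under these assumptions, `same_site("http://vpl.ca", "https://vpl.ca", schemeful=False)` is true, while the same pair with `schemeful=True` is false.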

@NiedziolkaMichal
Member Author

Thank you for the feedback, your definition is much better. Could you make a commit, replacing my changes with yours?

@wbamberg
Collaborator

Thank you for the feedback, your definition is much better. Could you make a commit, replacing my changes with yours?

Thank you @NiedziolkaMichal ! Could you please check that I haven't mangled what you wanted to express here?

@NiedziolkaMichal
Member Author

I have, it's perfect.

Contributor

@ddbeck ddbeck left a comment

I'm proposing a few minor fixes, mostly to use just one way to format phrases and domains and URLs. Otherwise, I like how this looks! 👍

wbamberg and others added 4 commits March 30, 2022 09:42
Co-authored-by: Daniel D. Beck <[email protected]>
Co-authored-by: Daniel D. Beck <[email protected]>
Co-authored-by: Daniel D. Beck <[email protected]>
Co-authored-by: Daniel D. Beck <[email protected]>
@wbamberg
Collaborator

I never know how to format things like this. But happy with your suggestions :).

Contributor

@ddbeck ddbeck left a comment

Thank you @NiedziolkaMichal and @wbamberg—this is a really nice update. Plus I now have an actual clue about how a same site works.

@ddbeck ddbeck dismissed wbamberg’s stale review March 30, 2022 18:23

became an author

@ddbeck ddbeck merged commit 64d9c89 into mdn:main Mar 30, 2022
@hamishwillee
Collaborator

Yes, this is a really excellent update.
