Conversation
|
Question: do we want to add an example for this or will the documentation be enough? |
lpil
left a comment
There was a problem hiding this comment.
Thank you!
Do we not want to always use etags?
What's the use case for setting headers, and the use case for setting headers only for certain file extensions?
|
@lpil Great questions. Apologies in advance for the short novel, I should have included more context in the PR description.
No, not necessarily. Using etags as a caching strategy is a trade off, like most things. Etag generation is not that expensive but it's not free. The browser will also always have to hit the server to make sure the resource is still valid. That is faster than downloading the file again but slower than the browser using the cached version without verifying it first.
The use case for setting headers only for certain file extensions is, generally you don't want to cache markup files for too long but you do want to cache things like JS and CSS. As an example, say you have an To take a step back, there are two main use cases this PR is trying to address and they are, from what I can tell, the two more common scenarios where you'd want caching. Caching it too complicated for us to support every possible option. Use Case 1A user is serving a site, webapp or SPA that is built using a build tool (bias: my use case). The build tool fingerprints the files, so the names change when the contents change. This seems to be normal for a lot of frameworks. In this case, you can safely tell the browser to cache the JS and CSS for the longest possible time, as long as you don't cache your index.html or equivalent entry point. You can do this by setting the cache-control header to something like: Use Case 2The second use case is, you don't have unique filename fingerprinting but you still want to cache assets to speed up page loads. This is where you reach for Etags. You get most of the performance of not having to re-download all the JS on page reload but without having to use a build tool to fingerprint the file names. The downside is, the browser will still have to hit the server to check the Etag values. So you don't save as much bandwidth, time and compute as the first option but still save a lot compared to not caching anything. The use case for supporting both Etags AND setting custom headers is, you can configure how long the browser will wait to re-validate assets with an Etag or use ConclusionI hope the reasoning here is a bit clearer. It's a messy topic that has more than one way of solving, so I tried to go with the most flexible option. I did also take inspiration from other frameworks that allow setting headers separately from toggling Etags, such as hono, express. There was a bigger list in the issue as well. |
|
Thank you!
Avoiding a single
I can see why you'd want to add cache headers, but the API given seems either too general or too restrictive. I think we should either have specific support for caching headers, or it should be a much more flexible API than having a fixed set of any headers to be attached to a single fixed set of file extensions. How about we be conservative with the scope of this PR and focus on adding etags, and then add add a default cache-control, similar to how Plug.Static does it https://hexdocs.pm/plug/1.15.3/Plug.Static.html#module-cache-mechanisms. To make the API forwards-compatible let's use a builder API for the static asset configuration. Then we can add more functionality to it in future without doing a major version bump. |
A fair point, I agree. My main motivation was more, I want to set the cache-control headers directly to avoid the issue. Perhaps following Plug and making etags a default that can be overridden in the future is the best path forward.
I think this is a good compromise. I wasn't aware of Plug but I like how they handle the config. I do still think it's important to have control over setting response headers and controlling what files get served, but I agree my proposed solution isn't totally satisfying either. I think we could use Plug as an example rather than trying to re-invent. Let's stick with Etags for now and figure out the rest after.
I hadn't considered the builder pattern but that's a great idea. In terms of inspiration, would it be good to look at how POG handles config? |
|
Pog would be a good one to look at, aye! I'm not sure if it would be in a different module or not though. |
I was wondering the same. It would be cleaner to have it as a module but maybe we can leave it in the main wisp module and see what it looks like? Then we can move it pretty easily if we want before merging. So to recap on what the plan is. We're still adding the I'm imagining something roughly like the following: use <- wisp.serve_static_with(
req,
under: "/static",
from: priv,
config: wisp.default_static_config()
)or manually setting the config like: let config = wisp.default_static_config()
|> wisp.with_etag_generation
use <- wisp.serve_static_with(req, under: "/static", from: priv, config:)Thoughts on the above before I start working on it? |
No, we want to always have etag generation on as we've not found any reason to not have it. If we have further configuration we want to have a builder API, not a configuration record as a final argument to a new |
|
Ah OK, I misunderstood. So I'll just add etag generation to the existing
OK sure. I don't have a clear picture of what that would look like right now but it's not important for this PR. |
|
@lpil I've made that change now |
|
Updated. I think we should be more or less there now. |
lpil
left a comment
There was a problem hiding this comment.
Looks great! One last thing
src/wisp.gleam
Outdated
| case simplifile.is_file(path) { | ||
| Ok(True) -> | ||
| case simplifile.file_info(path) { | ||
| Ok(file_info) -> { |
There was a problem hiding this comment.
Could you add a test to make sure that if the file is a directory then it returns a 404 🙏
There was a problem hiding this comment.
Great catch. This actually didn't work the way it looked like it did. We still have to check file_info for if it's a file or directory.
I added the extra check, it's not as clean looking as before but file_info_type just does a bitwise op on file_info.mode. So we're still only reading the file info once. Let me know what you think.
There was a problem hiding this comment.
Also, based on the current functionality, serve_static returns OK with an empty body if the file can't be found. So I added a test to make sure the same happens with a directory.
test/wisp_test.gleam
Outdated
|
|
||
| // Get a directory | ||
| let response = | ||
| testing.get("/stuff/", []) |
There was a problem hiding this comment.
This directory doesn't exist, so this test doesn't test what it claims to test. It's instead a file-not-found test.
Please move this into a new test (as each test should only test one thing) and also assert that the directory exists, to catch this problem should directory not exist by mistake.
There was a problem hiding this comment.
The directory does exist. In this test /stuff is mapped to the root directory. Removing the file check causes this to fail, confirming it.
But that aside, fair point. I've moved the directory test to its own test and added a check to assert what's being requested is actually a directory. I also changed it to request the /test directory to be more similar to the other tests, which request files inside that one.
There was a problem hiding this comment.
Oh! Sorry about that, I misread the code
Resolves #103
Adds a new function
serve_static_withwhich takes options for enabling etags and setting response headers for specific file types.Context
There are two main use cases this PR is trying to address and they are, from what I can tell, the two more common scenarios where you'd want caching. Caching it too complicated for us to support every possible option.
Use Case 1
A user is serving a site, webapp or SPA that is built using a build tool (bias: my use case). The build tool fingerprints the files, so the names change when the contents change. This seems to be normal for a lot of frameworks. In this case, you can safely tell the browser to cache the JS and CSS for the longest possible time, as long as you don't cache your index.html or equivalent entry point. You can do this by setting the cache-control header to something like:
cache-control: max-age=31536000, immutable, private. This provides the best performance because the browser will instantly use the cached JS/CSS and will only request new versions if something changes in the index.html file.Use Case 2
The second use case is, you don't have unique filename fingerprinting but you still want to cache assets to speed up page loads. This is where you reach for Etags. You get most of the performance of not having to re-download all the JS on page reload but without having to use a build tool to fingerprint the file names. The downside is, the browser will still have to hit the server to check the Etag values. So you don't save as much bandwidth, time and compute as the first option but still save a lot compared to not caching anything. The use case for supporting both Etags AND setting custom headers is, you can configure how long the browser will wait to re-validate assets with an Etag or use
cache-control: stale-while-revalidate=<some_value>to allow the browser to use old assets while waiting for the validation to happen.Conclusion
I hope the reasoning here is a bit clearer. It's a messy topic that has more than one way of solving, so I tried to go with the most flexible option. I did also take inspiration from other frameworks that allow setting headers separately from toggling Etags, such as hono, express. There was a bigger list in the issue as well.