Skip to content

Feat: add serve static with#113

Merged
lpil merged 10 commits intogleam-wisp:mainfrom
ross-byrne:feat-add-serve-static-with
Mar 21, 2025
Merged

Feat: add serve static with#113
lpil merged 10 commits intogleam-wisp:mainfrom
ross-byrne:feat-add-serve-static-with

Conversation

@ross-byrne
Copy link
Copy Markdown
Contributor

@ross-byrne ross-byrne commented Mar 12, 2025

Resolves #103

Adds a new function serve_static_with which takes options for enabling etags and setting response headers for specific file types.

Context

There are two main use cases this PR is trying to address and they are, from what I can tell, the two more common scenarios where you'd want caching. Caching it too complicated for us to support every possible option.

Use Case 1

A user is serving a site, webapp or SPA that is built using a build tool (bias: my use case). The build tool fingerprints the files, so the names change when the contents change. This seems to be normal for a lot of frameworks. In this case, you can safely tell the browser to cache the JS and CSS for the longest possible time, as long as you don't cache your index.html or equivalent entry point. You can do this by setting the cache-control header to something like: cache-control: max-age=31536000, immutable, private. This provides the best performance because the browser will instantly use the cached JS/CSS and will only request new versions if something changes in the index.html file.

Use Case 2

The second use case is, you don't have unique filename fingerprinting but you still want to cache assets to speed up page loads. This is where you reach for Etags. You get most of the performance of not having to re-download all the JS on page reload but without having to use a build tool to fingerprint the file names. The downside is, the browser will still have to hit the server to check the Etag values. So you don't save as much bandwidth, time and compute as the first option but still save a lot compared to not caching anything. The use case for supporting both Etags AND setting custom headers is, you can configure how long the browser will wait to re-validate assets with an Etag or use cache-control: stale-while-revalidate=<some_value> to allow the browser to use old assets while waiting for the validation to happen.

Conclusion

I hope the reasoning here is a bit clearer. It's a messy topic that has more than one way of solving, so I tried to go with the most flexible option. I did also take inspiration from other frameworks that allow setting headers separately from toggling Etags, such as hono, express. There was a bigger list in the issue as well.

@ross-byrne
Copy link
Copy Markdown
Contributor Author

@lpil Here's the PR for #103. Feedback is probably needed here, particularly around naming and the new types being introduced.

Thanks 👍

@ross-byrne
Copy link
Copy Markdown
Contributor Author

Question: do we want to add an example for this or will the documentation be enough?

Copy link
Copy Markdown
Collaborator

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Do we not want to always use etags?

What's the use case for setting headers, and the use case for setting headers only for certain file extensions?

@ross-byrne
Copy link
Copy Markdown
Contributor Author

@lpil Great questions. Apologies in advance for the short novel, I should have included more context in the PR description.

Do we not want to always use etags?

No, not necessarily. Using etags as a caching strategy is a trade off, like most things. Etag generation is not that expensive but it's not free. The browser will also always have to hit the server to make sure the resource is still valid. That is faster than downloading the file again but slower than the browser using the cached version without verifying it first.

What's the use case for setting headers, and the use case for setting headers only for certain file extensions?

The use case for setting headers only for certain file extensions is, generally you don't want to cache markup files for too long but you do want to cache things like JS and CSS. As an example, say you have an index.html that references different JS and other files. You might want to cache those other resources but not the markup, so when you update the index.html and it references the new updated files, the browser will pull them down instead of using the cached versions (if the names are different). If the markup is cached, you basically can't get changes to the user without coding some specific solution.

To take a step back, there are two main use cases this PR is trying to address and they are, from what I can tell, the two more common scenarios where you'd want caching. Caching it too complicated for us to support every possible option.

Use Case 1

A user is serving a site, webapp or SPA that is built using a build tool (bias: my use case). The build tool fingerprints the files, so the names change when the contents change. This seems to be normal for a lot of frameworks. In this case, you can safely tell the browser to cache the JS and CSS for the longest possible time, as long as you don't cache your index.html or equivalent entry point. You can do this by setting the cache-control header to something like: cache-control: max-age=31536000, immutable, private. This provides the best performance because the browser will instantly use the cached JS/CSS and will only request new versions if something changes in the index.html file.

Use Case 2

The second use case is, you don't have unique filename fingerprinting but you still want to cache assets to speed up page loads. This is where you reach for Etags. You get most of the performance of not having to re-download all the JS on page reload but without having to use a build tool to fingerprint the file names. The downside is, the browser will still have to hit the server to check the Etag values. So you don't save as much bandwidth, time and compute as the first option but still save a lot compared to not caching anything. The use case for supporting both Etags AND setting custom headers is, you can configure how long the browser will wait to re-validate assets with an Etag or use cache-control: stale-while-revalidate=<some_value> to allow the browser to use old assets while waiting for the validation to happen.

Conclusion

I hope the reasoning here is a bit clearer. It's a messy topic that has more than one way of solving, so I tried to go with the most flexible option. I did also take inspiration from other frameworks that allow setting headers separately from toggling Etags, such as hono, express. There was a bigger list in the issue as well.

@lpil
Copy link
Copy Markdown
Collaborator

lpil commented Mar 14, 2025

Thank you!

Etag generation is not that expensive but it's not free. The browser will also always have to hit the server to make sure the resource is still valid. That is faster than downloading the file again but slower than the browser using the cached version without verifying it first.

Avoiding a single file_info call doesn't seem like a good justification, especially since the previous implementation already did it once to determine if the file exists, and the new one does it twice. We could use that first read for both, and then always set etags.

To take a step back, there are two main use cases this PR is trying to address and they are, from what I can tell, the two more common scenarios where you'd want caching. Caching it too complicated for us to support every possible option.

I can see why you'd want to add cache headers, but the API given seems either too general or too restrictive. I think we should either have specific support for caching headers, or it should be a much more flexible API than having a fixed set of any headers to be attached to a single fixed set of file extensions.

How about we be conservative with the scope of this PR and focus on adding etags, and then add add a default cache-control, similar to how Plug.Static does it https://hexdocs.pm/plug/1.15.3/Plug.Static.html#module-cache-mechanisms.

To make the API forwards-compatible let's use a builder API for the static asset configuration. Then we can add more functionality to it in future without doing a major version bump.

@ross-byrne
Copy link
Copy Markdown
Contributor Author

Avoiding a single file_info call doesn't seem like a good justification, especially since the previous implementation already did it once to determine if the file exists, and the new one does it twice. We could use that first read for both, and then always set etags.

A fair point, I agree. My main motivation was more, I want to set the cache-control headers directly to avoid the issue. Perhaps following Plug and making etags a default that can be overridden in the future is the best path forward.

How about we be conservative with the scope of this PR and focus on adding etags, and then add add a default cache-control, similar to how Plug.Static does it https://hexdocs.pm/plug/1.15.3/Plug.Static.html#module-cache-mechanisms.

I think this is a good compromise. I wasn't aware of Plug but I like how they handle the config. I do still think it's important to have control over setting response headers and controlling what files get served, but I agree my proposed solution isn't totally satisfying either. I think we could use Plug as an example rather than trying to re-invent. Let's stick with Etags for now and figure out the rest after.

To make the API forwards-compatible let's use a builder API for the static asset configuration. Then we can add more functionality to it in future without doing a major version bump.

I hadn't considered the builder pattern but that's a great idea. In terms of inspiration, would it be good to look at how POG handles config?

@lpil
Copy link
Copy Markdown
Collaborator

lpil commented Mar 14, 2025

Pog would be a good one to look at, aye! I'm not sure if it would be in a different module or not though.

@ross-byrne
Copy link
Copy Markdown
Contributor Author

I'm not sure if it would be in a different module or not though

I was wondering the same. It would be cleaner to have it as a module but maybe we can leave it in the main wisp module and see what it looks like? Then we can move it pretty easily if we want before merging.

So to recap on what the plan is. We're still adding the serve_static_with function but it's going to take a StaticConfig (or something like that) that can be set using a builder pattern. The only setting being etag generation for now, which is on by default.

I'm imagining something roughly like the following:

use <- wisp.serve_static_with(
     req,
     under: "/static",
     from: priv,
     config: wisp.default_static_config()
)

or manually setting the config like:

let config = wisp.default_static_config()
  |> wisp.with_etag_generation

use <- wisp.serve_static_with(req, under: "/static", from: priv, config:)

Thoughts on the above before I start working on it?

@lpil
Copy link
Copy Markdown
Collaborator

lpil commented Mar 18, 2025

So to recap on what the plan is. We're still adding the serve_static_with function but it's going to take a StaticConfig (or something like that) that can be set using a builder pattern. The only setting being etag generation for now, which is on by default.

No, we want to always have etag generation on as we've not found any reason to not have it. If we have further configuration we want to have a builder API, not a configuration record as a final argument to a new _with function.

@ross-byrne
Copy link
Copy Markdown
Contributor Author

ross-byrne commented Mar 18, 2025

Ah OK, I misunderstood. So I'll just add etag generation to the existing serve_static function and not add anything new?

If we have further configuration we want to have a builder API, not a configuration record as a final argument to a new _with function.

OK sure. I don't have a clear picture of what that would look like right now but it's not important for this PR.

@ross-byrne
Copy link
Copy Markdown
Contributor Author

@lpil I've made that change now

Copy link
Copy Markdown
Collaborator

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! This is still reading for file information twice though, let's make it do it only once as discussed by removing the is_file check. Inlining the generate_etag function may make it clearer.

Could you update the changelog also please 🙏

@ross-byrne
Copy link
Copy Markdown
Contributor Author

Updated. I think we should be more or less there now.

Copy link
Copy Markdown
Collaborator

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! One last thing

src/wisp.gleam Outdated
case simplifile.is_file(path) {
Ok(True) ->
case simplifile.file_info(path) {
Ok(file_info) -> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test to make sure that if the file is a directory then it returns a 404 🙏

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch. This actually didn't work the way it looked like it did. We still have to check file_info for if it's a file or directory.

I added the extra check, it's not as clean looking as before but file_info_type just does a bitwise op on file_info.mode. So we're still only reading the file info once. Let me know what you think.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, based on the current functionality, serve_static returns OK with an empty body if the file can't be found. So I added a test to make sure the same happens with a directory.


// Get a directory
let response =
testing.get("/stuff/", [])
Copy link
Copy Markdown
Collaborator

@lpil lpil Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This directory doesn't exist, so this test doesn't test what it claims to test. It's instead a file-not-found test.

Please move this into a new test (as each test should only test one thing) and also assert that the directory exists, to catch this problem should directory not exist by mistake.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The directory does exist. In this test /stuff is mapped to the root directory. Removing the file check causes this to fail, confirming it.

But that aside, fair point. I've moved the directory test to its own test and added a check to assert what's being requested is actually a directory. I also changed it to request the /test directory to be more similar to the other tests, which request files inside that one.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! Sorry about that, I misread the code

Copy link
Copy Markdown
Collaborator

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much!!!

@lpil lpil merged commit 6083d4c into gleam-wisp:main Mar 21, 2025
1 check failed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Setting Cache-Control header with serve_static

2 participants