
Caddy was on a feature freeze before the 2.8 release #5704

@mholt

Description


It's been figuratively sticky-noted right in front of my eyes ever since releasing Caddy 2: we need more tests.

Now that 2.7 is done, I want to be confident about our code base before we release updates.


UPDATE, NOV. 2023: See my comment below; life had other plans, so I'm thawing the feature freeze for practical reasons. We will revisit testing next year.


Together with finally focusing on the new Caddy website, which I've been working on only occasionally since the beginning of this year, I will be dedicating my remaining development energy to overhauling and improving Caddy's testing situation.

To be clear, we have a variety of tests today. Many functional units are tested, some quite thoroughly. (See any _test.go file in our repo.) And we have a whole directory tree dedicated to testing Caddy, especially a large pile of Caddyfile adapter tests. And our tests have saved us many times and prevented many problems.
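For readers unfamiliar with Go conventions, the _test.go files mentioned above typically contain table-driven tests. Here's a minimal sketch of the pattern, written as a standalone program so it's self-contained; the function under test, parseHostPort, is a made-up stand-in, not actual Caddy code:

```go
package main

import (
	"fmt"
	"strings"
)

// parseHostPort is a hypothetical function under test: it splits
// "host:port" and defaults the port to "80" when none is given.
func parseHostPort(addr string) (host, port string) {
	if i := strings.LastIndex(addr, ":"); i >= 0 {
		return addr[:i], addr[i+1:]
	}
	return addr, "80"
}

func main() {
	// Table-driven test: each case pairs an input with its expected output.
	cases := []struct {
		input    string
		wantHost string
		wantPort string
	}{
		{"example.com:8080", "example.com", "8080"},
		{"example.com", "example.com", "80"},
		{"localhost:443", "localhost", "443"},
	}
	for i, tc := range cases {
		host, port := parseHostPort(tc.input)
		if host != tc.wantHost || port != tc.wantPort {
			panic(fmt.Sprintf("case %d: got (%q, %q), want (%q, %q)",
				i, host, port, tc.wantHost, tc.wantPort))
		}
	}
	fmt.Println("all cases pass")
}
```

In a real _test.go file the loop body would call t.Errorf on a *testing.T instead of panicking, and new cases are added by appending one row to the table.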

But we know now that this is not enough. The 2.7 release is evidence of that: it took us 4 releases (2.7.0, then 3 hotfixes) to get it right. Web servers are hard and complicated, there are a bajillion UI and API surfaces (which, by the way, is the reason using semver for Caddy 2 causes me great anguish -- but I digress), and several years of production experience with Caddy 2 have taught us the nuances and needs of production deployments. It's getting easier to break things.

I'm putting a feature freeze on Caddy until after the 2.8 release so we can focus on testing. Caddy 2.8 will be all about tests: testing harnesses, test APIs for module developers, unit tests, integration tests, configuration tests, etc...

During this feature freeze, we'll refrain from making unnecessary changes to the code base. "Necessary" changes are defined as simple, unintrusive bug fixes only. The main exception is severe bugs (crashes, memory leaks, security vulnerabilities): we'll do what we have to do to patch those without delay.

Test code is obviously exempt from this freeze. The whole point is to build out testing features, write more tests, etc.

A few other things that are probably OK during the freeze:

  • Typo corrections
  • Anything pertaining only to comments
  • Localized code cleanup (for example, code marked with a TODO that was needed only for an older version of Go that we no longer use can be removed)
  • Other very minor, sensible improvements

I can't predict what kinds of situations we'll encounter in the future, so this feature freeze is not an absolute, hard rule. It's merely a guideline to keep us focused on the goal for 2.8: move slow, don't break things, and make a robust test suite. The rules may change throughout the freeze based on discussion and consensus.

One other thing we might do during this freeze is extend the power of testing from just Caddy developers ("compile-time") to site owners and production instances ("deploy-time"). For example, site owners could write a set of tests that a new config has to pass in order to be loaded; Caddy could run these tests automatically during a config reload. This is clearly a user-facing feature, but it also gives users strong guarantees through testing. So it might be acceptable in 2.8. We'll see.

What about dependencies? I think it depends (get it?). Generally it's a good idea to keep dependencies up-to-date for security and performance reasons. Of course, upgrading dependencies can also cause bugs. In general, I think we'll decide these on a case-by-case basis. For example, it's important for us to keep acmez, CertMagic, x/net, and x/crypto up-to-date. We should probably get in the habit of upgrading dependencies just after releases, rather than just before. But also, it shouldn't matter with enough thorough, automated tests. That's the goal: worry-free changes and dependency updates because we're confident testing can catch problems.

What about sponsor requests? We can still accommodate requirements of sponsors (according to their tier). We just may keep their features in a branch until after 2.8.


By the way, I have no interest in test coverage. That sounds strange, but what I mean is that I have no interest in measuring the "test coverage" metric. You know, counting the lines that are evaluated during a test run. To me, that metric is worthless. I can write tests that have 100% coverage, but make no assertions, leaving my code no better off (in fact it's worse, with a false sense of security). I think test coverage tools can be helpful to know what lines of code affect a particular test case, but I don't care about achieving a metric. Instead, I just want lots of tests. I want to cover inputs more than outputs if that makes sense. The possible combinations that Caddy can encounter are intractable. But we can strive to cover as many configurations and flows as possible.
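To illustrate that point with a contrived example (all names made up): both "tests" below execute every line of add, so a coverage tool credits each with 100% coverage of it, but only the one that asserts can actually fail.

```go
package main

import "fmt"

// add is deliberately buggy: it should add, but subtracts.
func add(a, b int) int {
	return a - b // bug
}

// coverageOnly executes every line of add but asserts nothing,
// so it "covers" 100% of the function and still passes.
func coverageOnly() bool {
	add(2, 3)
	return true // always "passes"
}

// realTest makes an assertion, so it catches the bug.
func realTest() bool {
	return add(2, 3) == 5
}

func main() {
	fmt.Println("coverage-only test passes:", coverageOnly()) // true
	fmt.Println("asserting test passes:", realTest())         // false: bug caught
}
```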


Thanks for joining us on this journey; I look forward to an even more reliable, more robust web server when we're done!

Feel free to discuss below if you have any thoughts or ideas to contribute.
