Skip to content

Address issues with /mdm/checkin endpoint#871

Merged
jessepeterson merged 1 commit intomicromdm:mainfrom
korylprince:pkcs7-errors
Apr 10, 2023
Merged

Address issues with /mdm/checkin endpoint#871
jessepeterson merged 1 commit intomicromdm:mainfrom
korylprince:pkcs7-errors

Conversation

@korylprince
Copy link
Contributor

This PR is to address some issues discovered while troubleshooting enrollment issues on the MacAdmins Slack:

  • The device cert pkcs7 verification currently used on the /mdm/checkin doesn't account for time skew
    • Since the device is generating the certificate on the fly, if the device is just a few seconds ahead of the server, the server current time will be outside of the certificate validity, and verification will fail
    • This is fixed by adding a 5 minute skew with VerifyWithChainAtTime
  • Most of the errors in the mdm Endpoints are silently dropped because the response encoder called the error encoder directly, instead of just returning the error so it would pass back through the error logger.
    • This is fixed by making the response encoder returning errors.

Copy link
Member

@jessepeterson jessepeterson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my observation here is that in however many years we've been going we've never had the timestamp issue come up on cert verification. Further I'd want to determine where 5 minutes comes from — did we just pick that? I think the SCEP server tries to issue certs from -10 minutes ago — but of course that's for new enrollments. And then if we did want an arbitrary time in there we should make it a documented const.

I'd probably learn towards not wanting to add a specific verification timestamp exemption — especially if errors will be properly logged now (i.e. the case that brought this up would probably have been caught).

@korylprince
Copy link
Contributor Author

Yeah the 5 minutes was a bit arbitrary, but a fairly safe bet. For instance, this is what Microsoft recommends. I added a commit to move it to a documented var.

I'm also surprised this is the first time we've seen it. I'm not sure, but it seems like the cert in question wasn't the SCEP cert, but a cert generated by the device (so maybe this was OTA enrollment? I don't really have any experience with that). The biggest reason I think that is because how else would this situation arise? If it was a SCEP cert, it would be generated by the server, so it wouldn't really matter what the device's time was. I think this would also answer the question of why it hasn't been seen yet: most people using MicroMDM are using DEP, so they would just use the SCEP cert to begin with. This particular issue occurred when someone was enrolling an Apple TV with Configurator.

Apologies if I'm not understanding these processes correctly; I've read through the wiki and the OTA code, but I'm still not 100% clear on the process. Would it even make sense to see a device-generated cert on the /mdm/checkin endpoint?

Assuming I am correct above, then adding a skew (of 5 minutes or whatever) makes a lot of sense. If a device is generating it's own cert on the fly, if it's just a few seconds ahead of the server, then it will be out of the validity period. When @tgunz synced his server with NTP, it was only 2 seconds offset, and that was enough to allow the device to enroll.

@korylprince
Copy link
Contributor Author

To recap our discussion of this PR during office hours today:

  • The issue is not to do with devices generating their own certificates (in both the normal and OTA enrollments, the device uses a SCEP cert)
  • The issue is because the device is generating pkcs7 signatures for requests, and clock skew between the device and the server is causing the signatures to be invalid
  • Because the issue seems to occur rarely, the clock skew is now behind an adjustable (default off) flag

@korylprince
Copy link
Contributor Author

After doing some testing, I'm less convinced that a configurable skew (or skew at all) is needed.

Despite adding a manual skew of several hours in both directions on the server, MicroMDM seems to have no issue verifying requests on /mdm/checkin. It does seem like MicroMDM is using this modified time, as the DEP API doesn't like being on the wrong time. This differs from the results @tgunz was seeing, but I'm, as of yet, unable to reproduce the error. I'd like to eventually dump the pkcs7 data to see what actually comes in, but that's an exercise for another day.

For now, I suggest we merge this PR, which I've rebased to only contain the go-kit error logging fix, as we know that's definitely an issue and an easy fix.

I've kept the skew flag code on this branch. If this does end up becoming a visible issue now that the error logging is fixed, we can address those changes again.

@jessepeterson jessepeterson merged commit 1f93237 into micromdm:main Apr 10, 2023
@jessepeterson
Copy link
Member

Now that this is merged I came across #406 again fwiw — sort of related to errors being hidden and such. 🤷

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants