Skip to content

Conversation

@ambarve
Copy link
Contributor

@ambarve ambarve commented Aug 17, 2021

Currently LCOW UVM doesn't run any time synchronization service which causes a small amount of time lag after running the UVM for a long time (~10 days).

This change updates OpenGCS to start chronyd daemon as soon as gcs is started. It also adds a new annotation which can be used to disable this service inside the UVM.

@ambarve ambarve requested a review from a team as a code owner August 17, 2021 22:54
@ambarve
Copy link
Contributor Author

ambarve commented Aug 17, 2021

Do not merge this before we make the required changes to use the updated LCOW drop.

ptpDevPath := filepath.Join("/dev", filepath.Base(ptpDirPath))
chronydConfigString := fmt.Sprintf("refclock PHC %s poll 3 dpoll -2 offset 0\n", ptpDevPath)

chronydConfPath := "/tmp/chronyd.conf"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well have this as a global const

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can hurt readability. If you're reading through code and see a global, you suddenly have to do extra work to figure out where it's being used. If it's only needed in this function then I would just keep it scoped to this function, though it could be moved to a const block in this function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const in this function is fine with me. Global const in the communutils package might be a bit odd for this anyways, but truthfully maybe this entire function doesn't belong in communtils looking again

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh you had the same feedback here haha #1119 (comment)

Copy link
Contributor

@dcantah dcantah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

return errors.Wrap(err, "start chronyd command failed")
}
go func() {
waitErr := chronydCmd.Wait()
Copy link
Member

@kevpar kevpar Aug 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to handle cases like restarting this if it crashes or is OOM killed?

@kevpar
Copy link
Member

kevpar commented Aug 18, 2021

Do not merge this before we make the required changes to use the updated LCOW drop.

In the future I think we can use a draft PR for cases like this.

@kevpar
Copy link
Member

kevpar commented Aug 18, 2021

I wonder if there's any way we could run chronyd as a privileged infra container, and avoid needing to do this extra work in the gcs?

@ambarve ambarve marked this pull request as draft August 25, 2021 18:04
@ambarve
Copy link
Contributor Author

ambarve commented Oct 26, 2021

@kevpar can you PTAL? (LCOW drop isn't ready yet but it would be good to get this ready to merge by the time we have it)

@ambarve ambarve marked this pull request as ready for review October 26, 2021 04:49
Changes to the opengcs to start the chronyd service after UVM boots.

Signed-off-by: Amit Barve <[email protected]>
Add test to verify both chronyd running & disabled cases. Minor fixes in chronyd startup
code.

Signed-off-by: Amit Barve <[email protected]>
Signed-off-by: Amit Barve <[email protected]>
Signed-off-by: Amit Barve <[email protected]>
cmd/gcs/main.go Outdated
}
// Expected clock name is `hyperv` so read first 6 chars and verify the
// name
buf := make([]byte, len(expectedClockName))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is hyperv the entire contents of the clock_name file? If so may be easier to just use ioutil.ReadFile here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The file we are looking for has only "hyperv" in it. But we loop over a list of directories and check the contents of a file named clock_name in each of those. We don't know what are the contents of other files. Doing ioutil.ReadFile on such files could take too much time and/or memory for no reason. That's why I followed this approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems okay, though if there is ever a new clock device added with clock_name of hyperv_newdevicedontuse then we could break. Maybe we should check the file size is what we expect too?

My suggestion to use ioutil.ReadFile was based on an assumption that clock names are probably going to be short. If we think that's safe to rely on then we could still take this approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed this to use ioutil.ReadFile now. There is a small catch though, we now must also look for a new line at the end of the clock name. I have updated the expectedClockName accordingly.

Signed-off-by: Amit Barve <[email protected]>
Copy link
Member

@kevpar kevpar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@dcantah dcantah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ambarve ambarve merged commit db9908f into microsoft:master Nov 12, 2021
princepereira pushed a commit to princepereira/hcsshim that referenced this pull request Aug 29, 2024
Start time synchronization service in opengcs

Changes to the opengcs to start the chronyd service after UVM boots.

Signed-off-by: Amit Barve <[email protected]>

* 

Signed-off-by: Amit Barve <[email protected]>

* TimeSync service inside LCOW UVM.

Add test to verify both chronyd running & disabled cases. Minor fixes in chronyd startup
code.

Signed-off-by: Amit Barve <[email protected]>

* Run Chronyd with restart monitor

Signed-off-by: Amit Barve <[email protected]>

* Force chronyd to step update time if difference is big

Signed-off-by: Amit Barve <[email protected]>

* Fixes after rebase

Signed-off-by: Amit Barve <[email protected]>

* go mod vendor & tidy

Signed-off-by: Amit Barve <[email protected]>

* Use backoff package instead of manually calculating backoffs

Signed-off-by: Amit Barve <[email protected]>

* Rename gcs cmdline params, use io.ReadFull instead of io.Read

Minor other fixes.

Signed-off-by: Amit Barve <[email protected]>

* go mod vendor

Signed-off-by: Amit Barve <[email protected]>

* Ignore err if file doesn't exist

Signed-off-by: Amit Barve <[email protected]>

* Use ioutil.ReadFile to read clock_name file

Signed-off-by: Amit Barve <[email protected]>

* minor fix

Signed-off-by: Amit Barve <[email protected]>

* Remove incorrect usage of backoff.MaxElapsedTime

Signed-off-by: Amit Barve <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants