
*: introduce wrapper pkgs for blockio and rdt #8066

Merged
fuweid merged 2 commits into containerd:main from fuweid:cleanup-blockio-init
Feb 13, 2023

Conversation

Member

@fuweid fuweid commented Feb 8, 2023

Before this patch, both RdtEnabled and BlockIOEnabled were provided by the services/tasks pkg. Since services/tasks is a plugin that can be initialized multiple times or concurrently, this fires a data race, as there is no mutex protecting enable.

This patch provides wrapper pkgs to use intel/{blockio,rdt} safely.

Signed-off-by: Wei Fu [email protected]

@fuweid fuweid force-pushed the cleanup-blockio-init branch from 852b448 to c12431e February 8, 2023 14:49
}

// SetConfig updates blockio config with a given config path.
func SetConfig(configFilePath string) error {
Member

This feels a bit weird to live in a separate location from the config.

  • IIUC, the mutex is to protect concurrency on the configFilePath.
  • That path is defined in a services/tasks.Config
  • But updating it now uses a package-level mutex

Would it be better to have accessors on services/tasks.Config which handle concurrency (and in that case, should the locking be per config, or global)? Perhaps something like https://github.com/moby/locker (or equivalent) if the locking needs to be per config file.

Member Author

Hi!

The issue I ran into is that the test cases use plugin init to set up in-memory containerd instances, and a data race occurs when they update the blockio/rdt configs at the same time.

result := p.Init(initContext)

IIUC, the mutex is to protect concurrency on the configFilePath.

I think it is not for the configFilePath. For example, SetConfigFromFile updates a package-level var, classBlockIO.

// https://github.com/intel/goresctrl/blob/cf1db7c9cfe529d718909767ebb3f0686f82d10f/pkg/blockio/blockio.go#L143-L145
// classBlockIO connects user-defined block I/O classes to
// corresponding cgroups blockio controller parameters.
var classBlockIO = map[string]cgroups.BlockIOParameters{}

Perhaps something like https://github.com/moby/locker (or equivalent) if the locking needs to be per config file.

Yes. I was thinking about that. The github.com/intel/goresctrl/pkg/blockio package doesn't allow us to handle it per config filepath, because it uses package-level vars 😞

That is why I use a package-level mutex to ensure that the config update is safe and the enabled accessor is goroutine-safe. Maybe we can file a proposal upstream to use an individual object instead of package-level vars, and then use moby/locker to protect concurrency on each individual config.

WDYT?

cc @klihub @mikebrow

Member

@klihub klihub Feb 10, 2023

cc @askervin @marquiz PTAL.

Member

@klihub klihub Feb 10, 2023

@fuweid I try to paraphrase the problem to see if I understood it correctly: Currently both the RDT and blockio bits in goresctrl implicitly assume that there is only a single configuration active in any process using goresctrl. They also expect that process to take care of any configuration update vs. usage serialization. This is normally true, but the containerd integration test setup where multiple test cases run independently and unaware of each other (currently) violates these assumptions.

Member Author

@klihub

Yes. If the containerd instance is a standalone process, there is only one config file and {blockio,rdt}/SetConfigFromFile is only called once, because there is only one services/tasks plugin instance.

But the containerd repo also provides a library mode, in which there may be several services/tasks plugin instances, for instance in the TestCRIImagePullTimeout cases.

for _, p := range loadedPlugins {
	initContext := plugin.NewContext(
		ctx,
		p,
		initialized,
		config.Root,
		config.State,
	)
	initContext.Events = events
	// load the plugin specific configuration if it is provided
	if p.Config != nil {
		pc, err := config.Decode(p)
		assert.NoError(t, err)
		initContext.Config = pc
	}
	result := p.Init(initContext)
	assert.NoError(t, initialized.Add(result))
	_, err := result.Instance()
	assert.NoError(t, err)
	lastInitContext = initContext
}

Since the two cases run in parallel, the following initBlockIO/initRdt functions are called from two goroutines. The enabled var is then accessed by both goroutines, and the test raises a data race.

func initBlockIO(configFilePath string) error {
	blockIOEnabled = false
	if configFilePath == "" {
		log.L.Debug("No blockio config file specified, blockio not configured")
		return nil
	}
	if err := blockio.SetConfigFromFile(configFilePath, true); err != nil {
		return fmt.Errorf("blockio not enabled: %w", err)
	}
	blockIOEnabled = true
	return nil
}

For safety, I use a wrapper package that provides the update-config function as a singleton. It doesn't break existing functionality.

Before this patch, both RdtEnabled and BlockIOEnabled were provided
by the services/tasks pkg. Since services/tasks is a plugin that can
be initialized multiple times or concurrently, this fires a data race,
as there is no mutex protecting `enable`.

This patch provides wrapper pkgs to use intel/{blockio,rdt} safely.

Signed-off-by: Wei Fu <[email protected]>
@fuweid fuweid force-pushed the cleanup-blockio-init branch from c12431e to 62df35d February 10, 2023 00:22
@fuweid fuweid requested review from klihub and thaJeztah February 10, 2023 00:25
@k8s-ci-robot

@klihub: GitHub didn't allow me to request PR reviews from the following users: askervin, marquiz.

Note that only containerd members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

/cc @askervin @marquiz

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Member

@mxpv mxpv left a comment


Minor nit, not a blocker.
LGTM

Comment thread pkg/blockio/blockio_default.go
@dmcgowan dmcgowan added this to the 1.7 milestone Feb 12, 2023
@fuweid fuweid merged commit 2654ece into containerd:main Feb 13, 2023
@fuweid fuweid deleted the cleanup-blockio-init branch February 13, 2023 06:05