Skip to content

[DRAFT]: Generic reconciler#28276

Closed
joamaki wants to merge 12 commits intocilium:mainfrom
joamaki:pr/joamaki/gen-reconciler
Closed

[DRAFT]: Generic reconciler#28276
joamaki wants to merge 12 commits intocilium:mainfrom
joamaki:pr/joamaki/gen-reconciler

Conversation

@joamaki
Copy link
Copy Markdown
Contributor

@joamaki joamaki commented Sep 26, 2023

Builds on top of #28140.

This implements a "generic reconciler" which takes the desired state as a StateDB table
and a "reconciliation target" that defines the reconciliation operations. It performs
fast incremental reconciliation of changes and periodic full reconciliation. Metrics
and health reporting are included.

pkg/reconciler/example has some rough early examples for sysctl and iptables.

By wrapping the reconciliation into a reusable generic utility we gain:

  • Single well-tested implementation for mission critical code
  • Uniform metrics, health and status reporting
  • Significant reduction in code duplication
  • The "concept" of reconciler that can be understood and inspected from outside

Summary of the API:

package reconciler

// Reconcilable objects carry status
type Reconcilable[Obj comparable] interface {
  comparable
  GetStatus() Status
  WithStatus(Status) Obj
}

// Reconciliation target defines the operations for reconciling a single
// object.
type Target[Obj Reconcilable[Obj]] interface {
  Init(context.Context) error
  Update(context.Context, Obj) error
  Delete(context.Context, Obj) error
  Sync(context.Context, statedb.Iterator[Obj]) (outOfSync bool, err error)
}

type params[Obj Reconcilable[Obj]] struct {
  cell.In

  // ... other dependencies ...

  // Dependencies that parametrize the reconciler for the concrete
  // use case:
  Config
  Target Target[Obj]
  StatusIndex statedb.Index[Obj, StatusKind]
}

// Register creates a reconciler and adds it to the application lifecycle.
// "cell.Invoke(reconciler.Register[MyObject])".
func Register[Obj Reconcilable[Obj]](p params[Obj]) { ... }
  

Usage:

package example

// Define the object that presents the desired state.
type MyObject struct { ... }
func (m *MyObject) GetStatus() reconciler.Status { ... }
func (m *MyObject) WithStatus(s reconciler.Status) *MyObject { /* copy and set status */ }
var statusIndex = reconciler.NewStatusIndex[*MyObject]()
var idIndex = statedb.Index[*MyObject, UUID]{...}

// Define a safe API for managing the desired state. Alternatively if it's obvious
// the RWTable[*MyObject] could be directly exposed.
// (TODO: Not sure at all if we want to call these sort of things managers)
type MyObjectManager struct {...}
func (m *MyObjectManager) Set(...) error {}
func (m *MyObjectManager) Get(...) error {}
func (m *MyObjectManager) Wait(context.Context) error { 
  // see example/sysctl.go for how we can wait for reconciliation.
}
func newMyObjectManager(db *statedb.DB, table statedb.RWTable[*MyObject) *MyObjectManager { ... }

var Cell = cell.Module(
  "my-module", 
  "There are many like it, but this one is mine",

  // Create the table for the desired state. Keep write access protected to this module.
  statedb.NewProtectedTableCell[*MyObject]("my-objects", idIndex, statusIndex),

  // Provide a public API for managing the desired state to the rest of the application.
  cell.Provide(newMyObjectManager),

  // Provide the reconciler dependencies, but scoped to just this module.
  cell.ProvidePrivate(
    func() reconciler.Target[*MyObject] { return &myObjectTarget{} },
    func() reconciler.Config { ...},
    func() statedb.Index[*MyObject, reconciler.StatusKind] { return statusIndex },
  ),

  // Create and register the reconciler to the application lifecycle. 
  cell.Invoke(reconciler.Register[*MyObject]),
)

type myObjectTarget struct { ... }
func (r *myObjectTarget) Init(context.Context) error { ... }
func (r *myObjectTarget) Update(context.Context, Obj) error { ... }
func (r *myObjectTarget) Delete(context.Context, Obj) error { ... }
func (r *myObjectTarget) Sync(context.Context, statedb.Iterator[Obj]) (outOfSync bool, err error) { .. }

Health reporting:

Status{ModuleID: iptables, Level: Degraded, Since: 3.000499163s ago, Message: Incremental reconciliation failure, Err: oops iptables error}

Metrics:

# HELP cilium_reconciler_full_duration_seconds Histogram over full reconciliation duration
# TYPE cilium_reconciler_full_duration_seconds histogram
cilium_reconciler_full_duration_seconds_bucket{module_id="sysctl",le="0.005"} 1
# ...
cilium_reconciler_full_duration_seconds_sum{module_id="sysctl"} 6.0364e-05
cilium_reconciler_full_duration_seconds_count{module_id="sysctl"} 1
# HELP cilium_reconciler_full_total Number of full reconciliations performed
# TYPE cilium_reconciler_full_total counter
cilium_reconciler_full_total{module_id="sysctl"} 1
# HELP cilium_reconciler_incremental_duration_seconds Histogram of per-operation duration during incremental reconciliation
# TYPE cilium_reconciler_incremental_duration_seconds histogram
cilium_reconciler_incremental_duration_seconds_bucket{module_id="iptables",le="0.005"} 4
# ...
cilium_reconciler_incremental_duration_seconds_sum{module_id="iptables"} 9.6464e-05
cilium_reconciler_incremental_duration_seconds_count{module_id="iptables"} 4
cilium_reconciler_incremental_duration_seconds_bucket{module_id="sysctl",le="0.005"} 14
# ...
cilium_reconciler_incremental_duration_seconds_sum{module_id="sysctl"} 0.0009456850000000001
cilium_reconciler_incremental_duration_seconds_count{module_id="sysctl"} 14
# HELP cilium_reconciler_incremental_errors_current The number of objects currently failing to be reconciled
# TYPE cilium_reconciler_incremental_errors_current gauge
cilium_reconciler_incremental_errors_current{module_id="iptables"} 1
cilium_reconciler_incremental_errors_current{module_id="sysctl"} 0
# HELP cilium_reconciler_incremental_errors_total Total number of errors encountered during incremental reconciliation
# TYPE cilium_reconciler_incremental_errors_total counter
cilium_reconciler_incremental_errors_total{module_id="iptables"} 4
cilium_reconciler_incremental_errors_total{module_id="sysctl"} 4
# HELP cilium_reconciler_incremental_total Number of incremental reconciliations performed
# TYPE cilium_reconciler_incremental_total counter
cilium_reconciler_incremental_total{module_id="iptables"} 4
cilium_reconciler_incremental_total{module_id="sysctl"} 15

@maintainer-s-little-helper
Copy link
Copy Markdown

Commit 8372713aeee883bda04ba3e913b30b127b0c1794 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Sep 26, 2023
@joamaki joamaki force-pushed the pr/joamaki/gen-reconciler branch from 0568840 to 73333ae Compare September 26, 2023 08:23
@maintainer-s-little-helper
Copy link
Copy Markdown

Commit 8372713aeee883bda04ba3e913b30b127b0c1794 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@joamaki joamaki force-pushed the pr/joamaki/gen-reconciler branch from 73333ae to 7f6c412 Compare September 26, 2023 08:31
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Sep 26, 2023
@joamaki joamaki changed the title [DRAFT]: Generic reconcilers [DRAFT]: Generic reconciler Sep 26, 2023
To allow control over where a table can be modified, split the Table[Obj]
interface into reader (Table[Obj]) and writer (RWTable[Obj]) parts.
RWTable[Obj] is a superset of Table[Obj], and can thus be used everywhere
where a Table[Obj] is required.

For convenience, implement NewProtectedTableCell() to provide the
RWTable[Obj] only in the hive module's scope, and Table[Obj] globally.

With this, the cell defining the table can choose to either not allow modifications
at all to the table (tests can still go via NewTable() to create a modifiable table),
or only allow modifications under very controller and validated ways.

Signed-off-by: Jussi Maki <[email protected]>
Make RWTable[*Device] and RWTable[*Route] private to the devices
controller to make sure they're not modified by anyone else.

The DeviceTableCell and RouteTableCell are still available in the
tables package for tests to use.

Signed-off-by: Jussi Maki <[email protected]>
These allow efficiently doing optimistic concurrency control in
cases where a long-running computation is performed against the
data and then results written back to a table, e.g.

 1. With ReadTxn compute results for each object (with revision number)
 2. With WriteTxn commit the results using CompareAndDelete. If the
    object has changed, the stale result for that object is ignored.
 3. Wait for changes and repeat

Signed-off-by: Jussi Maki <[email protected]>

Refresh of statedb-split-write-methods
Benchmark the delay and throughput for propagating changes from one table
to another. This simulates the expected use-case for statedb where we would
be reading from one or more tables and computing desired state that is written
to another table.

Results on my machine:
  goos: linux
  goarch: amd64
  pkg: github.com/cilium/cilium/pkg/statedb
  cpu: AMD Ryzen 9 5950X 16-Core Processor
  BenchmarkDB_PropagationDelay 	 224287	     5071 ns/op
  --- BENCH: BenchmarkDB_PropagationDelay
      benchmarks_test.go:349: Propagation delay (N: 224287) 50p: 37.12µs, 90p: 91.633µs, 99p: 123.924µs

On a single core, 197k objects per second with batch size of 10. For batch size of 1
I got 61k objects per second, so even small increase in batch size oversets the
cost of waiting for the watch channel to close. Propagating the batch
took ~123 microseconds (99 percentile).

Signed-off-by: Jussi Maki <[email protected]>
Having access to the module identifier is useful for writing
generic cells that are meant to be embedded into a module.

This provides the module identifier as 'cell.ModuleId' scoped
to the module.

Signed-off-by: Jussi Maki <[email protected]>
Signed-off-by: Jussi Maki <[email protected]>
Signed-off-by: Jussi Maki <[email protected]>
@joamaki joamaki force-pushed the pr/joamaki/gen-reconciler branch from 9cea8ff to 5361786 Compare October 4, 2023 13:03
@joamaki joamaki force-pushed the pr/joamaki/gen-reconciler branch from c6f3bf7 to 550cfd8 Compare October 4, 2023 17:12
The device name hacks in DesiredRoute makes it impossible to compare actual
and desired routes. Instead resolve index in InsertLegacy/DeleteLegacy.

Signed-off-by: Jussi Maki <[email protected]>
Copy link
Copy Markdown
Member

@aanm aanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed on a per-commit basis.

For some of my comments please leave the reply in the source code as well since people that will use the code examples as a base won't have the same questions.

Comment on lines +35 to +42
SysctlKeyIndex = statedb.Index[*SysctlSetting, string]{
Name: "key",
FromObject: func(s *SysctlSetting) index.KeySet {
return index.NewKeySet(index.String(s.Key))
},
FromKey: index.String,
Unique: true,
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is an example, can we have comments for each field of this struct initialization for better understanding?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Also made me realize that the Index struct itself needs much better docs.

FromKey: index.String,
Unique: true,
}
SysctlStatusIndex = reconciler.NewStatusIndex[*SysctlSetting]()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also have a comment of why we need an "indexer"? Or why do we need this indexer?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes definitely! This is still a very messy draft so many things like that missing. I’ll make sure whatever the final ”canonical example” ends up being will include these.

Comment on lines +146 to +147
// afterwards. If the object has changed in the meanwhile the status update for
// the old object is skipped.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reconciliation status is per object does this mean that if the commit fails it will fail for all objects or just for the object that has changed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the object. CompareAndSwap does nothing if the revision doesn’t match


var err error
if status.Delete {
err = r.Target.Delete(obj)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment in this line stating that this is the business logic for "deletes". It was not obvious on a first sight.

updateResults[obj] = result{rev, StatusError(true, err)}
}
} else {
err = r.Target.Update(obj)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment in this line stating that this is the business logic for "updates". It was not obvious on a first sight.

}

func modifyIPTables(db *statedb.DB, t statedb.RWTable[*Rule]) {
go func() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why a seperate go routine?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because one can’t block in Invoke. I’ll make this example proper by using pkg/hive/job for these.

Comment on lines +65 to +69
// TODO:
// - "iptables-save" and parse? Or "-C" on each rule?
// - Check that cilium chains exist
// - Find unexpected rules in cilium chains?
return false, nil
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I was really curious to see this one 😅

Comment on lines +88 to +90
//
// Rule
//
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to define all types of "rules" "jump"? Can't the state be the full iptables command? For example, the desired state is -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE and the realized state is basically a execution of iptables -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE.

Copy link
Copy Markdown
Contributor Author

@joamaki joamaki Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I shouldn’t have left this very unfinished example in as there’s too many open questions in it..

var errs []error

for desired, _, ok := iter.Next(); ok; desired, _, ok = iter.Next() {
actual, _, ok := t.routes.First(txn, tables.RouteIDIndex.Query(desired.RouteID()))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why "First"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we’re only interested in the first match. StateDB supports both unique and non-unique indexes and doesn’t differentiate the query API for these.

@@ -0,0 +1,419 @@
package main
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The routes reconciler seems to be the hardest to understand because I couldn't find the "reconciliation" logic to fetch the realized state and do the comparison between the desired and the realized state. The sysctl example was easier on this aspect.

Copy link
Copy Markdown
Contributor Author

@joamaki joamaki Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look again, it’s looking up the ”actual” route from the route table maintained by the devices controller (which pulls routes over netlink). This way it can efficiently reconcile a lot of routes without having to do individual syscalls to fetch routes one by one.

But indeed, it makes it harder to follow as it depends on this outside realized state table that’s populated elsewhere. The final version will need to document this well. Especially how it deals with acting on potentially stale ”realized state” data (that’s where full reconciliation is needed).

- Replace Sync with Update&Prune. This allows reusing Update() so most reconcilers
  don't have to implement things twice. Prune for most reconcilers is a no-op.
- Use string for error in Status as 'error' doesn't serialize. Likely should fix
  with a custom JSON marshalling instead, so that Status.Error can be used with e.g.
  errors.Is.
- Update desired object status also in full reconciliation

Signed-off-by: Jussi Maki <[email protected]>
@joamaki joamaki force-pushed the pr/joamaki/gen-reconciler branch from 550cfd8 to e88757c Compare October 13, 2023 09:57
@github-actions
Copy link
Copy Markdown

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Nov 13, 2023
@github-actions
Copy link
Copy Markdown

This pull request has not seen any activity since it was marked stale.
Closing.

@github-actions github-actions bot closed this Nov 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants