Skip to content

nftables support for ipmasq and portmap#935

Merged
squeed merged 5 commits intocontainernetworking:mainfrom
danwinship:nftables
Sep 16, 2024
Merged

nftables support for ipmasq and portmap#935
squeed merged 5 commits intocontainernetworking:mainfrom
danwinship:nftables

Conversation

@danwinship
Copy link
Copy Markdown
Contributor

This adds nftables backends to the two remaining iptables-only bits in containernetworking/plugins; ipmasq (used by ptp and bridge) and portmap. For now, they are both set up to use iptables by default, unless you explicitly request nftables via the network config, or you run the plugin on a host that has nft installed but not iptables (eg, future RHEL 10 hosts).

The firewall plugin currently has "iptables" and "firewalld" backends. In theory it could have a separate explicit "nftables" backend, but I didn't do that here.

The ipmasq and portmap code implement nftables via https://github.com/danwinship/nftables, which is a library I started for the planned kube-proxy nftables backend, and also plan to use for various other nftables ports (eg, cri-o). It might move from danwinship/ to somewhere more generic at some point, but it might not.

After I started working on this branch, I discovered that the bridge plugin was actually already using nftables via pkg/link/spoofcheck.go, using the https://github.com/networkplumbing/go-nft library. Using two separate nftables libraries is probably not great, so we may want to fix that eventually. The libraries have somewhat similar APIs up to a point, then diverge completely because go-nft uses the JSON API to /sbin/nft and therefore has to use the (poorly-documented) JSON rule schema, while danwinship/nftables uses the plain text API to /sbin/nft and so uses "normal" nft rules. eg, compare:

func (sc *SpoofChecker) matchIfaceJumpToChainRule(chain, toChain string) *schema.Rule {
        return &schema.Rule{
                Family: schema.FamilyBridge,
                Table:  natTableName,
                Chain:  chain,
                Expr: []schema.Statement{
                        {Match: &schema.Match{
                                Op:    schema.OperEQ,
                                Left:  schema.Expression{RowData: []byte(`{"meta":{"key":"iifname"}}`)
},
                                Right: schema.Expression{String: &sc.iface},
                        }},
                        {Verdict: schema.Verdict{Jump: &schema.ToTarget{Target: toChain}}},
                },
                Comment: ruleComment(sc.refID),
        }
}

which in danwinship/nftables syntax would become:

func (sc *SpoofChecker) matchIfaceJumpToChainRule(chain, toChain string) *nftables.Rule {
        return &nftables.Rule{
                Chain: chain,
                Rule: nftables.Concat(
                        "iifname", sc.iface,
                        "jump", toChain,
                ),
                Comment: ruleComment(sc.refID),
        }
}

("Draft" PR because not fully-tested yet...)

@danwinship danwinship force-pushed the nftables branch 2 times, most recently from d6e2a46 to 60eb4e3 Compare July 31, 2023 14:18
@danwinship
Copy link
Copy Markdown
Contributor Author

  [FAILED] Unexpected error:
      <*errors.errorString | 0xc0003bdb20>: 
      running [/usr/sbin/iptables -t nat -D POSTROUTING -s 10.1.2.2 -j CNI-43a5a67926c1a665ff4c21b7 -m comment --comment name: "testConfig" id: "dummy-0" --wait]: exit status 2: iptables v1.8.7 (nf_tables): Chain 'CNI-43a5a67926c1a665ff4c21b7' does not exist

sigh, this is what coreos/go-iptables#107 was supposed to fix but I guess I fixed it for iptables-legacy but not iptables-nft?

(Was this not failing before? I'm not sure why this change would make the race condition suddenly start happening.)

Comment thread pkg/utils/netfilter.go Outdated
Comment thread pkg/utils/netfilter.go
Comment thread pkg/ip/ipmasq_linux.go Outdated
// SetupIPMasq installs iptables rules to masquerade traffic
// coming from ip of ipn and going outside of ipn
func SetupIPMasq(ipn *net.IPNet, chain string, comment string) error {
func SetupIPMasq(ipn *net.IPNet, pluginName, containerID string) error {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, right. Do I have to add a new API rather than changing the old one? I don't see anything about API stability guarantees...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not familiar with this codebase , let's ask @squeed

@danwinship danwinship force-pushed the nftables branch 2 times, most recently from c69144c to e48682c Compare May 25, 2024 12:45
@aojea
Copy link
Copy Markdown
Contributor

aojea commented May 27, 2024

linter fail, I see is still in draft, let me know if you want a review,

@danwinship danwinship marked this pull request as ready for review July 24, 2024 15:02
@danwinship
Copy link
Copy Markdown
Contributor Author

OK, finally updated and tested:

  • ip.SetupIPMasq() and ip.TeardownIPMasq() are now unchanged (and still always use iptables). Instead, I added a new SetupIPMasqForPlugin and TeardownIPMasqForPlugin that support both iptables and nftables.
  • The bridge and ptp plugins now use iptables by default, or nftables if iptables isn't installed or doesn't work. Or you can force the backend via "ipMasqBackend": "iptables" or "ipMasqBackend": "nftables" in the config.
  • Likewise, portmap now supports nftables, but still uses iptables by default, unless it's not available/usable, or you pass "backend": "nftables", or you set "conditionsV4" or "conditionsV6" to something that looks like nftables syntax rather than iptables.

(I guess this will need a docs update to the cni.dev repo?)

@danwinship
Copy link
Copy Markdown
Contributor Author

/assign @squeed

@danwinship
Copy link
Copy Markdown
Contributor Author

oh, and I could port this to use the same nftables library that spoofcheck uses, if you preferred.

or port the spoofcheck code to use knftables like kube-proxy (and Calico and soon some other stuff)

Comment thread go.mod Outdated
@squeed squeed self-requested a review August 19, 2024 15:07
@danwinship
Copy link
Copy Markdown
Contributor Author

danwinship commented Aug 26, 2024

or port the spoofcheck code to use knftables like kube-proxy

danwinship@17bb5355

Comment thread go.mod Outdated
Comment thread pkg/ip/ipmasq_nftables_linux.go Outdated

// hashForContainer returns a unique hash identifying the rules for this container with
// this plugin
func hashForContainer(pluginName, containerID string) string {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an error in the iptables implementation of this -- this needs to hash (network, ifname, containerid).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pluginName here is NetConf.Name, which is what you mean by network, right? (Yeah, pluginName is a bad variable name... I was assuming it meant NetConf.Type...)

Why do we need ifname?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need ifname?

Because it is allowed to add and remove the same container to the same network multiple times. In that case, the ifname is the only distinct identifier.

Comment thread pkg/ip/ipmasq_nftables_linux.go Outdated
}

func setupIPMasqNFTablesWithInterface(nft knftables.Interface, ipn *net.IPNet, pluginName, containerID string) error {
comment := hashForContainer(pluginName, containerID)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iptables implementation is wrong here -- we need to be able to find all masquerade rules for a given network. Can the comment include network name in plain text?

@squeed
Copy link
Copy Markdown
Member

squeed commented Aug 27, 2024

Looks basically good. I'd like to see GC support if possible, since it informs the design the API. Could you add that, even if the top-end API glue isn't there yet.

@danwinship danwinship force-pushed the nftables branch 2 times, most recently from a833549 to 043253e Compare August 29, 2024 13:49
Comment thread plugins/meta/portmap/main.go
Comment thread plugins/meta/portmap/portmap_test.go Outdated
Use `conditionsV4` and `conditionsV6` values that actually look like
valid iptables conditions.

Signed-off-by: Dan Winship <[email protected]>
@danwinship
Copy link
Copy Markdown
Contributor Author

@squeed this is ready for re-review btw; it includes an implementation of GC for nftables ipmasq, which isn't actually plumbed through, but at least gets unit tested.

@champtar
Copy link
Copy Markdown
Contributor

champtar commented Nov 6, 2024

I've started to play with this, found a regression (#1114 / PR #1117), and 2 bugs ( #1115 - PR #1116 / #1118 - PR #1120 )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants