Skip to content

felix: ensure vxlan udp flows are not tracked in conntrack#8977

Merged
caseydavenport merged 1 commit intoprojectcalico:masterfrom
cyclinder:vxlan_notrack
Jul 30, 2024
Merged

felix: ensure vxlan udp flows are not tracked in conntrack#8977
caseydavenport merged 1 commit intoprojectcalico:masterfrom
cyclinder:vxlan_notrack

Conversation

@cyclinder
Copy link
Copy Markdown
Contributor

@cyclinder cyclinder commented Jul 4, 2024

Description

felix: ensure vxlan udp flows are not tracked in conntrack

Related issues/PRs

fixes #8934

Todos

  • Tests
  • Documentation
  • Release note

Release Note

Felix now arranges for VXLAN packets to skip netfilter conntrack. VXLAN uses pseudo random source ports so the "flows" are unidirectional and not meaningful to conntrack.

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

@cyclinder cyclinder requested a review from a team as a code owner July 4, 2024 04:02
@marvin-tigera marvin-tigera added this to the Calico v3.29.0 milestone Jul 4, 2024
@marvin-tigera marvin-tigera added release-note-required Change has user-facing impact (no matter how small) docs-pr-required Change is not yet documented labels Jul 4, 2024
Copy link
Copy Markdown
Member

@fasaxc fasaxc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @cyclinder! I spotted one issue (there's a dedicated action for NOTRACK)

// we're policing.
}

// ensure VXLAN UDP Flows are not tracked in conntrack
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// ensure VXLAN UDP Flows are not tracked in conntrack
// Ensure VXLAN UDP Flows are not tracked in conntrack.

@cyclinder cyclinder force-pushed the vxlan_notrack branch 2 times, most recently from fe77f9f to 44ad905 Compare July 5, 2024 06:44
@cyclinder
Copy link
Copy Markdown
Contributor Author

All the unit tests are green.

root@10-20-1-200:/home/cyclinder/calico# go test -v ./felix/rules
=== RUN   TestRules
Running Suite: Rules Suite
==========================
Random Seed: 1720161858
Will run 506 of 506 specs

••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••
JUnit report was created: /home/cyclinder/calico/felix/report/rules_suite.xml

Ran 506 of 506 Specs in 0.301 seconds
SUCCESS! -- 506 Passed | 0 Failed | 0 Pending | 0 Skipped

You're using deprecated Ginkgo functionality:
=============================================
Ginkgo 2.0 is under active development and will introduce several new features, improvements, and a small handful of breaking changes.
A release candidate for 2.0 is now available and 2.0 should GA in Fall 2021.  Please give the RC a try and send us feedback!
  - To learn more, view the migration guide at https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md
  - For instructions on using the Release Candidate visit https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md#using-the-beta
  - To comment, chime in at https://github.com/onsi/ginkgo/issues/711

  You are using a custom reporter.  Support for custom reporters will likely be removed in V2.  Most users were using them to generate junit or teamcity reports and this functionality will be merged into the core reporter.  In addition, Ginkgo 2.0 will support emitting a JSON-formatted report that users can then manipulate to generate custom reports.

  If this change will be impactful to you please leave a comment on https://github.com/onsi/ginkgo/issues/711
  Learn more at: https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md#removed-custom-reporters

To silence deprecations that can be silenced set the following environment variable:
  ACK_GINKGO_DEPRECATIONS=1.16.5

--- PASS: TestRules (0.45s)
PASS
ok  	github.com/projectcalico/calico/felix/rules	0.496s

@lwr20
Copy link
Copy Markdown
Member

lwr20 commented Jul 5, 2024

/sem-approve

Copy link
Copy Markdown

@adkafka adkafka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Looks great to me, thanks for making the PR @cyclinder.

@fasaxc
Copy link
Copy Markdown
Member

fasaxc commented Jul 9, 2024

Tests are failing due to a typo in the iptables rule...

-dport

should be

--dport

}

func (m MatchCriteria) DestPort(port uint16) MatchCriteria {
return append(m, fmt.Sprintf("-dport %v", port))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return append(m, fmt.Sprintf("-dport %v", port))
return append(m, fmt.Sprintf("--dport %v", port))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! updated.

@fasaxc
Copy link
Copy Markdown
Member

fasaxc commented Jul 9, 2024

Sorry @cyclinder , a big refactor PR just landed to add nftables support to Felix so this will need to be rebased on top of that. The biggest change that'll affect you is that the match criteria builder is now abstracted to support iptables or nftables so you'll need to add the new port match to both.

@cyclinder
Copy link
Copy Markdown
Contributor Author

hi @fasaxc, I'm not sure we need to add such a vxlan notrack rule for nftables as well.

@adkafka
Copy link
Copy Markdown

adkafka commented Jul 10, 2024

hi @fasaxc, I'm not sure we need to add such a vxlan notrack rule for nftables as well.

I believe it is needed for nftables as well. nftables also uses conntrack under the hood based on their docs (https://wiki.nftables.org/wiki-nftables/index.php/Connection_Tracking_System).

@fasaxc
Copy link
Copy Markdown
Member

fasaxc commented Jul 11, 2024

Yes, nftables works the same way as iptables at least in our "v1". Later we may take advantage of some more advanced nftables features.

@cyclinder cyclinder force-pushed the vxlan_notrack branch 2 times, most recently from bda6d31 to 740089e Compare July 12, 2024 02:03
@cyclinder
Copy link
Copy Markdown
Contributor Author

Thanks for the explanation! I just fixed the conflicts, and all the tests have passed. please recheck it.

@cyclinder
Copy link
Copy Markdown
Contributor Author

Hi, Could you please take a look?

/cc @fasaxc @lwr20

@fasaxc
Copy link
Copy Markdown
Member

fasaxc commented Jul 22, 2024

/sem-approve

Copy link
Copy Markdown
Member

@fasaxc fasaxc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One suggestion. We should probably only write the rules if we're rendering the right IP tables version. For example,. user might have VXLAN v4 enabled, but not v6. In which case we don't want VXLAN rules for v6 since the user may be using VXLANv6 outside of Calico.

@fasaxc
Copy link
Copy Markdown
Member

fasaxc commented Jul 22, 2024

/sem-approve

@fasaxc
Copy link
Copy Markdown
Member

fasaxc commented Jul 22, 2024

LGTM, thanks @cyclinder !

@fasaxc fasaxc removed the docs-pr-required Change is not yet documented label Jul 22, 2024
@fasaxc fasaxc added the docs-not-required Docs not required for this change label Jul 22, 2024
@adkafka
Copy link
Copy Markdown

adkafka commented Jul 30, 2024

@fasaxc @cyclinder I'm excited to get this PR out; thanks both for the work! Anything blocking us from merging this PR? I'd love to have it in the next Calico release. My apologies if I'm jumping the gun here and missing some process. My understanding is that the next step is merging, based on these docs:

1. Once your PR has been approved and the commits have been squashed, your reviewer will merge the PR. If you have the necessary permissions, you may merge the PR yourself.

@caseydavenport caseydavenport merged commit a3667f8 into projectcalico:master Jul 30, 2024
BenoitKnecht added a commit to BenoitKnecht/cilium that referenced this pull request Apr 7, 2025
VXLAN traffic tends to create a lot of `conntrack` entries (> 100k),
because each node needs to open connections to every node in the
cluster.

Also, VXLAN doesn't operate with stateful flows (i.e. nodes send traffic
to port 8472, they don't "reply" to packets based on the other node's
source port), so each entry is in the `UNREPLIED` state, making them
useless in the context of a stateful firewall.

This commit adds `iptables` rules to prevent VXLAN traffic from being
added to `conntrack`. This doesn't remove any functionality, but avoids
overflowing `conntrack` with useless entries.

For what it's worth, Calico is doing the same in
projectcalico/calico#8977.

Signed-off-by: Benoît Knecht <[email protected]>
BenoitKnecht added a commit to BenoitKnecht/cilium that referenced this pull request Apr 30, 2025
VXLAN traffic tends to create a lot of `conntrack` entries (> 100k),
because each node needs to open connections to every node in the
cluster.

Also, VXLAN doesn't operate with stateful flows (i.e. nodes send traffic
to port 8472, they don't "reply" to packets based on the other node's
source port), so each entry is in the `UNREPLIED` state, making them
useless in the context of a stateful firewall.

This commit adds `iptables` rules to prevent VXLAN traffic from being
added to `conntrack`. This doesn't remove any functionality, but avoids
overflowing `conntrack` with useless entries.

For what it's worth, Calico is doing the same in
projectcalico/calico#8977.

Signed-off-by: Benoît Knecht <[email protected]>
BenoitKnecht added a commit to BenoitKnecht/cilium that referenced this pull request Aug 12, 2025
VXLAN traffic tends to create a lot of `conntrack` entries (> 100k),
because each node needs to open connections to every node in the
cluster.

Also, VXLAN doesn't operate with stateful flows (i.e. nodes send traffic
to port 8472, they don't "reply" to packets based on the other node's
source port), so each entry is in the `UNREPLIED` state, making them
useless in the context of a stateful firewall.

This commit adds `iptables` rules to prevent VXLAN traffic from being
added to `conntrack`. This doesn't remove any functionality, but avoids
overflowing `conntrack` with useless entries.

For what it's worth, Calico is doing the same in
projectcalico/calico#8977.

Signed-off-by: Benoît Knecht <[email protected]>
joestringer pushed a commit to BenoitKnecht/cilium that referenced this pull request Aug 21, 2025
VXLAN traffic tends to create a lot of `conntrack` entries (> 100k),
because each node needs to open connections to every node in the
cluster.

Also, VXLAN doesn't operate with stateful flows (i.e. nodes send traffic
to port 8472, they don't "reply" to packets based on the other node's
source port), so each entry is in the `UNREPLIED` state, making them
useless in the context of a stateful firewall.

This commit adds `iptables` rules to prevent VXLAN traffic from being
added to `conntrack`. This doesn't remove any functionality, but avoids
overflowing `conntrack` with useless entries.

For what it's worth, Calico is doing the same in
projectcalico/calico#8977.

Signed-off-by: Benoît Knecht <[email protected]>
BenoitKnecht added a commit to BenoitKnecht/cilium that referenced this pull request Aug 21, 2025
Tunnel traffic tends to create a lot of `conntrack` entries (> 100k),
because each node needs to open connections to every node in the
cluster.

Also, VXLAN/Geneve don't operate with stateful flows (i.e. nodes send
traffic to port 8472/6081, they don't "reply" to packets based on the
other node's source port), so each entry is in the `UNREPLIED` state,
making them useless in the context of a stateful firewall.

This commit adds `iptables` rules to prevent tunnel traffic from being
added to `conntrack`. This doesn't remove any functionality, but avoids
overflowing `conntrack` with useless entries.

For what it's worth, Calico is doing the same in
projectcalico/calico#8977.

Signed-off-by: Benoît Knecht <[email protected]>
github-merge-queue bot pushed a commit to cilium/cilium that referenced this pull request Aug 22, 2025
Tunnel traffic tends to create a lot of `conntrack` entries (> 100k),
because each node needs to open connections to every node in the
cluster.

Also, VXLAN/Geneve don't operate with stateful flows (i.e. nodes send
traffic to port 8472/6081, they don't "reply" to packets based on the
other node's source port), so each entry is in the `UNREPLIED` state,
making them useless in the context of a stateful firewall.

This commit adds `iptables` rules to prevent tunnel traffic from being
added to `conntrack`. This doesn't remove any functionality, but avoids
overflowing `conntrack` with useless entries.

For what it's worth, Calico is doing the same in
projectcalico/calico#8977.

Signed-off-by: Benoît Knecht <[email protected]>
bersoare pushed a commit to bersoare/cilium that referenced this pull request Oct 27, 2025
Tunnel traffic tends to create a lot of `conntrack` entries (> 100k),
because each node needs to open connections to every node in the
cluster.

Also, VXLAN/Geneve don't operate with stateful flows (i.e. nodes send
traffic to port 8472/6081, they don't "reply" to packets based on the
other node's source port), so each entry is in the `UNREPLIED` state,
making them useless in the context of a stateful firewall.

This commit adds `iptables` rules to prevent tunnel traffic from being
added to `conntrack`. This doesn't remove any functionality, but avoids
overflowing `conntrack` with useless entries.

For what it's worth, Calico is doing the same in
projectcalico/calico#8977.

Signed-off-by: Benoît Knecht <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs-not-required Docs not required for this change release-note-required Change has user-facing impact (no matter how small)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Felix should configure iptables rules such that VXLAN UDP Flows are not tracked in conntrack, when in VXLAN mode

6 participants