Skip to content

Enhance supported patterns to include ** matching multilevel of subdomains#28690

Closed
pjablonski123 wants to merge 1 commit intocilium:mainfrom
pjablonski123:pj-fqdn
Closed

Enhance supported patterns to include ** matching multilevel of subdomains#28690
pjablonski123 wants to merge 1 commit intocilium:mainfrom
pjablonski123:pj-fqdn

Conversation

@pjablonski123
Copy link
Copy Markdown

@pjablonski123 pjablonski123 commented Oct 18, 2023

Adding ** to match substrings as a prefix including subdomains.

So far subdomains required a definition of all subdomains by adding a chain of star and dot.

Example:
rules:
  dns:
    - matchPattern: '*.cluster.local'
    - matchPattern: '*.*.cluster.local'
    - matchPattern: '*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.internal'
    - matchPattern: '*.*.*.*.internal'
    - matchPattern: '*.*.*.*.*.internal'
    - matchPattern: '*.*.*.*.*.*.internal'
    - matchPattern: '*.*.*.*.*.*.*.internal'
    - matchPattern: '*.*.*.*.*.*.*.*.internal'
    - matchPattern: '*.*.*.*.*.*.*.*.*.internal'
    - matchPattern: '*.*.internal.cloudapp.net'
    - matchPattern: '*.*.*.internal.cloudapp.net'
    - matchPattern: '*.*.*.*.internal.cloudapp.net'
    - matchPattern: '*.*.*.*.*.internal.cloudapp.net'
    - matchPattern: '*.*.*.*.*.*.internal.cloudapp.net'
    - matchPattern: '*.*.*.*.*.*.*.internal.cloudapp.net'
    - matchPattern: '*.*.*.*.*.*.*.*.internal.cloudapp.net'

A feature enhancement introduces wildcards where ** would represent 1 or more DNS subdomains. That would allow them to compress the above policy to:

rules:
  dns:
    - matchPattern: '**.cluster.local'
    - matchPattern: '**.internal'
    - matchPattern: '**.internal.cloudapp.net'

Th code modification does not change the old behaviour if someone wants to list subdomains explicitly.

For example the policy:
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: fqdn-policy
spec:
  endpointSelector: {}
  egress:
    - toFQDNs:
      - matchPattern: "*pl"
      - matchPattern: "*.io"
      - matchPattern: "**com"
      - matchPattern: "**.amazonaws.com"
    - toPorts:
      - ports:
         - port: "53"
           protocol: UDP
        rules:
          dns:
            - matchPattern: "*"
         
produces the following result:
Forbids domain.pl as expected by the old behaviour.
Allows cilium.io as expected by the old behaviour.
Allows google.com as expected by a new behaviour.
Allows elasticbeanstalk-eu-west-1-123456789000.s3-eu-west-1.amazonaws.com as expected by a new behaviour.

Release note:

Enhance supported patterns to include ** matching multilevel of subdomains.

Instead of the rules with listed subdomains:

rules:
  dns:
    - matchPattern: '*.cluster.local'
    - matchPattern: '*.*.cluster.local'
    - matchPattern: '*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.*.*.*.cluster.local'

the following policy can be used:

rules:
  dns:
    - matchPattern: '**.cluster.local'
 

Signed-off-by: Piotr Jablonski [email protected]

Fixes: #22081

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 18, 2023
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Oct 18, 2023
@pjablonski123 pjablonski123 marked this pull request as ready for review October 19, 2023 06:08
@pjablonski123 pjablonski123 requested review from a team as code owners October 19, 2023 06:08
Copy link
Copy Markdown
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review the code but I just wanted to reshare the concern from the issue which we need to convince ourselves is OK: #22081 (comment)

@pjablonski123
Copy link
Copy Markdown
Author

pjablonski123 commented Oct 20, 2023

Thank you @joestringer for matching this PR with a previous considerations which should be included here.

Regarding a subdomain hijacking this PR does not make it easier comparing to the *. (asterisk dot) notation. Typically hijacked subdomains are at the first subdomain level of the FQDN: [subdomain]. [domainname]. [tld].

Both following rules:

  • matchPattern: "*.example.com"
  • matchPattern: "**.example.com"
    contribute in the same way to hijacking of subdomain.example.com.

If it is required to block other-level subdomains then the current notation "*.example.com" can be used.
In cases where more subdomains must be allowed then the chain of *. must be added allowing the attack anyway.

FQDNs can be attacked in any case and a proper protection should be provided by e.g. auth codes.

@joestringer
Copy link
Copy Markdown
Member

After security review with @ferozsalam , we concluded that this does slightly increase the risk of user clusters becoming compromised in cases where target domains are managed by a third party and particularly in cases where the third party delegates the responsibility for DNS subdomains across multiple administrators within that organization. It's difficult for someone writing a firewall rule to estimate that risk. That said, we also came to the conclusion that limiting the matching feature to a single level of subdomains is probably not the right way to mitigate the risk.

We plan on coming up with some improved guidance for the feature in order to ensure users have the opportunity to be properly educated regarding the security impact of this feature.

With that, I think that the next step is review and testing. I don't see any tests being proposed here, can you add some unit tests to validate the new regexes? Some of the corner cases might be things like proving that the . inside the new regex only matches the . character and not arbitrary characters, as well as checking that multiple levels of domains are matched as expected, as well as cases like the use of a regex that only has the content **. (Probably this makes no sense, allowing ** can be better expressed in other ways. Should we reject that case?) Perhaps you have some other corner cases in mind as well that would be worth validating.

@joestringer
Copy link
Copy Markdown
Member

CI also complained about unformatted code:

./pkg/fqdn/matchpattern/matchpattern.go
diff ./pkg/fqdn/matchpattern/matchpattern.go.orig ./pkg/fqdn/matchpattern/matchpattern.go
--- ./pkg/fqdn/matchpattern/matchpattern.go.orig
+++ ./pkg/fqdn/matchpattern/matchpattern.go
@@ -102,14 +102,14 @@
 	// base case. "." becomes a literal .
 	pattern = strings.Replace(pattern, ".", "[.]", -1)
 
-        // ** becomes [-a-zA-Z0-9_.]#, # to bypass the next line
+	// ** becomes [-a-zA-Z0-9_.]#, # to bypass the next line
 	pattern = strings.Replace(pattern, "**", enhancedDNSCharsREGroup+"#", -1)
-	
+
 	// base case. * becomes .*, but only for DNS valid characters
 	// NOTE: this only works because the case above does not leave the *
 	pattern = strings.Replace(pattern, "*", allowedDNSCharsREGroup+"*", -1)
 
-        // # becomes *
+	// # becomes *
 	pattern = strings.Replace(pattern, "#", "*", -1)
 	return pattern
 }
make: *** [Makefile:786: precheck] Error 1
make: *** Waiting for unfinished jobs....

You may consider using gofmt, either directly or integrated into your editor, to catch such issues. make precheck locally should reproduce the report from CI.

@joestringer joestringer added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. labels Oct 24, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Oct 24, 2023
@maintainer-s-little-helper
Copy link
Copy Markdown

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Oct 25, 2023
@pjablonski123
Copy link
Copy Markdown
Author

Added and run unit tests.

pj->matchpattern:~$ go test -v
=== RUN   Test
=== RUN   Test/MatchPatternTestSuite
=== RUN   Test/MatchPatternTestSuite/TestAnchoredMatchPatternMatching
=== RUN   Test/MatchPatternTestSuite/TestAnchoredMatchPatternREConversion
=== RUN   Test/MatchPatternTestSuite/TestMatchPatternSanitize
=== RUN   Test/MatchPatternTestSuite/TestUnAnchoredMatchPatternREConversion
--- PASS: Test (0.00s)
    --- PASS: Test/MatchPatternTestSuite (0.00s)
        --- PASS: Test/MatchPatternTestSuite/TestAnchoredMatchPatternMatching (0.00s)
        --- PASS: Test/MatchPatternTestSuite/TestAnchoredMatchPatternREConversion (0.00s)
        --- PASS: Test/MatchPatternTestSuite/TestMatchPatternSanitize (0.00s)
        --- PASS: Test/MatchPatternTestSuite/TestUnAnchoredMatchPatternREConversion (0.00s)
PASS
ok  	github.com/cilium/cilium/pkg/fqdn/matchpattern	0.648s

@pjablonski123 pjablonski123 requested review from a team as code owners October 25, 2023 14:23
@tklauser
Copy link
Copy Markdown
Member

tklauser commented Dec 1, 2023

EDIT: In response to your edit - I am pretty strongly opposed to adding underspecified features to the policy - 'just' documenting it won't convince me to ack this, but happy to be overruled by a maintainer or more senior engineers.

I agree and I think this feature warrants wider discussion based on a well-specified proposal, e.g. in form of a CFP. Instead of going back and forth on the specifics of the regex that is used to implement the check, I think we should take a step back here and write up a proposal with the intended behavior of the new ** pattern and consider potential backwards-compatibility issues with the existing * patterns. That proposal could then be discussed in the community meeting and/or the #sig-policy Slack channel.

The quote you are referring to is related to the initial code where there is a case of **cilium.io. We agreed with @bimmlerd in this thread to only allow** at the beginning of the domain name. Which is reflected in the code now.

In my statement, I wasn't referring to any particular revision of the code. I still think this is a change that warrants wider discussion and documentation (e.g. also in terms of security implications) in a structured document.

If in the previous Cilium releases two or more asterisks were allowed it means that unsupported syntax was not filtered out which is done now.

Isn't this still a breaking change if two or more asterisks were allowed previously and now it is rejected? Even if it was wrong and unsupported it might break users' existing CNPs which IMO warrants further discussion in terms of how this should be communicated to users.

@pjablonski123
Copy link
Copy Markdown
Author

pjablonski123 commented Dec 1, 2023

In my statement, I wasn't referring to any particular revision of the code. I still think this is a change that warrants wider discussion and documentation (e.g. also in terms of security implications) in a structured document.

Documents are always welcome, but this can be delivered in a form of a separate blog or a doc entry. There are quite many features which are more complex and described in a concise form like here.

Isn't this still a breaking change if two or more asterisks were allowed previously and now it is rejected? Even if it was wrong and unsupported it might break users' existing CNPs which IMO warrants further discussion in terms of how this should be communicated to users.

I did not see any single support ticket with **, which possibly reflects the current usage. Discussions are always welcome. :)

Even the simplest from with ** at the beginning of the FQDN can be argued as a breaking change. Simply it will allow more subdomains and someone can say: "Hey, my expectations so far were that **.cilium.io allows only one subdomain, no more and you introduced a new feature which breaks my security." That's a false statement as ** was not intended for a use as of now.

@tklauser
Copy link
Copy Markdown
Member

tklauser commented Dec 1, 2023

In my statement, I wasn't referring to any particular revision of the code. I still think this is a change that warrants wider discussion and documentation (e.g. also in terms of security implications) in a structured document.

Documents are always welcome, but this can be delivered in a form of a separate blog or a doc entry. There are quite many features which are more complex and described in a concise form like here.

I see you've added a section to the docs discussing some of the security implications. In addition, I think this PR also needs a dedicated release note entry other than the PR title in the PR description mentioning the user-facing implications of this change. See item 12 here in the contribution guide on how a good release note should look.

Isn't this still a breaking change if two or more asterisks were allowed previously and now it is rejected? Even if it was wrong and unsupported it might break users' existing CNPs which IMO warrants further discussion in terms of how this should be communicated to users.

I did not see any single support ticket with **, which possibly reflects the current usage. Discussions are always welcome. :)

Just because there aren't any reported issues with ** doesn't mean that any users would (mistakenly) already use them. I think we at the very least would need to clearly document in the upgrade notes that existing CNPs using ** might break. This bit of documentation should be part of this PR.

Even the simplest from with ** at the beginning of the FQDN can be argued as a breaking change. Simply it will allow more subdomains and someone can say: "Hey, my expectations so far were that **.cilium.io allows only one subdomain, no more and you introduced a new feature which breaks my security." That's a false statement as ** was not intended for a use as of now.

It's still a breaking change, no matter how "wrong" the users were to already using ** to match a single subdomain. As mentioned above, at least it needs to be documented clearly that the behavior would change for these users.

@pjablonski123
Copy link
Copy Markdown
Author

pjablonski123 commented Dec 1, 2023

Just because there aren't any reported issues with ** doesn't mean that any users would (mistakenly) already use them.

Agree, that's why I mentioned that it is not a definite proof. It is a probability.

It's still a breaking change, no matter how "wrong" the users were to already using ** to match a single subdomain.

OK, so then no matter what we do this enhancement will be a breaking change.

To summarize what is required, I will add:

  1. More information how this feature affects the policy, including possible break if there is ** inside. This will meet the following request.

I think we at the very least would need to clearly document in the upgrade notes that existing CNPs using ** might break. This bit of documentation should be part of this PR.
As mentioned above, at least it needs to be documented clearly that the behavior would change for these users.

  1. A release entry with the user-facing implications. This will meet the following request.

I see you've added a section to the docs discussing some of the security implications. In addition, I think this PR also needs a dedicated release note entry other than the PR title in the PR description mentioning the user-facing implications of this change. See item 12 here in the contribution guide on how a good release note should look.

Please let me know if anything else is required. Thank you for a review!

@tklauser
Copy link
Copy Markdown
Member

tklauser commented Dec 1, 2023

Please let me know if anything else is required. Thank you for a review!

Sounds good to me. Ideally we'd also cover this feature in a cilium-cli connectivity test as a follow-up.

@tklauser
Copy link
Copy Markdown
Member

tklauser commented Dec 8, 2023

To summarize what is required, I will add:

  1. More information how this feature affects the policy, including possible break if there
    [...]
  2. A release entry with the user-facing implications. This will meet the following request.

@pjablonski123 did you have a chance yet to make these changes?

@pjablonski123
Copy link
Copy Markdown
Author

To summarize what is required, I will add:

  1. More information how this feature affects the policy, including possible break if there
    [...]
  2. A release entry with the user-facing implications. This will meet the following request.

@pjablonski123 did you have a chance yet to make these changes?

This week was heavy with high priority tasks. I hope to finish it early next week. Unless there is a specific deadline over the weekend.

@maintainer-s-little-helper
Copy link
Copy Markdown

@maintainer-s-little-helper
Copy link
Copy Markdown

1 similar comment
@maintainer-s-little-helper
Copy link
Copy Markdown

@pjablonski123
Copy link
Copy Markdown
Author

/test

@pjablonski123
Copy link
Copy Markdown
Author

@tklauser Added a release note and a word to the wordlist, extended a doc description + all previous changes.

@pjablonski123
Copy link
Copy Markdown
Author

/test

@gandro
Copy link
Copy Markdown
Member

gandro commented Dec 19, 2023

Removing review requests from non-required codeowners (such as e.g. wireguard)

Copy link
Copy Markdown
Contributor

@gentoo-root gentoo-root left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, assuming that folks agreed on the semantics of "**xyz".

@tklauser
Copy link
Copy Markdown
Member

tklauser commented Jan 11, 2024

@tklauser Added a release note and a word to the wordlist, extended a doc description + all previous changes.

@pjablonski123 Thanks. The doc description LGTM but I don't see a release note an upgrade note being added by the PR yet. Maybe you forgot to add/push that change?

…mains.

pkg/fqdn: Adding ** matchpattern

The ** in matchpattern allows a short notation of multilevel subdomains

Instead of the rules with listed subdomains:

rules:
  dns:
    - matchPattern: '*.cluster.local'
    - matchPattern: '*.*.cluster.local'
    - matchPattern: '*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.*.*.cluster.local'
    - matchPattern: '*.*.*.*.*.*.*.*.cluster.local'

egress:
  - toFQDNs:
    - matchPattern: '*.cilium.io'
    - matchPattern: '*.*.cilium.io'
    - matchPattern: '*.*.*.cilium.io'
    - matchPattern: '*.*.*.*.cilium.io'
    - matchPattern: '*.*.*.*.*.cilium.io'
    - matchPattern: '*.*.*.*.*.*.cilium.io'
    - matchPattern: '*.*.*.*.*.*.*.cilium.io'
    - matchPattern: '*.*.*.*.*.*.*.*.cilium.io'

the following policy can be used in a simpler form:

rules:
  dns:
    - matchPattern: '**.cluster.local'

egress:
  - toFQDNs:
    - matchPattern: '**.cilium.io'

Changelog:
Modified the code and added tests.
Updated language.rst.
Updated matchpattern.go.
Added and updated doc files.
Updated language.rst with a security note.
Double asterisk is allowed at the beginning of the domain only.

Signed-off-by: Piotr Jablonski <[email protected]>

```release-note
The ** in matchpattern allows a short notation of multilevel subdomains,
eg. **.cilium.io matches a.cilium.io, a.b.cilium.io, a.b.c.cilium.io, etc.

This feature can break current configuration if a double asterisk was used inside of FQDN, eg. cil**.io
Before this enhancement two or more directly adjacent asterisks meant the same - a single asterisk, eg. cil**.io == cil*.io
The correct usage of the asterisk before this enhancement should be a single presence between other characters, eg. c*m.io, *.cil*.io
With this enhancement the correct usage of the single asterisk does not change
A double asterisk must be at the beginning of the FQDN string only, eg. **.cilium.io
The single and double asterisk can be present in the same FQDN if they are separated by a dot directly or indirectly, eg. **.*lium.io, **.cili*.io
```
@pjablonski123
Copy link
Copy Markdown
Author

@tklauser an upgrade note added. Pls let me know if anything is missing. Thx!

@tklauser
Copy link
Copy Markdown
Member

@tklauser an upgrade note added. Pls let me know if anything is missing. Thx!

Thanks @pjablonski123! The upgrade note looks good to me.

There seems to be a merge conflict. Could you please rebase the PR on latest main and resolve the conflict?

Once the remaining reviewers approved, this should be good to go.

@github-actions
Copy link
Copy Markdown

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 4, 2024

This pull request has not seen any activity since it was marked stale.
Closing.

@Jeinhaus
Copy link
Copy Markdown

Jeinhaus commented Apr 4, 2024

@tklauser is there a chance this might still land?

@tklauser
Copy link
Copy Markdown
Member

tklauser commented Apr 4, 2024

Some concerns were voiced on the CFP issue, see #22081 (comment). I think these should be discussed and addressed first.

@tamilmani1989
Copy link
Copy Markdown
Contributor

@pjablonski123 are you still working on this item? If not, we would like to pick this up. cc: @vipul-21

@pjablonski123
Copy link
Copy Markdown
Author

@pjablonski123 are you still working on this item? If not, we would like to pick this up. cc: @vipul-21

Hi @tamilmani1989, I will reopen this and address the security concerns.

@vipul-21
Copy link
Copy Markdown
Contributor

vipul-21 commented Jan 6, 2025

@pjablonski123 Any update on this ? I can help if needed.

@TheBeeZee
Copy link
Copy Markdown
Contributor

This is now needed for k8s Cluster Network Policy, who's FQDN semantics require matching on multi-level subdomains. Unfortunately it seems like this PR has been effectively abandoned.

Is there still a plan to reopen this PR, or should I clone it and start a new discussion in a new PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. kind/community-contribution This was a contribution made by a community member. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CFP: CiliumNetworkPolicy toFQDNs.matchPattern should allow wildcarding of the . character to allow multiple levels of subdomains