Member

@snorwin snorwin commented Dec 18, 2025

What type of PR is this?

/area conformance-test

What this PR does / why we need it:
In #3630, the Gateway API’s interaction with connection coalescing was improved as part of GEP-3567. However, these changes are not yet reflected in the conformance tests.

Which issue(s) this PR fixes:

N/A

Does this PR introduce a user-facing change?:

Added conformance tests validating Gateway behavior for connection coalescing when SNI and Host headers do not match, including correct use of HTTP 421 for potentially misdirected requests.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. area/conformance-test Issues or PRs related to Conformance tests. labels Dec 18, 2025
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Dec 18, 2025
Member Author

snorwin commented Dec 18, 2025

/cc @robscott

Member Author

snorwin commented Dec 18, 2025

/retest

Member

@robscott robscott left a comment

Thanks @snorwin!

Comment on lines 66 to 67
{host: "example.org", serverName: "second-example.org", statusCode: 421},
{host: "second-example.org", serverName: "example.org", statusCode: 421},
Member

I think the relevant part of the spec is here:

// the Gateway SHOULD return a 421.

We're actually using RFC 2119 interpretation of keywords here, so SHOULD is a recommendation, not a requirement. With that said, I think we could justify an extended test (separate feature name) to cover this.

cc @youngnick @rikatz

Contributor

Shouldn't we enforce this behaviour only for reused connections? If Cx is establishing a new connection per request, then a misdirected response does not make sense.

@snorwin can you verify that the connection is being reused in this test? I was looking into RoundTripper and I can see that DisableKeepAlives is set.
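
One way to check connection reuse locally is sketched below. It is a minimal, self-contained example and not the conformance suite's RoundTripper: `reusedFlags` and `demoReuse` are hypothetical helper names, and the server is a throwaway `httptest` TLS server. It relies on `httptrace.GotConnInfo.Reused` from the standard library to report whether each request was sent over a reused connection.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// reusedFlags sends n GET requests with the given client and records, for
// each request, whether net/http handed it a reused connection.
func reusedFlags(client *http.Client, url string, n int) ([]bool, error) {
	flags := make([]bool, 0, n)
	for i := 0; i < n; i++ {
		var reused bool
		trace := &httptrace.ClientTrace{
			GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
		}
		req, err := http.NewRequest(http.MethodGet, url, nil)
		if err != nil {
			return nil, err
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		// Drain and close the body so the connection can go back to the idle pool.
		_, _ = io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		flags = append(flags, reused)
	}
	return flags, nil
}

// demoReuse (hypothetical helper) runs two requests against a local TLS test
// server. InsecureSkipVerify is only acceptable here because the server uses
// a throwaway test certificate.
func demoReuse(disableKeepAlives bool) ([]bool, error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig:   &tls.Config{InsecureSkipVerify: true},
		DisableKeepAlives: disableKeepAlives,
	}}
	defer client.CloseIdleConnections()
	return reusedFlags(client, srv.URL, 2)
}

func main() {
	for _, disable := range []bool{false, true} {
		flags, err := demoReuse(disable)
		fmt.Println("DisableKeepAlives:", disable, "reused:", flags, "err:", err)
	}
}
```

With keep-alives enabled, the second request should report a reused connection. With DisableKeepAlives set, every request opens a fresh connection, so a test built on such a transport never exercises the reuse path.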

Member Author

For example, a request might have been misdirected, deliberately or accidentally, such that the information within a received Host header field differs from the connection's host or port.
https://www.rfc-editor.org/rfc/rfc9110.html#name-rejecting-misdirected-reque

In the case of TLS, the connection’s host is determined by the SNI extension sent during the initial TLS handshake. In my opinion, it should not matter whether the connection was established specifically for this request or reused from a previous one.
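
To make the SNI-versus-Host comparison concrete, here is a minimal sketch under stated assumptions. `misdirectedHandler` is a toy stand-in for a Gateway, not any real implementation: it compares the two values directly, with no Listener or wildcard logic. `statusFor` and `demoStatus` are hypothetical helper names. Every request uses a freshly established connection, which is the situation described above.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// misdirectedHandler answers 421 whenever the SNI recorded during the TLS
// handshake differs from the Host header of the request.
func misdirectedHandler(w http.ResponseWriter, r *http.Request) {
	if r.TLS != nil && r.TLS.ServerName != r.Host {
		w.WriteHeader(http.StatusMisdirectedRequest)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusFor opens a new TLS connection to baseURL using serverName as SNI and
// sends a single request whose Host header is host.
func statusFor(baseURL, serverName, host string) (int, error) {
	client := &http.Client{Transport: &http.Transport{
		// ServerName overrides the SNI that would otherwise come from the URL.
		// InsecureSkipVerify is only acceptable against a local test server.
		TLSClientConfig:   &tls.Config{ServerName: serverName, InsecureSkipVerify: true},
		DisableKeepAlives: true, // force a new connection per request
	}}
	req, err := http.NewRequest(http.MethodGet, baseURL, nil)
	if err != nil {
		return 0, err
	}
	req.Host = host // sets the Host header independently of the dialed address
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// demoStatus (hypothetical helper) runs one request against a throwaway server.
func demoStatus(serverName, host string) (int, error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(misdirectedHandler))
	defer srv.Close()
	return statusFor(srv.URL, serverName, host)
}

func main() {
	for _, c := range [][2]string{
		{"example.org", "example.org"},
		{"second-example.org", "example.org"},
	} {
		code, err := demoStatus(c[0], c[1])
		fmt.Printf("sni=%s host=%s status=%d err=%v\n", c[0], c[1], code, err)
	}
}
```

The handler only sees the SNI and the Host header, so it has no way to tell a new connection from a reused one.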

@kl52752 which HTTP status code would you expect when a request sent over a newly established connection has a mismatch between the SNI value and the Host header?

Related references:

Contributor

I'm referring to:

// * If another Listener has an exact match or more specific wildcard entry,
// the Gateway SHOULD return a 421.

Imagine we have two listeners on the same port, one with *.example.com and a second one with foo.example.com.

As you mentioned, SNI is validated in the initial TLS handshake, so if the client connects to *.example.com and this is the only "destination" for the request, I think 200 should be returned, because the TLS handshake was performed for *.example.com.

What we should test is that, after the client connects to *.example.com and reuses this connection for foo.example.com on a different listener, 421 is returned.

Does that make sense?

Member Author

I understand your example. Let’s assume we have 2 listeners on the same port one with *.example.com and second one with foo.example.com.

What HTTP error code would you expect for a request on a new connection with SNI bar.example.com and a Host header foo.example.com?

Contributor

IIUC:

  • Anything around 421 should be a separate test, since it's a SHOULD
  • 421 should only be returned when the request would otherwise match; otherwise it's a 404. We should have cases to handle this, and ensure we do not return 421 when there is not otherwise a match

Contributor

Hmm, one thing i think is vague in the spec

	//
	// * If another Listener has an exact match or more specific wildcard entry,
	//   the Gateway SHOULD return a 421.

Can an empty listener.hostname be "more specific wildcard entry"? On one hand, empty != wildcard, but on the other hand, empty is a complete wildcard.

We treat case 4 {host: "unknown-example.org", serverName: "second-example.org", statusCode: 404}, as a 404. But, IMO, this seems like it should be misdirected: the fact that the 'other listener' has no hostname is irrelevant to users, and it should behave the same as a wildcard.

I am leaning towards changing this test and making the spec slightly tweaked...

Contributor

Actually, a closer look suggests this must be a 421 even with the current wording of the spec:

if (another listener has an exact match or more specific wildcard) {
	// per discussion, this is currently NOT hit, but IMO should be
	421
} else if (current listener does not match host) {
	// This is hit
	if (another listener does match the host) {
		// This IS hit, because it doesn't disqualify unspecified listener hostnames
		421
	} else {
		404 // The test currently incorrectly asserts 404 here
	}
}
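
The branching above can be sketched as runnable Go. This is an illustration under assumptions, not the spec text or any implementation's code: `matches`, `specificity`, `mostSpecific` and `expectedStatus` are hypothetical names, an empty Listener hostname is treated as match-all (the reading proposed in this thread), and details such as case-insensitivity, ports and HTTPRoute hostnames are ignored.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// matches reports whether a Listener hostname covers name.
// Assumption: an empty Listener hostname matches everything.
func matches(listenerHostname, name string) bool {
	switch {
	case listenerHostname == "":
		return true
	case strings.HasPrefix(listenerHostname, "*."):
		suffix := listenerHostname[1:] // ".example.com"
		return strings.HasSuffix(name, suffix) && len(name) > len(suffix)
	default:
		return listenerHostname == name
	}
}

// specificity orders hostnames: exact > longer wildcard > shorter wildcard > empty.
func specificity(listenerHostname string) int {
	const exact = 1 << 30
	switch {
	case listenerHostname == "":
		return 0
	case strings.HasPrefix(listenerHostname, "*."):
		return len(listenerHostname)
	default:
		return exact
	}
}

// mostSpecific returns the index of the most specific Listener hostname that
// matches name, or -1 if none does.
func mostSpecific(listeners []string, name string) int {
	best := -1
	for i, l := range listeners {
		if matches(l, name) && (best < 0 || specificity(l) > specificity(listeners[best])) {
			best = i
		}
	}
	return best
}

// expectedStatus returns the status suggested by the Listener-level rules for
// a request with the given SNI and Host. It returns 0 when no Listener matches
// the SNI, which this sketch leaves out of scope.
func expectedStatus(listeners []string, serverName, host string) int {
	cur := mostSpecific(listeners, serverName) // Listener selected during ClientHello
	best := mostSpecific(listeners, host)      // best Listener for the Host header
	switch {
	case cur < 0:
		return 0
	case best < 0:
		return http.StatusNotFound // no Listener matches the Host at all
	case best != cur && specificity(listeners[best]) > specificity(listeners[cur]):
		return http.StatusMisdirectedRequest // another Listener is a better match
	case !matches(listeners[cur], host):
		return http.StatusMisdirectedRequest // current misses, another one matches
	default:
		return http.StatusOK
	}
}

func main() {
	listeners := []string{"*.example.com", "foo.example.com"}
	fmt.Println(expectedStatus(listeners, "bar.example.com", "foo.example.com"))
	fmt.Println(expectedStatus(listeners, "foo.example.com", "other.org"))
	fmt.Println(expectedStatus(append(listeners, ""), "foo.example.com", "other.org"))
}
```

The last call adds a Listener without a hostname. Under the match-all assumption, the unmatched Host is then claimed by that Listener, so the result changes from 404 to 421.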

Member Author

@snorwin snorwin Dec 29, 2025

@howardjohn I’ve moved the test cases into a dedicated test. If you have a chance to review it and run them against your implementation, I’d really appreciate it.

And I agree, no (listener) hostname should be treated as "*".

Contributor

I think we should update the spec to clarify this as well but LGTM. Thanks!

Contributor

@kl52752 kl52752 left a comment

LGTM

Contributor

howardjohn commented Dec 22, 2025

I have a passing implementation of this, pending discussion in #4364 (comment): agentgateway/agentgateway#763

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 29, 2025
@snorwin snorwin force-pushed the confomance-connection-coalescing branch from 8673dd3 to 0ecf912 Compare December 29, 2025 13:49
@snorwin snorwin force-pushed the confomance-connection-coalescing branch from 0ecf912 to 00a3c72 Compare December 29, 2025 14:08
…d introdcue new gateway feature

Signed-off-by: Norwin Schnyder <[email protected]>
@snorwin snorwin force-pushed the confomance-connection-coalescing branch from 00a3c72 to 2c27ece Compare December 29, 2025 15:01
Contributor

@howardjohn howardjohn left a comment

LGTM, tested the new feature with agentgateway and it's passing

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: howardjohn, kl52752, snorwin
Once this PR has been reviewed and has the lgtm label, please assign shaneutt for approval. For more information see the Code Review Process.

howardjohn added a commit to agentgateway/agentgateway that referenced this pull request Jan 2, 2026
Member

@rostislavbobo rostislavbobo left a comment

Thanks @snorwin , a very important conformance! I left a couple of comments to simplify it


var HTTPRouteHTTPSListenerDetectMisdirectedRequests = suite.ConformanceTest{
ShortName: "HTTPRouteHTTPSListenerDetectMisdirectedRequests",
Description: "HTTPS listeners on the same port detect misdirected requests and returning HTTP 421 when appropriate",
Member

Suggested change
Description: "HTTPS listeners on the same port detect misdirected requests and returning HTTP 421 when appropriate",
Description: "HTTPS listeners on the same port detect misdirected requests and return HTTP 421 when appropriate",

statusCode int
backend string
serverName string
}{
Member

Could we group test cases by the selected Listener (based on SNI) rather than host? For each test case I had to find the Listener first, and only then think of host matching. Grouping by selected Listener would ease the review and help ensure we cover all the cases.

Additionally, can we place serverName before host to match the natural processing order?
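
As a rough illustration of that layout, the sketch below reorders the cases quoted in this review so that serverName comes first and cases sharing an SNI sit together. The `testCase` type, `cases` and `groupedByServerName` are hypothetical names for this example. The Gateway manifest is not shown here, so the group comments only name the SNI value and expected values are copied from the quoted cases unchanged.

```go
package main

import "fmt"

// testCase lists fields in processing order: SNI first, then Host.
type testCase struct {
	serverName string // SNI sent in the ClientHello; selects the Listener
	host       string // Host header, evaluated after a Listener is selected
	statusCode int
	backend    string // only set when a 200 is expected
}

// cases returns the quoted test cases grouped by SNI.
func cases() []testCase {
	return []testCase{
		// SNI "example.org"
		{serverName: "example.org", host: "second-example.org", statusCode: 421},
		{serverName: "example.org", host: "unknown-example.org", statusCode: 404},

		// SNI "second-example.org"
		{serverName: "second-example.org", host: "second-example.org", statusCode: 200, backend: "infra-backend-v2"},
		{serverName: "second-example.org", host: "example.org", statusCode: 421},

		// SNI "third-example.wildcard.org"
		{serverName: "third-example.wildcard.org", host: "third-example.wildcard.org", statusCode: 200, backend: "infra-backend-v1"},

		// SNI "fourth-example.wildcard.org"
		{serverName: "fourth-example.wildcard.org", host: "third-example.wildcard.org", statusCode: 421},

		// SNI "unknown-example.org"
		{serverName: "unknown-example.org", host: "unknown-example.org", statusCode: 404},
	}
}

// groupedByServerName reports whether all cases with the same SNI are adjacent.
func groupedByServerName(cs []testCase) bool {
	seen := map[string]bool{}
	prev := ""
	for i, c := range cs {
		if i > 0 && c.serverName == prev {
			continue
		}
		if seen[c.serverName] {
			return false // this SNI already appeared in an earlier group
		}
		seen[c.serverName] = true
		prev = c.serverName
	}
	return true
}

func main() {
	for _, c := range cases() {
		fmt.Printf("sni=%-28s host=%-28s want=%d %s\n", c.serverName, c.host, c.statusCode, c.backend)
	}
	fmt.Println("grouped:", groupedByServerName(cases()))
}
```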

namespace: gateway-conformance-infra
- name: https-with-hostname-matching-wildcard
port: 443
hostname: "third-example.wildcard.org"
Member

Let's use "fourth-example.wildcard.org" for the 4th listener, similar to how the 2nd listener has "second-example.org", or simply move it to the third position.

Asking, because it was difficult to understand {host: "third-example.wildcard.org", serverName: "fourth-example.wildcard.org", statusCode: 421}, where the 3rd listener with * hostname gets selected for "4th-example.wildcard.org" SNI, but actually 4th listener with 3rd hostname was a better match for "3rd-example.wildcard.org" Host 🙂

{host: "second-example.org", serverName: "second-example.org", statusCode: 200, backend: "infra-backend-v2"},
{host: "second-example.org", serverName: "example.org", statusCode: 421},

{host: "third-example.wildcard.org", serverName: "third-example.wildcard.org", statusCode: 200, backend: "infra-backend-v1"},
Member

Was confused to see infra-backend-v1. Let's leave a comment that HTTPRoute 4 was selected or use infra-backend-v4

Comment on lines +85 to +86
{host: "unknown-example.org", serverName: "example.org", statusCode: 404},
{host: "unknown-example.org", serverName: "unknown-example.org", statusCode: 404},
Member

I don't think these two 404 cases test what we want. These two 404 tests verify that the proxy returns a 404 when an L7 Host doesn't match the HTTPRoute(s) hostnames. But instead, we need to verify that the proxy returns a 404 when the L7 Host is not matched by any Listener.

	// * If the current Listener (selected by SNI matching during ClientHello)
	//   does not match the Host:
	//     * If no other Listener matches the Host, the Gateway MUST return a
	//       404.

To test 404 properly, we need to have

  1. Gateway without a catch-all Listener
  2. HTTPRoutes without hostnames

Member

I recommend removing all hostnames from the HTTPRoutes. They are irrelevant for HTTPS connection coalescing testing – only the Listener hostnames matter.

namespace: gateway-conformance-infra
spec:
parentRefs:
- name: same-namespace-with-https-listener
Member

Let's add sectionName and explicitly reference the 1st Listener. HTTPRoute attachment and selection don't impact HTTPS connection coalescing, so the smaller and simpler the HTTPRoute config, the better.
