internal_upstream: reverse-propagate filter state across the internal listener boundary at close #45237

Open
MyUmmaGumma wants to merge 1 commit into envoyproxy:main from MyUmmaGumma:reverse-passthrough-primitive

@MyUmmaGumma

Commit Message

internal_upstream: reverse-propagate filter state across the internal listener boundary at close

Additional Description

When an internal listener is used (cluster transport_socket: envoy.transport_sockets.internal_upstream → tunneling_encap_listener → tcp_proxy), the userspace IoHandle pair currently propagates dynamic metadata and filter state objects forward (downstream → internal-listener stream) at connection creation via PassthroughState::initialize/mergeInto. There is no reverse path: filter state produced on the inner side never crosses back to the outer (downstream) side.

This is the first piece of a fix for #43977 — propagating a non-2xx CONNECT response status (e.g. a tcp_proxy upstream tunnel that returns 403) from the inner side back to the downstream HCM, so operators see the real upstream status instead of a canonical 503.

This change is the primitive only. It introduces:

  • A new StreamSharingMayImpactPooling::SharedWithDownstreamConnectionOnClose enum variant, symmetric to the existing SharedWithUpstreamConnection. Marks filter state objects that should flow upstream → downstream at upstream close.
  • FilterState::objectsSharedWithDownstreamConnectionOnClose() accessor (parent-chain walk identical to objectsSharedWithUpstreamConnection).
  • PassthroughState::captureReverse(FilterState&) and PassthroughState::mergeReverse(FilterState&). captureReverse is called on the inner side at connection close to harvest marked objects into the shared PassthroughState. mergeReverse is called on the outer side at its own close to deposit them into its connection's filter state. Already-set names on the recipient are skipped (reverse propagation must not overwrite).
  • IoHandle::addOnPreCloseCallback. IoHandleImpl::close() fires registered callbacks exactly once, before notifying the peer or tearing down the file event. This is the hook the inner side uses to call captureReverse while its connection's StreamInfo is still queryable.
  • Wiring on both sides of the boundary:
    • Inner side: ActiveInternalListener::newActiveConnection registers a pre-close callback that calls captureReverse(connection.streamInfo().filterState()), capturing the FilterStateSharedPtr by value so the callback is safe even if it fires from the IoHandle destructor (i.e. after the Connection is gone).
    • Outer side: InternalSocket::closeSocket calls mergeReverse(connection.streamInfo().filterState()) before delegating to the base PassthroughSocket::closeSocket.
  • One concrete consumer: TunnelingConfigHelperImpl::propagateResponseHeaders and propagateResponseTrailers now set their filter state objects with SharedWithDownstreamConnectionOnClose. With propagate_response_headers: true on the inner tcp_proxy, the captured response headers/trailers are now reverse-propagated to the downstream connection's filter state automatically.
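The pre-close hook and the capture-by-value choice described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `FakeIoHandle`, `FakeFilterState`, `DemoResult`, and `runPreCloseDemo` are hypothetical names invented for illustration, and only the method name `addOnPreCloseCallback` comes from the PR description. It does not reproduce Envoy's actual `IoHandleImpl`.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a connection's filter state.
struct FakeFilterState {
  std::vector<std::string> names;
};

// Simplified model of an IO handle that supports pre-close callbacks.
class FakeIoHandle {
public:
  using PreCloseCallback = std::function<void()>;

  // The destructor also closes, so callbacks may fire from here.
  ~FakeIoHandle() { close(); }

  void addOnPreCloseCallback(PreCloseCallback cb) {
    pre_close_callbacks_.push_back(std::move(cb));
  }

  void close() {
    if (closed_) {
      return; // Repeated close() calls are no-ops.
    }
    closed_ = true;
    // Move the list out first so every callback runs at most once.
    auto callbacks = std::move(pre_close_callbacks_);
    pre_close_callbacks_.clear();
    for (auto& cb : callbacks) {
      cb();
    }
    // Peer notification and file event teardown would follow here.
  }

private:
  bool closed_{false};
  std::vector<PreCloseCallback> pre_close_callbacks_;
};

struct DemoResult {
  int fired{0};
  std::vector<std::string> captured;
};

DemoResult runPreCloseDemo() {
  DemoResult result;
  {
    FakeIoHandle handle;
    {
      auto filter_state = std::make_shared<FakeFilterState>();
      filter_state->names.push_back("example.key");
      // Capturing the shared_ptr by value keeps the filter state alive
      // even after its original owner (the "connection") goes away.
      handle.addOnPreCloseCallback([filter_state, &result]() {
        ++result.fired;
        result.captured = filter_state->names;
      });
    } // The original owner of filter_state is gone here.
    handle.close();
    handle.close(); // Second call must not re-fire the callback.
  }                 // The destructor calls close() again, also a no-op.
  return result;
}
```

In this model the callback fires once, on the first `close()`, and still reads the filter state after the scope that created it has ended, which is the property the inner-side wiring relies on.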

The primitive is general: any inner-side network or HTTP filter can mark a filter state object with the new flag to opt into reverse propagation.
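As a rough illustration of the opt-in flag and the capture/merge semantics, the following standalone model may help. It is a sketch, not Envoy code: `FakeFilterState`, `FakePassthroughState`, `Sharing`, `Entry`, `valueOr`, `has`, and the `demo.*` keys are hypothetical, values are plain strings rather than typed objects, and storing the merged copy with `Sharing::None` is an assumption. Only the names `SharedWithDownstreamConnectionOnClose`, `objectsSharedWithDownstreamConnectionOnClose`, `captureReverse`, and `mergeReverse` come from the PR description.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, reduced set of sharing modes.
enum class Sharing { None, SharedWithUpstreamConnection, SharedWithDownstreamConnectionOnClose };

struct Entry {
  std::string value;
  Sharing sharing;
};

// Simplified filter state with an optional parent scope.
class FakeFilterState {
public:
  explicit FakeFilterState(std::shared_ptr<const FakeFilterState> parent = nullptr)
      : parent_(std::move(parent)) {}

  void setData(const std::string& name, std::string value, Sharing sharing) {
    entries_[name] = Entry{std::move(value), sharing};
  }

  bool has(const std::string& name) const { return entries_.count(name) > 0; }

  std::string valueOr(const std::string& name, const std::string& fallback) const {
    auto it = entries_.find(name);
    return it == entries_.end() ? fallback : it->second.value;
  }

  // Collects marked objects from this scope and every parent scope.
  std::map<std::string, Entry> objectsSharedWithDownstreamConnectionOnClose() const {
    std::map<std::string, Entry> out;
    for (const FakeFilterState* fs = this; fs != nullptr; fs = fs->parent_.get()) {
      for (const auto& [name, entry] : fs->entries_) {
        if (entry.sharing == Sharing::SharedWithDownstreamConnectionOnClose) {
          out.emplace(name, entry); // emplace keeps the nearest scope's entry.
        }
      }
    }
    return out;
  }

private:
  std::shared_ptr<const FakeFilterState> parent_;
  std::map<std::string, Entry> entries_;
};

// Simplified state shared by both sides of the internal-listener boundary.
class FakePassthroughState {
public:
  // Inner side, at close: harvest the marked objects.
  void captureReverse(const FakeFilterState& inner) {
    reverse_ = inner.objectsSharedWithDownstreamConnectionOnClose();
  }

  // Outer side, at its own close: deposit without overwriting.
  void mergeReverse(FakeFilterState& outer) const {
    for (const auto& [name, entry] : reverse_) {
      if (!outer.has(name)) {
        // Assumption: the deposited copy is not shared any further.
        outer.setData(name, entry.value, Sharing::None);
      }
    }
  }

private:
  std::map<std::string, Entry> reverse_;
};

FakeFilterState runReverseDemo() {
  auto parent = std::make_shared<FakeFilterState>();
  parent->setData("demo.from_parent", "p", Sharing::SharedWithDownstreamConnectionOnClose);

  FakeFilterState inner(parent);
  inner.setData("demo.response_headers", ":status=403",
                Sharing::SharedWithDownstreamConnectionOnClose);
  inner.setData("demo.existing", "inner-value", Sharing::SharedWithDownstreamConnectionOnClose);
  inner.setData("demo.private", "stays-inner", Sharing::None);

  FakeFilterState outer;
  outer.setData("demo.existing", "outer-value", Sharing::None);

  FakePassthroughState shared;
  shared.captureReverse(inner);
  shared.mergeReverse(outer);
  return outer;
}
```

In this model the marked objects (including one set on a parent scope) reach the outer filter state, the unmarked object stays on the inner side, and the name already present on the outer side keeps its original value.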

What's not in this PR

  • Plumbing the propagated filter state through Router::UpstreamRequest::onPoolFailure (currently it only flows through onPoolReady via setUpstreamFilterState). On the failure path, streamInfo:upstreamFilterState() and %UPSTREAM_FILTER_STATE% still see nothing. That fix is intended as a separate follow-up PR — it widens the connection pool callback API and is structurally bigger than this primitive.
  • Lua/access-log accessors beyond the existing surface. Existing %UPSTREAM_FILTER_STATE(envoy.tcp_proxy.propagate_response_headers:TYPED)% continues to work on the success path; the failure-path fix is in the follow-up PR.
  • Any change to the wire-level behavior of tcp_proxy itself (CONNECT semantics, reset behavior, retries).

Risk Level

Low.

  • The new enum variant is opt-in. No existing filter state objects set this flag, so no existing behavior changes.
  • The IoHandle::addOnPreCloseCallback interface is new and has zero existing registrations outside this PR's wiring; the close path's behavior is unchanged when the callback list is empty.
  • The change to propagateResponseHeaders/propagateResponseTrailers requires propagate_response_headers: true (already opt-in) to fire at all. With the option off, the call is a no-op as before. With the option on, the only observable difference is that the captured TunnelResponseHeaders filter state object additionally appears on the downstream connection's filter state at upstream close — operators get more information, not different information.
  • The reverse-merge skips names already set on the recipient, so a pre-existing filter state object on the downstream connection cannot be silently overwritten.

Testing

  • Existing propagate_response_headers integration tests cover the inner-side access-log path; the change to propagateResponseHeaders/propagateResponseTrailers only adds an opt-in sharing flag to setData and does not alter the data placed on the inner-side filter state.
  • New unit tests in test/extensions/io_socket/user_space/io_handle_impl_test.cc exercising captureReverse/mergeReverse with a marked filter state object.
  • New unit tests in test/extensions/transport_sockets/internal_upstream/internal_upstream_test.cc verifying InternalSocket::closeSocket invokes mergeReverse exactly once and passes the right filter state.
  • Validated end-to-end on a private envoy deployment in a real-traffic test environment: an inner tcp_proxy CONNECT receiving a 403 from its upstream now lands a TunnelResponseHeaders filter state object (with the propagated :status) on the downstream connection's filter state. Without the patch, the downstream connection's filter state is empty for this case.

Docs Changes

api/envoy/extensions/filters/network/tcp_proxy/v3/tcp_proxy.proto updated to note that when propagate_response_headers/propagate_response_trailers are true and the tcp_proxy is on the inner side of an internal-listener boundary, the resulting filter state objects also become available on the downstream (outer-side) connection's filter state at upstream close. No new config fields.

Release Notes

Added under new_features in changelogs/current.yaml:

tcp_proxy: when propagate_response_headers or propagate_response_trailers is enabled and the tcp_proxy sits on the inner side of an internal-listener boundary, the captured response headers/trailers filter state object is now additionally reverse-propagated to the outer (downstream) connection's filter state at upstream close. Operators reading filter state from the downstream side via %UPSTREAM_FILTER_STATE(envoy.tcp_proxy.propagate_response_headers:TYPED)% (or equivalent) on the success path will now also see the value when the inner upstream returns a non-2xx response on CONNECT. Pre-existing names on the recipient are not overwritten.

And an additional entry for the generic primitive:

filter state: added StreamSharingMayImpactPooling::SharedWithDownstreamConnectionOnClose for marking filter state objects that should be reverse-propagated from the upstream (inner-side) to the downstream (outer-side) connection at upstream close. Currently honored across the internal-listener boundary by PassthroughState / InternalSocket.

Platform Specific Features

N/A.

Fixes

#43977

@repokitteh-read-only

Hi @MyUmmaGumma, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.


@repokitteh-read-only

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @mattklein123
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).


… listener boundary at close

Adds StreamSharingMayImpactPooling::SharedWithDownstreamConnectionOnClose,
PassthroughState::captureReverse/mergeReverse, and IoHandle::addOnPreCloseCallback.
The internal listener now copies marked filter state objects from the inner
(upstream) connection back to the outer (downstream) connection at upstream close.

tcp_proxy.propagateResponseHeaders/propagateResponseTrailers now set their
filter state with the new flag, so TunnelResponseHeaders/Trailers captured via
propagate_response_headers: true on an inner-side tcp_proxy are reverse-
propagated to the outer connection without further config.

Reverse propagation must not overwrite an existing name on the recipient.

Refs envoyproxy#43977.

Risk Level: Low (opt-in; default behavior unchanged).
Testing: unit tests for filter_state, passthrough_state, and InternalSocket close.
Docs Changes: tcp_proxy.proto note + changelogs/current.yaml entries.
Release Notes: added under filter_state and tcp_proxy areas.
Platform Specific Features: N/A.

Signed-off-by: Keerti Narayan <keerti2882@gmail.com>
MyUmmaGumma force-pushed the reverse-passthrough-primitive branch from e35635d to 1ff2d25 on May 25, 2026 17:59
MyUmmaGumma requested a deployment to external-contributors with GitHub Actions on May 25, 2026 17:59 (waiting)
@KBaichoo

Contributor

/assign @yanjunxiang-google

@yanjunxiang-google

Contributor

/assign @kyessenov

@yanjunxiang-google

yanjunxiang-google commented Jun 8, 2026

Contributor

@MyUmmaGumma before I go into the details of the PR, I saw that #44157 also fixes #43977 and has been merged. Could you please share some background on this PR and #44157?

@MyUmmaGumma

Author

@MyUmmaGumma before I go into the details of the PR, I saw that #44157 also fixes #43977 and has been merged. Could you please share some background on this PR and #44157?

Before #44157: When an upstream returned a non-2xx CONNECT response, that status was discarded on the inner-side (router) unless propagate_response_headers: true was set on tcp_proxy.
#44157 now unconditionally captures it as tunnel_response_status_ and surfaces it via UPSTREAM_TRANSPORT_FAILURE_REASON. Purely an inner-side (referring to the diagram below) capture fix.

This PR: In the internal-listener tunneling topology (outer L7 → internal-listener pair → inner tcp_proxy → upstream CONNECT), the captured response info lives in the inner upstream connection's filter state. The outer L7, where the original request was received and where a follow-up consumer would act, does not see it. Filter state propagates forward from the outer side to the inner side (SharedWithUpstreamConnection), but not back across the internal-listener boundary. #45237 introduces a generic reverse-propagation primitive (SharedWithDownstreamConnectionOnClose) and uses it from tcp_proxy on the existing propagate_response_headers filter state.

The PRs are related to a similar issue but independent. #45237 does NOT consume #44157's new tunnel_response_status_ field as its data source. It propagates the pre-existing propagate_response_headers filter state (predates #44157, already carries the full :status of non-2xx CONNECT responses). Different data sources — they don't depend on each other.

In our internal deployment, auth denials on CONNECT currently surface to the caller as 503s, even though the access logs on the internal listener can record the 403. With this PR and a follow-up, the caller can actually see the 403.

[Image: Screenshot 2026-06-10 at 9 18 33 AM]

@MyUmmaGumma

Author

@yanjunxiang-google @kyessenov - please let me know if anything needs work/explanation. This will be a very useful fix on our internal deployments.
