test: new fake OpenAI server for shared, deterministic tests #914


Merged
16 commits merged into envoyproxy:main on Jul 22, 2025

Conversation

codefromthecrypt
Contributor

@codefromthecrypt codefromthecrypt commented Jul 21, 2025

Description

This PR introduces a fake OpenAI API server using go-vcr to enable fast, credential-free, and reproducible integration tests for the AI Gateway external processor (extproc). The changes enhance testing reliability, eliminate dependency on live API calls, and reduce CI flakiness and costs. Key changes include:

  • Fake Shared OpenAI Server: Added internal/testing/fakeopenai package with pre-recorded YAML cassettes for common OpenAI endpoints (e.g., chat completions, streaming, tools, multimodal). Supports automatic recording of new interactions when OPENAI_API_KEY is set.
  • ExtProc Testing Enhancements:
    • On-demand extproc binary build in tests (via EXTPROC_BIN env or auto-compile).
    • New tests/extproc suite using the fake server, covering various request scenarios with assertions on responses and status codes.
    • Emits startup message ("AI Gateway External Processor is ready") to stderr for reliable readiness detection.
  • Config & Tooling Updates: Added .env.ollama and a working docker-compose configuration for ad-hoc testing of extproc.
  • Dependencies: Added gopkg.in/dnaeon/go-vcr.v4 for cassette recording/replay.

Related Issues/PRs (if applicable)

Special notes for reviewers (if applicable)

Cassettes were recorded based on analysis of OpenInference instrumentation and edge cases that often break traces.

Future work

OpenAI is not the only inference API supported, but it is special as it is
the most common frontend and backend for AI Gateway. This is why we expose the
requests, as we will often proxy these even if the backend is not OpenAI
compatible.

The recording process would remain consistent for other cloud services, such as
Anthropic or Bedrock, though there could be variations in how requests are
scrubbed for secrets or handled for request signing. In a future refactoring,
we could extract the core recording infrastructure into a separate package,
reducing this one to just cassette constants and OpenAI-specific request
recording and handling details. Most of the code could be reused for other
backends.

For additional insights, refer to OpenTelemetry instrumentation, which often
employs VCR for LLM frameworks as well.

Here are key parts of the OpenTelemetry Botocore Bedrock instrumentation that
deal with request signing and recording:

Here are key parts of OpenInference Anthropic instrumentation, which handles
their endpoint.

@codefromthecrypt codefromthecrypt requested a review from a team as a code owner July 21, 2025 13:27
@codefromthecrypt codefromthecrypt changed the title Add Fake OpenAI Server for Offline, Deterministic ExtProc Testing test: new fake OpenAI server for offline, deterministic ExtProc testing Jul 21, 2025
@codefromthecrypt codefromthecrypt changed the title test: new fake OpenAI server for offline, deterministic ExtProc testing test: new fake OpenAI server for shared, deterministic tests Jul 21, 2025
@@ -0,0 +1,55 @@
# AI Gateway ExtProc Tests
Contributor Author


FYI, once this is merged, I will do something similar for aigw (standalone), so that we can easily test ad hoc with the same config used in unit tests, ideally port-collision free.*

* (there is a little bit of dancing in cmd/aigw we'll need to do because of duplicate prom registrations, but we'll get through)

Member


could you also mention the other existing tests using testupstream and how they differ from this: https://github.com/envoyproxy/ai-gateway/blob/main/tests/extproc/testupstream_test.go ? Or maybe cutting a subdirectory for all the new files in this PR is better for separation, since the existing tests don't use the ones added here?

Contributor Author


actually, I wanted to remove the openai -> openai tests from testupstream_test.go once the approach was OK, as we don't want redundancy, I think.

}

// ReadChatCompletionStream reads and parses OpenAI chat completion chunks from an SSE stream.
func ReadChatCompletionStream(r io.Reader) ([]ChatCompletionChunk, string, error) {
Contributor Author


this is a utility to verify OpenInference formatted otel span data, which isn't in this change. I can remove it or leave it.

Member

@mathetake mathetake left a comment


left some comments, but before going forward I would love to understand whether it's possible to deprecate the testupstream_test.go tests, where we already use a "fake" homemade server without real providers; in other words, these are not the first extproc tests that do not rely on external credentials. I think those tests and these fakeopenai tests overlap in the sense of using a fake server (not the replay portion, which I like about this one), so I would like to migrate the existing ones to this recording style. Or, would it be possible to refactor the existing testupstream to use the yaml recorded by go-vcr, which wouldn't require us to write fake servers for all the translation targets (AWS, Anthropic, and Google Vertex)?

Meanwhile, to move things forward without too much migration, I would slightly prefer keeping one envoy config in tests/extproc rather than two different envoy configs in the same directory, where each is used by different tests (testupstream_test.go, realproviders_test.go, custom_extproc_test.go vs chat_completions_test.go). Maybe we can have another test class, tests/extproc/go-vcr or tests/extproc-vcr or whatever new dir, until we migrate the existing ones (if possible?); then we won't need to modify the extproc/README.md you added to mention both classes of tests (existing ones vs vcr). No strong preference, I just would like to make it clear which file does what in the tests/extproc dir. Maybe prefixing the new files here with "vcr_" would be another way... what do you think...

@@ -160,6 +160,10 @@ func Main(ctx context.Context, args []string, stderr io.Writer) (err error) {
l.Error("Failed to shutdown health check server gracefully", "error", err)
}
}()

// Emit startup message to stderr when all listeners are ready.
fmt.Fprintln(stderr, "AI Gateway External Processor is ready")
Member

@mathetake mathetake Jul 21, 2025


would it be possible to use the logger l for consistency, or is there any reason not to? Otherwise, only this line would be non-structured text while all the other logs are structured.

Contributor Author


yeah, this was intentionally using unbuffered stderr (like envoy does), which allows processes to reliably block on it. Should I put comments to that effect, or should we just hope the buffering doesn't impact barriers? You can push your preference if I'm not around.

Member

@mathetake mathetake Jul 22, 2025


the logger is using the same stderr io.Writer variable (l := slog.New(slog.NewTextHandler(stderr, &slog.HandlerOptions{Level: flags.logLevel}))), and I think it works exactly the same in terms of blocking. I will push the change.

@@ -205,3 +208,54 @@ func TestStartHealthCheckServer(t *testing.T) {
})
}
}

// TestExtProcStartupMessage ensures other programs can rely on the startup message to STDERR.
func TestExtProcStartupMessage(t *testing.T) {
Member


+1

Comment on lines 212 to 213
request_attributes:
- xds.upstream_host_metadata
Member


i don't think we need this at the router-level extproc, so could you try removing it? otherwise there seems to be a bug

Contributor Author


I can try removing it. In my Otel tests there were errors in the log about missing attributes, which I finally hunted down to this. It might not be the case anymore, or it might only trigger with these files.

"@type": type.googleapis.com/envoy.config.route.v3.FilterConfig
disabled: true
http_filters:
# Simulate real config injected via EnvoyExtensionPolicy
Member

@mathetake mathetake Jul 21, 2025


so unfortunately this is totally different from the actual envoy config generated by both aigw and the k8s controller mode. I think this is accidentally working, and I have a slight idea of why it works given it's already passing tests, but it differs from how things actually work. In that sense aigw translate is useless, since the translation result does not reflect reality. That is because (unfortunately, in my opinion, due to the lack of many APIs in EG) the final envoy resources are fine-tuned via the extension server mechanism in the internal/extensionserver package, which makes the translation result different from this yaml.

If you take a look at the original envoy.yaml in this directory, you will see two extproc filters configured (not two processes, just filters); that is intentional, to enable fallback across multiple providers, such as falling back from a local model to AWS/OpenAI etc. https://aigateway.envoyproxy.io/docs/latest/concepts/architecture/data-plane. In other words, the existing envoy.yaml is aligned with the actual config generated by aigw run and the control plane.

I am ok to go with this as-is, since I see the main purpose of this config is to have minimal working stuff, not alignment with exactly how it works in production. On the other hand, this might cause trouble: things could work at the extproc layer with these tests but fail in k8s or with "aigw run", since the configuration setups differ. wdyt?

Contributor Author


where I started was Otel, and I couldn't find any yaml that worked without errors. So I pared this down to the minimum that would work with extproc (ack, I still have only a partial understanding). I do prefer parity with intent over what I was able to get working, so I'm happy to try to match, especially as I have tests now!

Contributor Author


The existing envoy.yaml is tightly coupled with many clouds and the slow testupstream, which makes it difficult to progress before a "stop the world" refactor of everything. I will make the openai one as similar as possible, yet still runnable with docker against ollama, which is a very important goal of this PR. We need to be able to test ad hoc quickly, especially when adding tracing to this, and there will be overlap between the "testupstream" thing and VCR until other things migrate. That can happen in later PRs and isn't urgent.

Contributor Author


OK, I think I get what you are saying now. Our destination is parity with the real envoy config aigw would make, and a simplified envoy.yaml might pass for the wrong reasons. Yet challenges remain.

It's a chicken-and-egg issue at this low layer of extproc to even know the config it might need to run, because the owner of that generation is aigw -> eg, etc. We have envoy.yaml which, while hand edited, is likely very close, and we have good reason to be cautious about deviations.

Future idea: run aigw as a library with a startup hook to func-e, capture the real envoy config, and have Go tests compare that with the test config in this directory. For example, it could take a working aigw yaml for ollama and simply splat out the envoy config. There is a chicken-and-egg part to this too, but something like it would reduce drift and misconfiguration anxiety. Also, for newcomers like me, it makes it a lot more transparent what we actually must have for testing vs all the stuff that happens to be in one big file ;)

Tactically in this PR: align the new config closely to the existing one (e.g., dual filters), even if the new tests don't use those features, because it is closer to prod.

Hope this works, but you can feel free to revise if not!
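The dual-filter shape discussed above can be sketched as an Envoy listener fragment. This is an illustrative paraphrase based on Envoy's ext_proc filter documentation, not the project's generated config: the filter names and the cluster name are assumptions, and the real config carries many more fields.

```yaml
http_filters:
  # First ext_proc filter: router-level request handling (illustrative name).
  - name: envoy.filters.http.ext_proc/router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ext_proc.v3.ExternalProcessor
      grpc_service:
        envoy_grpc:
          cluster_name: extproc_cluster
  # Second ext_proc filter: upstream-level handling, which is what enables
  # fallback across providers per the data-plane architecture docs.
  - name: envoy.filters.http.ext_proc/upstream
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ext_proc.v3.ExternalProcessor
      grpc_service:
        envoy_grpc:
          cluster_name: extproc_cluster
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```

Two filters, one process: both point at the same extproc gRPC cluster, which is the distinction the reviewer draws above.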

@mathetake mathetake self-assigned this Jul 21, 2025
@codefromthecrypt
Contributor Author

I cleaned up the fakeopenai server, but I still have some work to do: try to use the old envoy yaml, remove redundant tests, and try to use internal/apischema/openai to describe the recordings. Should be done in an hour or two.

@codefromthecrypt
Contributor Author

ok, I was unable to get tests/extproc/envoy.yaml working with my test fixtures and also have it pass with ollama locally, etc. It is a "kitchen sink" config containing many different things used by all tests, and I don't think that is sustainable, for reasons including the several hours I lost trying to re-use it despite it being mostly unrelated to local openai connections.

Solving the entire surface of all tests is too much to do in this PR, and I'm not the right one to do that (yet, at least). However, with this in, I can progress tracing, which is something I'm a specialist at. Then, as I gain experience, maybe I can fail less at this configuration.

TL;DR: I put the new tests into a "vcr" directory so that we can at least begin the process of less monolithic test execution, free of port conflicts and without risk of accidentally copy/pasting wrong responses, etc. Feel free to revise this branch without my permission. Thanks for your attention to this matter ;)

Adrian Cole and others added 11 commits July 22, 2025 19:59
This PR introduces a fake OpenAI API server using `go-vcr` to enable fast, credential-free, and reproducible integration tests for the AI Gateway external processor (`extproc`). Key changes:

- **Fake Server Implementation**: New `internal/testing/fakeopenai` package with pre-recorded "cassettes" (YAML) for common OpenAI endpoints like chat completions (basic, streaming, tools, multimodal, etc.). Supports automatic recording of new interactions when `OPENAI_API_KEY` is set.
- **ExtProc Testing Enhancements**:
  - On-demand `extproc` binary build in tests (via `EXTPROC_BIN` env or auto-compile).
  - New `tests/extproc` suite using the fake server, covering various request scenarios with assertions on responses and status codes.
  - Startup message ("AI Gateway External Processor is ready") emitted to stderr for reliable readiness detection.
- **Config & Tooling Updates**: Added `.env.ollama` for model defaults; updated Makefile, licenses, and lint ignores; new Docker Compose setup for manual testing with Ollama and Envoy.
- **Dependencies**: Added `gopkg.in/dnaeon/go-vcr.v4` for cassette recording/replay.

**Benefits**: Tests now run offline, deterministically, and without API costs/keys. Reduces flakiness and speeds up CI. No breaking changes—existing tests remain compatible.

To record new cassettes: Set `OPENAI_API_KEY` and run `go test -run TestNewRequest` in `fakeopenai`.

**Notes**: Cassettes recorded were based on analysis of OpenInference instrumentation
and the edge cases that often break traces. The design is inspired by work done by
@anuraaga in different language families in OpenTelemetry. The motivation is preparation
for OpenTelemetry work, as well as later sharing the same cassettes to aigw and e2e tests.

Signed-off-by: Adrian Cole <[email protected]>
Co-authored-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Adrian Cole <[email protected]>
Co-authored-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Adrian Cole <[email protected]>
Signed-off-by: Adrian Cole <[email protected]>
Signed-off-by: Adrian Cole <[email protected]>
Signed-off-by: Adrian Cole <[email protected]>
Signed-off-by: Adrian Cole <[email protected]>
Signed-off-by: Adrian Cole <[email protected]>
@codefromthecrypt
Contributor Author

added a footnote to the description about future work, taken from the fakeopenai README. Signing off, and enjoy!

@mathetake
Member

TestChatCompletions races during the test... let me take a look...

Read at 0x00c00019b4e8 by goroutine 13:
  strings.(*Builder).Write()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/strings/builder.go:83 +0xc0
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:431 +0x258
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.Copy()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:388 +0x64
  os.genericWriteTo()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/file.go:275 +0x18
  os.(*File).WriteTo()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/file.go:253 +0x90
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:411 +0xa8
  io.Copy()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:388 +0x5c
  os/exec.(*Cmd).writerDescriptor.func1()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/exec/exec.go:596 +0x38
  os/exec.(*Cmd).Start.func2()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/exec/exec.go:749 +0x3c
  os/exec.(*Cmd).Start.gowrap1()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/exec/exec.go:761 +0x44

Previous write at 0x00c00019b4e8 by goroutine 14:
  strings.(*Builder).Write()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/strings/builder.go:83 +0x13c
  io.(*multiWriter).Write()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/multi.go:85 +0xa8
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:431 +0x258
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:429 +0x218
  io.Copy()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:388 +0x64
  os.genericWriteTo()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/file.go:275 +0x18
  os.(*File).WriteTo()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/file.go:253 +0x90
  io.copyBuffer()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:411 +0xa8
  io.Copy()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/io/io.go:388 +0x5c
  os/exec.(*Cmd).writerDescriptor.func1()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/exec/exec.go:596 +0x38
  os/exec.(*Cmd).Start.func2()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/exec/exec.go:749 +0x3c
  os/exec.(*Cmd).Start.gowrap1()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/exec/exec.go:761 +0x44

Goroutine 13 (running) created at:
  os/exec.(*Cmd).Start()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/exec/exec.go:748 +0xbe4
  github.com/envoyproxy/ai-gateway/tests/extproc/vcr.requireEnvoyWithOutput()
      /Users/runner/work/ai-gateway/ai-gateway/tests/extproc/vcr/main_test.go:155 +0x814
  github.com/envoyproxy/ai-gateway/tests/extproc/vcr.setupTestEnvironment()
      /Users/runner/work/ai-gateway/ai-gateway/tests/extproc/vcr/main_test.go:201 +0x260
  github.com/envoyproxy/ai-gateway/tests/extproc/vcr.TestChatCompletions()
      /Users/runner/work/ai-gateway/ai-gateway/tests/extproc/vcr/chat_completions_test.go:21 +0x30
  testing.tRunner()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/testing/testing.go:1792 +0x180
  testing.(*T).Run.gowrap1()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/testing/testing.go:1851 +0x40

Goroutine 14 (running) created at:
  os/exec.(*Cmd).Start()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/os/exec/exec.go:748 +0xbe4
  github.com/envoyproxy/ai-gateway/tests/extproc/vcr.requireEnvoyWithOutput()
      /Users/runner/work/ai-gateway/ai-gateway/tests/extproc/vcr/main_test.go:155 +0x814
  github.com/envoyproxy/ai-gateway/tests/extproc/vcr.setupTestEnvironment()
      /Users/runner/work/ai-gateway/ai-gateway/tests/extproc/vcr/main_test.go:201 +0x260
  github.com/envoyproxy/ai-gateway/tests/extproc/vcr.TestChatCompletions()
      /Users/runner/work/ai-gateway/ai-gateway/tests/extproc/vcr/chat_completions_test.go:21 +0x30
  testing.tRunner()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/testing/testing.go:1792 +0x180
  testing.(*T).Run.gowrap1()
      /Users/runner/hostedtoolcache/go/1.24.4/arm64/src/testing/testing.go:1851 +0x40
==================
--- FAIL: TestChatCompletions (0.44s)
    --- FAIL: TestChatCompletions/chat-basic (0.02s)
        testing.go:1490: race detected during execution of test

Member

@mathetake mathetake left a comment


💯! the exciting future of smooth e2e testing experience has begun

@mathetake
Member

why are only the ubuntu extproc jobs failing...

@mathetake
Member

can't repro on my local ubuntu, but it seems not to be a flake on CI...

@mathetake mathetake merged commit 468930e into envoyproxy:main Jul 22, 2025
22 of 23 checks passed
@codefromthecrypt codefromthecrypt deleted the fakeopenai branch July 22, 2025 22:48
@@ -129,6 +129,8 @@ func requireRunEnvoy(t *testing.T, accessLogPath string) {
"-c", envoyYamlPath,
"--log-level", "warn",
"--concurrency", strconv.Itoa(max(runtime.NumCPU(), 2)),
// This allows multiple Envoy instances to run in parallel.
"--base-id", strconv.Itoa(time.Now().Nanosecond()),
Contributor Author


all your base are belong to us

@codefromthecrypt
Contributor Author

Thanks for helping get this in safely, @mathetake. I like the safe buffers, and I didn't know about the base-id thing either. TIL!

mathetake pushed a commit that referenced this pull request Jul 26, 2025
**Description**

This adds openai recordings for all features supported by openinference
spans in preparation of the open telemetry pull request.

**Related Issues/PRs (if applicable)**

supports #791 follow-up to #914 

**Special notes for reviewers (if applicable)**

it was pretty tricky to trigger the audio and image features with concise
requests!

Signed-off-by: Adrian Cole <[email protected]>
Co-authored-by: Adrian Cole <[email protected]>
codefromthecrypt pushed a commit to codefromthecrypt/ai-gateway that referenced this pull request Jul 29, 2025
This PR refactors the extproc startup process to make it more
robust around network listeners.

It also starts the process of consolidating new (VCR) test
infrastructure by migrating custom metrics and base types of
existing tests to it.

Finally, this improves the README of the custom metrics example
as before, it wasn't obvious what it did and its limitations.

The net result is a win on test infrastructure as nothing uses
static ports (conflicts) anymore.

Key changes include:

- **Startup Refinements**: Introduced a unified `listen` helper using `net.ListenConfig` for creating listeners (extproc, metrics, health). This provides better error context (e.g., named failures like "failed to listen for metrics") and simplifies server initialization. Updated metrics and health servers to accept listeners directly, reducing duplication and potential races.
- **Custom Metrics Example**: Expanded the `examples/extproc_custom_metrics` README with detailed explanations of default metrics flow, example behavior, and use cases (e.g., enhanced performance tracking, cost optimization). The example now logs events and returns fixed TTFT/ITL values for easier verification.
- **Testing Improvements**:
  - Moved binary building logic (extproc and testupstream) to shared helpers that build on-demand, supporting custom binaries (e.g., for examples).
  - Refactored tests to use a centralized `TestEnvironment` that captures stdout/stderr for all components (Envoy, extproc, upstream), enabling parallel runs and easier debugging via unified logs.
  - Switched Envoy access logs to stdout for consistency in tests.
  - Added a new test for custom metrics integration, verifying dynamic metadata in access logs.
- **Other Cleanups**: Adjusted timeouts, fixed minor test flakes (e.g., stale sockets), and improved log handling in test servers.

**Related Issues/PRs (if applicable)**

Follow-up to envoyproxy#914 and envoyproxy#944.

Signed-off-by: Adrian Cole <[email protected]>
mathetake added a commit that referenced this pull request Jul 29, 2025