Methodology

Measured numbers for the GMAN proxy and gateway, with constraints surfaced first. Reproducible scripts in scripts/; result artifacts in benchmarks/.

Local GMAN proxy overhead

Local GMAN proxy overhead, measured in a deterministic mock-provider environment. This is not a universal SDK tax: the published delta is this proxy's hot-path cost in the recorded environment, on the date in the artifact below. Each delta value is the proxied figure minus the direct figure for the same statistic. Real-world deployments add upstream network RTT to whatever provider the proxy points at; that is not measured here. See the full methodology for what is and is not measured.

environment + config

run_at: 2026-05-16T17:10:50.337Z
node: v24.3.0
os: darwin-arm64
cpu: Apple M4 Pro
sample size: 200 per path
warmup: 20 per path (discarded)
payload: 76 bytes (same both paths)
keep-alive: enabled

path	p50 (ms)	p95 (ms)	p99 (ms)	mean (ms)
direct	0.118	0.183	0.273	0.13
proxied	0.334	0.548	1.086	0.36
delta · proxy tax	0.216	0.365	0.813	0.23
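
The table's delta row can be reproduced from per-path summaries by subtracting statistic from statistic (for example, 0.334 - 0.118 = 0.216 for p50). The sketch below shows one way to get from raw samples to such a row. The names `percentile`, `summarize`, and `proxyTax` are hypothetical, and the nearest-rank percentile method is an assumption; the scripts in scripts/ may compute percentiles differently.

```typescript
// Hypothetical helpers: not taken from scripts/. Nearest-rank percentiles are assumed.
interface LatencySummary {
  p50: number;
  p95: number;
  p99: number;
  mean: number;
}

// Nearest-rank percentile over an ascending-sorted array of samples.
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) throw new Error("no samples");
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Discard the first `warmup` samples, then summarize the remainder.
function summarize(samplesMs: number[], warmup: number): LatencySummary {
  const sorted = samplesMs.slice(warmup).sort((a, b) => a - b);
  const mean = sorted.reduce((sum, x) => sum + x, 0) / sorted.length;
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    mean,
  };
}

// Delta row: proxied minus direct, per statistic.
// This is a difference of percentiles, not a percentile of per-request differences.
function proxyTax(direct: LatencySummary, proxied: LatencySummary): LatencySummary {
  return {
    p50: proxied.p50 - direct.p50,
    p95: proxied.p95 - direct.p95,
    p99: proxied.p99 - direct.p99,
    mean: proxied.mean - direct.mean,
  };
}
```

Because the delta is a difference of percentiles, the p99 delta does not describe any single request; it compares the tails of the two distributions.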

notes

· Measured against a deterministic local mock provider; real-world tax adds upstream network RTT to whichever provider you point the proxy at.
· Telemetry POSTs to the mock gateway are fire-and-forget (proxy.ts:264) and do not block the caller's response. Their completion is not included in measured latency.
· Keep-alive enabled on the shared http.Agent to mirror production connection reuse.
· Same payload size on both paths (see config.payload_bytes).
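
The fire-and-forget note above explains why telemetry does not appear in the measured latency. A minimal sketch of that pattern follows, under stated assumptions: `respondWithoutWaiting`, `report`, and `onError` are hypothetical names, and this is not the code at proxy.ts:264.

```typescript
// Hypothetical sketch of a fire-and-forget side call; not the proxy's actual implementation.
type Report = () => Promise<void>;

function respondWithoutWaiting<T>(
  response: T,
  report: Report,
  onError: (err: unknown) => void = () => {},
): T {
  try {
    // Start the telemetry call but do not await it. The catch handler keeps a
    // failed POST from surfacing as an unhandled promise rejection.
    void report().catch(onError);
  } catch (err) {
    // A synchronous throw from report() must not break the caller's response either.
    onError(err);
  }
  // The caller receives the response while telemetry may still be in flight.
  return response;
}
```

A benchmark that times the call up to the returned response therefore excludes telemetry completion, which matches what the note says is and is not measured.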

Gateway ingest envelope

Measured sustained ingestion on a single Railway replica with free-tier Supabase and a hard 100 req/min/IP gateway ceiling. The envelope is the headline; the numbers are valid only inside it. Rate-limit reality is preserved in the raw data: pauses and 429s are not smoothed. See the full methodology, including what this benchmark is NOT.

deployment envelope

run_at: 2026-05-17T18:40:01.904Z
gateway: getmyagentnow-ui-production.up.railway.app
replicas: 1 Railway replica
db tier: free Supabase
rate limit: 100 req/min/IP
queue layer: none — synchronous insert
region: us-west
bench config: 5000 events · 5 in-flight

metric	value
sustained accepted / min	97
events sent	5000
events accepted	4643
drop %	7.14
429 rate-limit hits	357
5xx server errors	0
duration (min)	47.64
latency p50 (ms, accepted)	86.34
latency p95 (ms, accepted)	248.66
latency p99 (ms, accepted)	387.07
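
The derived rows in the table follow from the raw counts: 5000 sent minus 4643 accepted leaves 357 dropped, all of them 429s, which is 7.14% of sent; 4643 accepted over 47.64 minutes is about 97.46 per minute, consistent with the reported 97. The sketch below performs that cross-check. The names `IngestCounts` and `summarizeIngest` are hypothetical and are not taken from the benchmark scripts.

```typescript
// Hypothetical cross-check of the derived metrics; field names are illustrative.
interface IngestCounts {
  sent: number;
  accepted: number;
  rateLimited: number; // HTTP 429 responses
  serverErrors: number; // HTTP 5xx responses
  durationMin: number;
}

interface IngestSummary {
  dropPct: number; // percentage of sent events not accepted, rounded to 2 decimals
  acceptedPerMin: number; // unrounded sustained accepted rate
  unaccounted: number; // drops explained by neither 429s nor 5xx
}

function summarizeIngest(c: IngestCounts): IngestSummary {
  if (c.sent <= 0 || c.durationMin <= 0) {
    throw new Error("sent and durationMin must be positive");
  }
  const dropped = c.sent - c.accepted;
  return {
    dropPct: Math.round((dropped / c.sent) * 10000) / 100,
    acceptedPerMin: c.accepted / c.durationMin,
    unaccounted: dropped - c.rateLimited - c.serverErrors,
  };
}

// Counts copied from the table above.
const run = summarizeIngest({
  sent: 5000,
  accepted: 4643,
  rateLimited: 357,
  serverErrors: 0,
  durationMin: 47.64,
});
console.log(run);
```

An `unaccounted` value of zero means every dropped event is explained by a rate-limit response, so the sustained rate reflects the 100 req/min/IP ceiling rather than gateway failures.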

what this benchmark is NOT

· not a horizontally scaled deployment (single replica)
· not multi-region (US-West only)
· not durability-tested under node failure
· not representative of enterprise throughput
· not an end-to-end correctness test (synthetic payloads)