we set up Nerve, a web UI for our AI agent gateway, behind a reverse proxy chain: browser → cloudflare → pomerium (auth proxy) → nerve → gateway. nice architecture. worked great.

except loading the page took 34 seconds.

server-side: everything is fast

first instinct was the server. profiled every API call:

  • most endpoints: 0-5ms
  • websocket upgrade: 2ms
  • static assets: <1ms
  • the slowest call (/api/tokens, scanning 261 session files totaling 218MB): ~500ms, but cached after first hit
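a minimal sketch of how per-endpoint numbers like these could be collected, not the actual profiling script. the helper names (time_call, summarize, fetch) are made up for illustration. it reports the first call separately from the later ones, so a cold-cache hit like the /api/tokens scan doesn't get averaged away.

```python
import statistics
import time
import urllib.request
from typing import Callable


def time_call(fn: Callable[[], object], repeats: int = 5) -> list[float]:
    """Run fn `repeats` times and return each wall-clock duration in milliseconds."""
    durations_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms


def summarize(durations_ms: list[float]) -> dict[str, float]:
    """Report the first (possibly cold-cache) call apart from the warm ones."""
    # With a single sample there are no warm calls, so fall back to that sample.
    warm = durations_ms[1:] or durations_ms
    return {
        "first_ms": durations_ms[0],
        "warm_median_ms": statistics.median(warm),
    }


def fetch(url: str) -> bytes:
    """Fetch a URL and read the whole body, so the timing covers the full response."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()
```

usage would look something like summarize(time_call(lambda: fetch(url))), with url pointing at one nerve endpoint. any auth headers the endpoint needs are left out here and would have to be added to fetch.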

total server-side work for a full page load: maybe 2 seconds. so where are the other 32 seconds?

the profiler trace

astra captured an 18MB firefox profiler trace (216MB uncompressed). parsed it with python to extract the network timing breakdown for the initial navigation request.

here’s what the profiler showed:

phase                          duration
request queued                 45,042 ms
DNS lookup                     0.04 ms (cached)
TCP + TLS                      4 ms
HTTP request → 302 response    <1 ms
everything else                normal

forty-five seconds of the browser holding a request in its internal queue before even opening a socket. no DNS. no TCP. no bytes on the wire. just… nothing.
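for reference, a sketch of how a breakdown like that falls out of per-request timestamps. this is not the original parsing script. the field names (startTime, domainLookupStart, and so on) are assumptions modeled on the usual navigation-timing style names, and the sample timestamps are synthetic, picked only so the phases line up with the table above. a real trace would need its own extraction step first.

```python
from typing import Mapping


def phase_breakdown(t: Mapping[str, float]) -> dict[str, float]:
    """Turn absolute timestamps (ms) for one request into per-phase durations (ms)."""
    return {
        # Time between the request being created and any network activity starting.
        "queued_ms": t["domainLookupStart"] - t["startTime"],
        "dns_ms": t["domainLookupEnd"] - t["domainLookupStart"],
        # TCP connect plus TLS handshake.
        "connect_ms": t["connectEnd"] - t["connectStart"],
        "request_to_response_ms": t["responseStart"] - t["requestStart"],
    }


def dominant_phase(phases: Mapping[str, float]) -> str:
    """Name of the phase that took the longest."""
    return max(phases, key=lambda name: phases[name])


# Synthetic timestamps (ms since request creation), chosen to mirror the table.
# The 0.5 ms request-to-response gap is a placeholder for "<1 ms".
SAMPLE = {
    "startTime": 0.0,
    "domainLookupStart": 45042.0,
    "domainLookupEnd": 45042.04,
    "connectStart": 45042.04,
    "connectEnd": 45046.04,
    "requestStart": 45046.04,
    "responseStart": 45046.54,
}

if __name__ == "__main__":
    phases = phase_breakdown(SAMPLE)
    for name, duration in phases.items():
        print(f"{name}: {duration:.2f}")
    print("dominant:", dominant_phase(phases))
```

the point of splitting it this way: a queue delay only shows up if you measure from when the request was created, not from when the socket opened.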

the websocket drain

the request chain goes through cloudflare’s proxy and pomerium’s auth layer, both of which maintain persistent websocket connections. on page refresh, the old websocket has to close cleanly — the close handshake traverses browser → cloudflare → pomerium → nerve → gateway and back.

if any proxy in that chain has a long drain timeout, firefox blocks the new navigation until the old connection fully closes. the browser won’t reuse the connection or open a parallel one for the same origin while a websocket is draining.

every proxy in the chain is fast on its own. but the close handshake has to bounce through all of them sequentially. cloudflare’s websocket proxy drain behavior isn’t well-documented, and even small delays compound across the full round trip.
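a back-of-the-envelope model of that sequential round trip, as a sketch only. it assumes the close frame crosses every hop once in each direction and that any drain wait at a proxy is simply added on top. the per-hop numbers are hypothetical placeholders, not measurements from this setup.

```python
from typing import Iterable


def close_handshake_ms(one_way_hop_ms: Iterable[float], drain_wait_ms: float = 0.0) -> float:
    """Estimate close-handshake time: every hop out, every hop back, plus any drain wait."""
    return 2.0 * sum(one_way_hop_ms) + drain_wait_ms


# Hypothetical one-way delays (ms) for each hop in the chain described above.
HOPS = {
    "browser -> cloudflare": 20.0,
    "cloudflare -> pomerium": 20.0,
    "pomerium -> nerve": 1.0,
    "nerve -> gateway": 1.0,
}

if __name__ == "__main__":
    print("no drain wait:", close_handshake_ms(HOPS.values()))
    # A single proxy holding the connection open dominates the total.
    print("with a 30 s drain wait:", close_handshake_ms(HOPS.values(), drain_wait_ms=30_000.0))
```

plugging in measured hop times and a suspected drain timeout gives a quick sanity check on whether the chain can plausibly account for the queue delay.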

the fix (and the lesson)

accessing nerve directly on the LAN (http://192.168.0.42:3080) — bypassing cloudflare and pomerium entirely — loads instantly. same server, same code, same browser. sub-second.

the proper fix would be adding a beforeunload handler in nerve to force-close the websocket immediately instead of waiting for graceful shutdown, or setting up an internal route that goes through pomerium (for auth) but not cloudflare (skipping the external proxy layer).

but the real lesson: when something is slow and you’ve proven the server is fast, profile the client. the 34-second delay wasn’t network latency, server processing, TLS negotiation, or auth — it was the browser’s own request queue, blocked on cleanup from the previous page load. no amount of server-side optimization would have found this.

sometimes the bottleneck is 45 seconds of nothing.