Three layers
Each SDK calls `telequick_rpc_<method>` to produce a wire-ready buffer, then writes
that buffer onto a QUIC stream.
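For the server-side case, a minimal Python sketch of that hand-off might look like the following. The FFI symbol `telequick_rpc_originate`, its signature, and the `stream` object are assumptions for illustration; the library name comes from the "Where things run" section below.

```python
import ctypes

# Load the native core (see "Where things run"); the exact filename/path is
# platform-specific.
core = ctypes.CDLL("telequick_core_ffi.so")

# Hypothetical FFI signature: the core serializes an Originate request into a
# caller-supplied buffer and returns the number of bytes written (< 0 on
# error). The real symbol names and arguments may differ.
core.telequick_rpc_originate.restype = ctypes.c_int
core.telequick_rpc_originate.argtypes = [
    ctypes.c_char_p,                # request parameters, e.g. a destination number
    ctypes.POINTER(ctypes.c_char),  # output buffer for the wire-ready envelope
    ctypes.c_size_t,                # buffer capacity
]

def originate(stream, number: str) -> None:
    buf = ctypes.create_string_buffer(4096)
    n = core.telequick_rpc_originate(number.encode(), buf, len(buf))
    if n < 0:
        raise RuntimeError("core failed to build the Originate envelope")
    # The SDK layer only moves bytes: `stream` stands in for a freshly opened
    # bidirectional QUIC stream (see "Connection lifecycle").
    stream.write(buf.raw[:n])
```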
Why one core
Two reasons.

- Identical envelopes. The gateway only ever sees the same little-endian serde encoding regardless of which SDK produced it. There is no per-language parser to keep in sync, only the C++ core plus the auto-generated method-id table that gets emitted into each language at build time.
- Audio without a copy. µ-law / A-law to 16-bit PCM conversion is done in SIMD inside the core. Browsers and Node alike call into the same WASM build of that core, so a barge-in from a browser tab and a barge-in from a Python process traverse the same code path with the same latency profile.
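As a point of reference for what that conversion computes, here is a plain-Python G.711 µ-law decoder. The core's version is SIMD C++ and also handles A-law, so this is purely illustrative.

```python
BIAS = 0x84  # G.711 mu-law bias (132)

def ulaw_to_pcm16(u: int) -> int:
    """Decode one 8-bit mu-law byte to a signed 16-bit PCM sample."""
    u = ~u & 0xFF
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = ((mantissa << 3) + BIAS) << exponent
    return (BIAS - magnitude) if sign else (magnitude - BIAS)

def decode_ulaw_frame(payload: bytes) -> list[int]:
    """Decode a whole mu-law audio payload (e.g. an AudioFrame body)."""
    return [ulaw_to_pcm16(b) for b in payload]
```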

QUIC framing
Streams are typed by direction:

| Direction | Carries |
|---|---|
| Client → server, bidi | RPC request envelopes (Originate, Terminate, …) |
| Client → server, uni | Outbound AudioFrame (microphone → trunk) |
| Server → client, uni | CallEvent and inbound AudioFrame (trunk → app) |
The length prefix on each envelope is `length = 4 + len(body)`; it covers the method id and the body, but
not itself. See Envelope Format for full detail.
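Concretely, and assuming 32-bit little-endian fields for both the length prefix and the method id (the text above pins only the little-endian encoding and the length formula), an envelope can be framed like this:

```python
import struct

def pack_envelope(method_id: int, body: bytes) -> bytes:
    """[length][method_id][body], with length = 4 + len(body)."""
    return struct.pack("<II", 4 + len(body), method_id) + body

def unpack_envelope(frame: bytes) -> tuple[int, bytes]:
    """Inverse of pack_envelope; returns (method_id, body)."""
    length, method_id = struct.unpack_from("<II", frame)
    return method_id, frame[8:8 + (length - 4)]
```

A demuxer on a uni-stream reads the length prefix first, then the remaining `length` bytes, then switches on the method id.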
Connection lifecycle
- Connect. SDK dials `quic://engine.telequick.dev:9090` with ALPN `h3`.
- Handshake. SDK signs a JWT from the service account (`TELEQUICK_CREDENTIALS`) and embeds it on the first envelope.
- Subscribe. SDK sends `EventStreamRequest` with a per-process `client_id` so the gateway knows which uni-stream to direct events at.
- RPCs. Each call (Originate, Terminate, Barge, …) opens a fresh bidirectional stream, writes one envelope, and closes the write half.
- Streaming. Audio and events arrive on server-initiated uni-streams, demuxed by `method_id`; the whole sequence is sketched below.
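Put end to end, the lifecycle looks roughly like the sketch below. Every callable and connection method here (`quic_connect`, `open_bidi_stream`, `incoming_uni_streams`, `sign_jwt`, `encode_body`, `dispatch`) is a stand-in, and the `EventStreamRequest` method id is a placeholder; only the order of the steps and the envelope layout come from this page.

```python
import os
import struct
import uuid

METHOD_EVENT_STREAM_REQUEST = 1  # placeholder; real ids come from the generated table

async def run_session(quic_connect, sign_jwt, encode_body, dispatch):
    """Walk the five lifecycle steps with stand-in helpers."""
    # 1. Connect.
    conn = await quic_connect("engine.telequick.dev", 9090, alpn=["h3"])

    # 2. Handshake + 3. Subscribe: the signed JWT rides on the first envelope,
    #    an EventStreamRequest carrying this process's client_id.
    token = sign_jwt(os.environ["TELEQUICK_CREDENTIALS"])
    body = encode_body({"client_id": str(uuid.uuid4()), "jwt": token})
    stream = await conn.open_bidi_stream()
    await stream.write(struct.pack("<II", 4 + len(body),
                                   METHOD_EVENT_STREAM_REQUEST) + body)
    await stream.close_write()  # 4. RPCs reuse this one-envelope-per-stream shape.

    # 5. Streaming: server-initiated uni-streams, demuxed by method_id.
    async for uni in conn.incoming_uni_streams():
        frame = await uni.read_all()
        _, method_id = struct.unpack_from("<II", frame)
        dispatch(method_id, frame[8:])
```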
Where things run
- Browsers. TypeScript SDK → WebTransport over HTTP/3 for control RPCs; WebRTC (DTLS-SRTP) for media when a low-latency mic/speaker pipe is needed. The C++ core is compiled to WebAssembly via Emscripten.
- Servers. Native SDKs load `telequick_core_ffi.so` / `.dll` and reach the gateway over plain QUIC. JNI for Java, P/Invoke for .NET, CGO for Go, `libloading` for Rust, `ctypes` for Python.
Transport surfaces
The gateway accepts media on three transports. Pick whichever your runtime supports best:

| Transport | Best for |
|---|---|
| QUIC | Native SDKs (Python, Go, Rust, Java, .NET); lowest overhead. |
| WebTransport | Browsers, when control-plane multiplexing matters more than peer-to-peer media. |
| WebRTC | Browsers and any client that needs ICE/STUN traversal, congestion control, or interop with existing WebRTC stacks (LiveKit, Daily, Chime). |
Direct-media
For server-side AI calls (default_app=AI_BIDIRECTIONAL_STREAM), the
gateway and the agent runtime now use a direct-media path. The
gateway negotiates SIP signalling with the carrier, then publishes the
SDP answer pointing to the agent runtime’s RTP socket. RTP flows
straight from the carrier to the runtime — the gateway is signalling-only
on this path. PR1 of the redesign (the pool primitive + env-gated
dispatch splice) shipped 2026-05-05 and the first end-to-end audible
OpenAI Realtime call landed 2026-05-06.
For SIP/RTP-only calls (no AI bridge), the gateway still terminates RTP
and runs libfvad locally. So the same call_sid can take either RTP path
depending on default_app.
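A rough way to picture that split (purely illustrative: the parameter names and returned shape are made up, and the real dispatch lives inside the gateway):

```python
def plan_media_path(default_app: str, runtime_rtp: str, gateway_rtp: str) -> dict:
    """Decide which RTP endpoint the SDP answer should advertise for a call."""
    if default_app == "AI_BIDIRECTIONAL_STREAM":
        # Direct-media path: the gateway does SIP signalling only and the
        # SDP answer points at the agent runtime's RTP socket.
        return {"rtp_endpoint": runtime_rtp, "gateway_terminates_rtp": False}
    # SIP/RTP-only path: the gateway terminates RTP and runs libfvad locally.
    return {"rtp_endpoint": gateway_rtp, "gateway_terminates_rtp": True}
```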
Code generation
Method IDs and DTO definitions in every language come from one IDL file (api/telequick.json) compiled by apirpc_compiler.py. To add a new RPC,
edit the IDL and rerun the compiler — never edit the per-language method
tables by hand.
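If you just want to see which methods and ids the current IDL defines, a small script like the one below can dump them. The field names (`methods`, `id`, `name`) are guesses at the IDL's layout, so adjust them to whatever api/telequick.json actually contains.

```python
import json

# Dump the method-id surface of the IDL without touching any generated table.
# NOTE: the "methods" / "id" / "name" keys are assumptions about the schema.
with open("api/telequick.json") as f:
    idl = json.load(f)

for method in idl.get("methods", []):
    print(f'{method.get("id")}\t{method.get("name")}')
```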