WebRTC / WHEP Protocol
shiloh-web-relay exposes three audio streams as WebRTC sessions using the
WebRTC-HTTP Egress Protocol (WHEP).
WHEP is a minimal HTTP signaling shim over standard WebRTC: a single POST
exchange completes negotiation, after which audio flows over SRTP/RTP with no
further HTTP traffic.
Architecture overview
shiloh-mixer (:5005 UDP)
|
| AUDIO packets (S16LE, 128 frames stereo, 375 pps deployed; 160 frames code default)
| 3 sessions: web-relay-main, web-relay-monitor, web-relay-cue
v
relay_sub (blocking UDP thread per feed)
|
| Vec<f32> stereo interleaved, 48 kHz
v
encode worker (tokio task per feed)
|
| Opus frames (960 samples = 20 ms, 128 kbps VBR, stereo)
v
TrackLocalStaticSample (shared Arc, one per feed)
|
| webrtc-rs RTP packetization + fan-out
v
PeerConnection (one per browser tab / WHEP session)
|
| SRTP/RTP over ICE/UDP
v
Browser
One Opus encoder runs per feed. All PeerConnections on the same feed share the
same TrackLocalStaticSample — encoding happens once regardless of
listener count.
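The packet-to-frame step in the diagram can be sketched as follows. This is a minimal illustration in JavaScript, not the relay's Rust code: `s16leToF32` and `FrameAccumulator` are hypothetical names, and only the constants (48 kHz, stereo, 960-frame Opus blocks, 128-frame packets) come from the diagram above.

```javascript
// Constants taken from the architecture diagram.
const CHANNELS = 2;
const OPUS_FRAME = 960; // frames per channel = 20 ms at 48 kHz

// Convert one S16LE payload (Uint8Array) to interleaved floats in [-1, 1).
function s16leToF32(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Float32Array(bytes.byteLength / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = view.getInt16(i * 2, true) / 32768; // true = little-endian
  }
  return out;
}

// Collects interleaved samples and emits exact 960-frame stereo blocks.
// (Hypothetical helper; the real encode worker may buffer differently.)
class FrameAccumulator {
  constructor(frameSize = OPUS_FRAME, channels = CHANNELS) {
    this.blockLen = frameSize * channels; // 1920 interleaved samples
    this.pending = [];
  }

  // Append one packet's samples; return every complete block now available.
  push(samples) {
    for (const s of samples) this.pending.push(s);
    const blocks = [];
    while (this.pending.length >= this.blockLen) {
      blocks.push(Float32Array.from(this.pending.splice(0, this.blockLen)));
    }
    return blocks;
  }
}
```

With 128-frame packets, 960 / 128 = 7.5, so the eighth packet completes the first block and leaves 64 frames buffered for the next one.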
WHEP endpoint
POST http://<host>:8890/whep/<feed>
Content-Type: application/sdp
<SDP offer body>
- <host>: IP or hostname of the shiloh-web-relay instance
- Default HTTP port: 8890 (configurable with --http)
- <feed>: one of main, monitor, cue
Request:
- Method: POST
- Content-Type: application/sdp
- Body: RFC 8866 SDP offer generated by the browser’s RTCPeerConnection.createOffer()
Response (success):
- Status: 201 Created
- Content-Type: application/sdp
- Body: RFC 8866 SDP answer with all ICE candidates included (no trickle)
Response (error):
| Status | Body | Cause |
|---|---|---|
| 404 Not Found | unknown feed | <feed> is not main, monitor, or cue |
| 503 Service Unavailable | feed not active | Feed track not initialized (startup race) |
| 500 Internal Server Error | negotiation failed | ICE gathering or DTLS setup error |
CORS headers are set to Access-Control-Allow-Origin: *,
Access-Control-Allow-Methods: *, and Access-Control-Allow-Headers: *, so any
browser origin can make the WHEP request.
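A client can turn the endpoint layout and status codes above into two small helpers. The sketch below is illustrative only: `whepUrl` and `describeWhepStatus` are hypothetical names, and treating 503 as retryable is an assumption based on the "startup race" cause, not documented relay behaviour.

```javascript
const FEEDS = ['main', 'monitor', 'cue'];

// Build the WHEP URL for a feed; 8890 is the documented default port.
// Note: an IPv6 literal host would need square brackets added by the caller.
function whepUrl(host, feed, port = 8890) {
  if (!FEEDS.includes(feed)) throw new RangeError(`unknown feed: ${feed}`);
  return `http://${host}:${port}/whep/${feed}`;
}

// Map a response status to the cause listed in the error table.
function describeWhepStatus(status) {
  switch (status) {
    case 201: return { ok: true, retry: false, reason: 'answer returned' };
    case 404: return { ok: false, retry: false, reason: 'unknown feed' };
    // Assumption: a startup race is transient, so a later retry may succeed.
    case 503: return { ok: false, retry: true, reason: 'feed not active' };
    case 500: return { ok: false, retry: false, reason: 'negotiation failed' };
    default: return { ok: false, retry: false, reason: `unexpected status ${status}` };
  }
}
```

For example, `whepUrl('mixer-host', 'cue')` yields `http://mixer-host:8890/whep/cue`.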
Codec parameters
| Parameter | Value |
|---|---|
| Codec | Opus |
| MIME type | audio/opus |
| RTP payload type | 111 |
| Clock rate | 48000 Hz |
| Channels | 2 (stereo) |
| Frame size | 960 samples (20 ms) |
| Bitrate | 128 kbps VBR |
| In-band FEC | enabled (useinbandfec=1) |
| Stereo extension | enabled (stereo=1;sprop-stereo=1) |
| Minimum packet time | 10 ms (minptime=10) |
SDP fmtp line negotiated:
a=fmtp:111 minptime=10;useinbandfec=1;stereo=1;sprop-stereo=1
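When debugging a session it can help to confirm that the answer really carries these parameters. The following is a small sketch of an fmtp parser; `parseFmtp` is a hypothetical helper, not part of shiloh-web-relay or any browser API.

```javascript
// Parse an "a=fmtp:<pt> k=v;k=v" line into a payload type and a parameter map.
// Returns null when the line is not an fmtp attribute.
function parseFmtp(line) {
  const m = /^a=fmtp:(\d+) (.*)$/.exec(line.trim());
  if (!m) return null;
  const params = {};
  for (const pair of m[2].split(';')) {
    const [key, value = ''] = pair.trim().split('=');
    if (key) params[key] = value; // values are kept as strings
  }
  return { payloadType: Number(m[1]), params };
}
```

Applied to the line above, the result has `payloadType` 111 and `params` containing `minptime`, `useinbandfec`, `stereo`, and `sprop-stereo`.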
ICE configuration
STUN
The server uses stun:stun.l.google.com:19302 for its own ICE candidate
gathering. The browser’s ICE agent uses whatever STUN/TURN servers are
configured in its RTCConfiguration (typically none for LAN use).
Candidate types
By default shiloh-web-relay gathers all local host candidates and lets
ICE negotiate the best path. For deployments with private interfaces (Hetzner
link-local, Docker bridges, WireGuard tap) that must not appear in SDP
answers, the --announce-ip flag suppresses all other host addresses:
shiloh-web-relay --announce-ip 192.0.2.10
This uses a 1:1 NAT mapping (RTCIceCandidateType::Host override) so the
specified IP appears as the sole Host candidate.
Multiple announce IPs are allowed (comma-separated):
shiloh-web-relay --announce-ip 192.0.2.10,2001:db8::1
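One way to check that --announce-ip has the intended effect is to inspect the host candidates in an SDP answer. The sketch below assumes the standard `a=candidate` attribute layout; `parseCandidates` and `unexpectedHostAddresses` are hypothetical helper names.

```javascript
// Extract ICE candidates from an SDP blob.
// Field order: foundation component transport priority address port "typ" type ...
function parseCandidates(sdp) {
  const out = [];
  for (const line of sdp.split(/\r?\n/)) {
    if (!line.startsWith('a=candidate:')) continue;
    const f = line.slice('a=candidate:'.length).split(' ');
    if (f.length < 8 || f[6] !== 'typ') continue; // skip malformed lines
    out.push({
      transport: f[2].toLowerCase(),
      address: f[4],
      port: Number(f[5]),
      type: f[7],
    });
  }
  return out;
}

// List host-candidate addresses that are not in the announced set.
function unexpectedHostAddresses(sdp, announced) {
  return parseCandidates(sdp)
    .filter((c) => c.type === 'host' && !announced.includes(c.address))
    .map((c) => c.address);
}
```

An empty result from `unexpectedHostAddresses(answerSdp, ['192.0.2.10'])` would indicate that no other host address leaked into the answer.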
Network type restriction
By default all four ICE network types are enabled: udp4, udp6, tcp4,
tcp6. To restrict to UDP over IPv4 only (e.g. to avoid IPv6 or TCP fallback):
shiloh-web-relay --ice-network-types udp4
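To verify a restriction like this from the client side, each candidate line can be classified into one of the four network types. This is a rough sketch: `networkTypeOf` is a hypothetical helper, and it assumes literal IP addresses in the candidate (an mDNS hostname would be misreported as IPv4).

```javascript
// Classify a candidate line as 'udp4', 'udp6', 'tcp4', or 'tcp6'.
// Accepts lines with or without the leading "a=".
function networkTypeOf(candidateLine) {
  const f = candidateLine
    .trim()
    .replace(/^a=/, '')
    .replace(/^candidate:/, '')
    .split(' ');
  if (f.length < 6) return null;
  const transport = f[2].toLowerCase();
  if (transport !== 'udp' && transport !== 'tcp') return null;
  // Assumption: the address is a literal; IPv6 literals contain ':'.
  const family = f[4].includes(':') ? '6' : '4';
  return transport + family;
}
```

With `--ice-network-types udp4` in effect, every candidate in the answer would be expected to classify as `udp4`.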
No trickle ICE
The server waits up to 3 seconds for ICE gathering to complete before
returning the 201 response. The SDP answer returned to the browser contains
all candidates — no a=ice-options:trickle and no subsequent candidate
HTTP requests. This keeps browser-side JavaScript simple: no onicecandidate
handler or addIceCandidate calls are needed.
Negotiation flow
Browser shiloh-web-relay (:8890)
| |
| RTCPeerConnection.createOffer |
| |
| POST /whep/main |
| Content-Type: application/sdp |
| Body: <SDP offer> |
|------------------------------->|
| | create RTCPeerConnection
| | add TrackLocalStaticSample
| | setRemoteDescription(offer)
| | createAnswer()
| | setLocalDescription(answer)
| | [wait ICE gathering ≤ 3s]
| 201 Created |
| Content-Type: application/sdp |
| Body: <SDP answer + candidates>
|<-------------------------------|
| |
| RTCPeerConnection.setRemoteDescription(answer)
| |
| [ICE connectivity checks] |
|<=============================>| (UDP SRTP/RTP)
| |
| [Opus audio flowing] |
|<==============================|
After the HTTP exchange completes, all signaling is done. The PeerConnection
is held in memory by shiloh-web-relay until the browser disconnects or the
connection fails.
Lifecycle and fan-out
Each POST /whep/<feed> creates one RTCPeerConnection. The connection
attaches to the shared TrackLocalStaticSample for that feed. When the
connection state transitions to Failed, Closed, or Disconnected, the
PeerConnection is closed and its resources released. No explicit DELETE
endpoint is needed.
Fan-out is handled at the TrackLocalStaticSample level inside webrtc-rs:
each encoded Opus frame is written to the shared track once, and the library
delivers it to every attached RTCRtpSender. One Opus encoder serves N
concurrent listeners at zero additional encoding cost.
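The fan-out and cleanup behaviour described above can be modelled in a few lines. This is a toy model in plain JavaScript, not the webrtc-rs API: `SharedTrack`, `attach`, `detach`, and `onConnectionState` are hypothetical names chosen for illustration.

```javascript
// One shared track per feed; each listener registers a delivery callback.
class SharedTrack {
  constructor() {
    this.senders = new Map(); // session id -> deliver(frame)
  }
  attach(id, deliver) {
    this.senders.set(id, deliver);
  }
  detach(id) {
    this.senders.delete(id);
  }
  // Write one already-encoded frame; it reaches every attached listener.
  // Returns the number of listeners served.
  write(frame) {
    for (const deliver of this.senders.values()) deliver(frame);
    return this.senders.size;
  }
}

// States after which a session is dropped, per the lifecycle text above.
const TERMINAL_STATES = new Set(['failed', 'closed', 'disconnected']);

// Release a session when its connection reaches a terminal state.
function onConnectionState(track, id, state) {
  if (TERMINAL_STATES.has(state)) track.detach(id);
}
```

The encoder calls `write` once per frame regardless of how many sessions are attached, which is the property the section describes: encoding cost stays constant while delivery scales with listener count.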
Browser compatibility
WHEP needs nothing from the browser beyond fetch() and RTCPeerConnection with
Opus, which all modern browsers provide. Tested configurations:
| Browser | Notes |
|---|---|
| Chrome / Chromium | Full support. Opus stereo, in-band FEC |
| Firefox | Full support |
| Safari (macOS 14+) / iOS 17+ | Full support. Safari requires a user gesture before calling play() on an HTMLMediaElement if autoplay is not granted |
Minimal browser snippet:
const pc = new RTCPeerConnection();
pc.addTransceiver('audio', { direction: 'recvonly' });
// Register the handler before negotiating: the track event fires while
// setRemoteDescription() is still processing the answer.
pc.ontrack = (e) => {
  // Wire to an audio element:
  const audio = document.createElement('audio');
  audio.srcObject = e.streams[0];
  audio.autoplay = true;
  document.body.appendChild(audio);
};
const offer = await pc.createOffer();
await pc.setLocalDescription(offer);
// A single POST carries the offer; the response body is the complete answer.
const resp = await fetch('http://mixer-host:8890/whep/main', {
  method: 'POST',
  headers: { 'Content-Type': 'application/sdp' },
  body: offer.sdp,
});
if (!resp.ok) throw new Error(`WHEP error: ${resp.status}`);
const answer = await resp.text();
await pc.setRemoteDescription({ type: 'answer', sdp: answer });
No trickle ICE handling is needed because the server waits for gathering to
complete before returning the answer.
Latency
End-to-end latency from the mix bus to the browser is approximately:
| Stage | Contribution |
|---|---|
| Relay UDP receive (one 128-frame packet, 2.67 ms deployed; 160-frame code default = 3.33 ms) | ~2.67 ms |
| Opus frame accumulation (960 samples = 20 ms) | ~20 ms |
| ICE/SRTP network transit (LAN) | < 1 ms |
| Browser audio output buffer | 20–40 ms |
| Total (LAN) | ~45–65 ms |
Over WAN (with STUN traversal), add roughly RTT/2 of network transit plus the
jitter buffering in the browser’s WebRTC stack; ~100–200 ms total is common.
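The LAN figures in the table follow from simple arithmetic on the frame sizes. The sketch below reproduces it; `framesToMs`, `packetsPerSecond`, and `lanBudgetMs` are hypothetical helper names, and the network and browser-buffer ranges are copied from the table rather than measured.

```javascript
// Duration of a block of frames at the given sample rate, in milliseconds.
function framesToMs(frames, sampleRate = 48000) {
  return (frames / sampleRate) * 1000;
}

// Packet rate for a given packet size (frames per packet).
function packetsPerSecond(packetFrames, sampleRate = 48000) {
  return sampleRate / packetFrames;
}

// Sum the table's stages for a given relay packet size.
function lanBudgetMs(packetFrames = 128) {
  const relay = framesToMs(packetFrames); // one packet of receive delay
  const opus = framesToMs(960); // one Opus frame of accumulation
  const network = { min: 0, max: 1 }; // "< 1 ms" on LAN
  const browser = { min: 20, max: 40 }; // output buffer range from the table
  return {
    min: relay + opus + network.min + browser.min,
    max: relay + opus + network.max + browser.max,
  };
}
```

For the deployed 128-frame packets this gives roughly 43 to 64 ms, in line with the table's rounded ~45–65 ms; the 160-frame code default adds about 0.67 ms to both ends.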