We should be able to capture WebRTC stats in some form for post-processing
so that it helps on debugging support requests (and other use cases, e.g.:
improving field trial analysis on test servers).
Although much of WebRTC stats information can be gathered via server side
components, none have logs as structured for proper post-processing as
the client logs - so we're going the client route for now.
Capture WebRTC stats information for audio and screen sharing via:
- Audio logCodes: new `stats` extraInfo field
- `audio_joined`
- `audio_failure`
- `sfuaudio_error_retry_through_relay`
- `sfuaudio_error_try_to_reconnect`
- Screen share logCodes: new `stats` extraInfo field
- screenshare_presenter_start_success
- screenshare_viewer_start_success
- screenshare_broker_failure
Additionally, add an option to periodically capture WebRTC stats information
for all relevant peers. This is disabled by default since the log can be
verbose (and, consequentially, network taxing when using external
logging targets). It can be enabled via `public.stats.logMediaStats` in
settings.yml. The default interval is 30s. The periodic log format is as
follows:
- logCode: `mediaStats`
- extraInfo.stats: an aggregated stats object of all peers (equivalent
to the `Copy` function in the Connection Status modal).
There are some situations where previously set deviceIds (
local/session storage) may become stale. This causes an unexpected
behavior where audio is temporarily borked until the user clears their
local storage.
This issue has been seen more recently on Safari endpoints when switching
back-and-forth breakout rooms in environments running under iframes.
Also seen randomly on endpoints with virtual input devices.
This centralizes audio gUM calling into a single method that retries the
gUM procedure without pre-set deviceIds only if the initial call fails
due with an OverconstrainedError - hopefully circumventing the issue.
There's no rollback procedure in case a device switch fails right now,
nor does the code entrypoints that call the switching procedures wait
for resolution or failure before marking the new device as chosen. That
may cause inconsistent states in a couple of ways:
- No rollback: switch fails, audio is still on but no actual
microphone input is being transmitted
- Not waiting for resolutions: inconsistent chosen devices on failures
Device switching errors are also not surfaced to the end user
This commit:
- Adds device rollback and proper resolution/failure response
awaits to try and make the state a bit more consistent.
- Centralizes the input device switching code to be reused between
different bridges
- Centralizes device ID state management in audio-manager to try and
mantain them a bit more consistent across the board
- Surface device switching failures to the end user
- Guarantee device IDs are set to the session storage on all
appropriate scenarios