We currently use full renegotiation for audio, video, and screen sharing
reconnections, which involves re-creating transports and signaling channels
from scratch. While effective in some scenarios, this approach is slow and,
especially with outbound cameras and screen sharing, prone to failures.
To counter that, WebRTC provides a mechanism to restart ICE without needing
to re-create the peer connection. This allows us to avoid full renegotiation
and bypass some server-side signaling limitations. Implementing ICE restart
should make outbound camera/screen sharing reconnections more reliable and
faster.
This commit implements the ICE restart procedure for all WebRTC components'
*outbound* peers. It is based on bbb-webrtc-sfu >= v2.15.0-beta.0, which
added support for ICE restart requests. This feature is *off by default*.
To enable it, adjust the following flags:
- `/etc/bigbluebutton/bbb-webrtc-sfu/production.yml`: `allowIceRestart: true`
- `/etc/bigbluebutton/bbb-html5.yml`: `public.kurento.restartIce`
* Refer to the inline documentation; this can be enabled on the client side
per media type.
* Note: The default max retries for audio is lower than for cameras/screen
sharing (1 vs 3). This is because the full renegotiation process for audio
is more reliable, so ICE restart is attempted first, followed by full
renegotiation if necessary. This approach is less suitable for cameras/
screen sharing, where longer retry periods for ICE restart make sense
since full renegotation there is... iffy.
Endpoints that are inbound/`recvonly` only (client's perspective) do *not*
support ICE restart yet. There are two main reasons:
- Server-side changes are required to support `recvonly` endpoints,
particularly the proper handling of the server’s `setup` role in the
its SDPs during an ICE restart. These changes are too broad for now,
so they are deferred to future releases (SFU@v2.16).
- Full reconnections for `recvonly` endpoints are currently reliable,
unlike for `send*` endpoints. ICE restarts could still provide benefits
for `recvonly` endpoints, but we need the server updates first.
This is a rework of the audio join procedure whithout the explict listen
only separation in mind. It's supposed to be used in conjunction with
the transparent listen only feature so that the distinction between
modes is seamless with minimal server-side impact. An abridged list of
changes:
- Let the user pick no input device when joining microphone while
allowing them to set an input device on the fly later on
- Give the user the option to join audio with no input device whenever
we fail to obtain input devices, with the option to try re-enabling
them on the fly later on
- Add the option to open the audio settings modal (echo test et al)
via the in-call device selection chevron
- Rework the SFU audio bridge and its services to support
adding/removing tracks on the fly without renegotiation
- Rework the SFU audio bridge and its services to support a new peer
role called "passive-sendrecv". That role is used by dupled peers
that have no active input source on start, but might have one later
on.
- Remove stale PermissionsOverlay component from the audio modal
- Rework how permission errors are detected using the Permissions API
- Rework the local echo test so that it uses a separate media tag
rather than the remote
- Add new, separate dialplans that mute/hold FreeSWITCH channels on
hold based on UA strings. This is orchestrated server-side via
webrtc-sfu and akka-apps. The basic difference here is that channels
now join in their desired state rather than waiting for client side
observers to sync the state up. It also mitigates transparent listen
only performance edge cases on multiple audio channels joining at
the same time.
The old, decoupled listen only mode is still present in code while we
validate this new approach. To test this, transparentListenOnly
must be enabled and listen only mode must be disable on audio join so
that the user skips straight through microphone join.
This is an initial, experimental implementation of the feature proposed in
https://github.com/bigbluebutton/bigbluebutton/issues/14021.
The intention is to phase out the explicit listen only mode with two
overarching goals:
- Reduce UX friction and increase familiarity: the existence of a separate
listen only mode is a source of confusion for the majority of users
Reduce average server-side CPU usage while also making it possible for
having full audio-only meetings.
The proof-of-concept works based on the assumption that a "many
concurrent active talkers" scenario is both rare and not useful. With
that in mind, this including two server-side triggers:
- On microphone inactivity (currently mute action that is sustained for
4 seconds, configurable): FreeSWITCH channels are held (which translates
to much lower CPU usage, virtually 0%). Receiving channels are switched,
server side, to a listening mode (SFU, mediasoup).
* This required an extension to mediasoup two allow re-assigning producers
to already established consumers. No re-negotiation is done.
- On microphone activity (currently unmute action, immediate):
FreeSWITCH channels are unheld, listening mode is deactivated and the
mute state is updated accordingly (in this order).
This is *off by default*. It needs to be enabled in two places:
- `/etc/bigbluebutton/bbb-webrtc-sfu/production.yml` ->
`transparentListenOnly: true`
- End users:
* Server wide: `/etc/bigbluebutton/bbb-html5.yml` ->
`public.media.transparentListenOnly: true`
* Per user: `userdata-bbb_transparent_listen_only=true`
In peer.js offer generation method, the provided constraitns are only
being type checked. It needs a value check when the constraint is a
boolean.
This commit should prevent useless transceivers from being added when
explictly specifying audio/video as `false`
There's an edge case in finnicky networks where ALG-like firewalls
tamper with USE-CANDIDATE STUN packets and, consequently, bork ICE-lite
connectivity establishment. The odd part is that client-side gathering
seems to complete if intermediate STUN bindings work (before the final
USE-CANDIDATE), which may cause the peer not to generate relay
candidates == connectivity fails.
This adds the `public.kurento.gatheringTimeout` option to forcefully extend
the candidate gathering window in peers that act as offerers. The
behavior is as follows: if the flag is set (ms), the peer will wait
either the gathering completed stage or, _at most_,
public.kurento.gatheringTimeout ms before proceeding with calls chained
to setLocalDescription.
This option is disabled by default and intentionally ommited from the
base settings.yml file as to not encourage its use. Don't use it unless
you know what you're doing :).
There are still a bunch of edge cases and issues with reconnection
scenarios for video:
- Signaling socket refuses to reconnect once maxRetries expire
- Race conditions on local stream attachment: local camera wouldn't be
correctly rendered _if_ the attached stream existed _without_ video
tracks yet
- Video tracks leak on local streams when replacing them (virtual bgs)
- Completely ignoring Meteor state when trying to reconnect cameras
- Streams aren't proactively stopped when the signaling socket dies
- Outbound request queues aren't isolated by stream nor are they
flushed when a newer peer with the same ID is created
- Server originated negotiation errors won't trigger a local peer
cleanup - thus leaving dangling peers that take way too long to
reconnect
This commit fixes or improves all of the aforementioned issues, +:
- Remove unused arguments in the peer (client->SFU) 'start' request
- Prevent crashes when trying to render video-list-items without user
data (which might happen on re-connections)
RTCRTPSender exposes DSCP marking via `networkPriority` in the encodings
configuration dictionaries. That should allow us to control
QoS priorities for different media streams, eg audio with higher network
priority than video. The only browser that implements that right
now is Chromium.
To use this, the public.app.media.networkPriorities configuration in
settings.yml. Audio, camera and screenshare priorities can be controlled
separately. For further info on the possible values, see:
- https://www.w3.org/TR/webrtc-priority/
- https://datatracker.ietf.org/doc/html/rfc8837#section-5
kurento-utils is unmaintained. It's served us well, but its age
shows. We need to transition to something else if we want to
have better maintainability and include simulcast, multistream, ...
This introduces a simplified/leaner wrapper kit that's almost
API-compatible with what we use right now - so widespread changes
are minimal). It's easier to maintain/read/transition from. This
can be read as an intermediate step to transitioning to
something definitive (ie mediasoup-client).