Commit Graph

265 Commits

Author SHA1 Message Date
Anton Georgiev
02bd94a400
Merge pull request #21008 from prlanzarin/u27/feat/ice-restart
feat: add experimental support for ICE restart
2024-09-13 10:24:35 -04:00
prlanzarin
3c4e3de286 feat: add WebRTC stats information to client logs
We should be able to capture WebRTC stats in some form for post-processing
so that it helps with debugging support requests (and other use cases, e.g.
improving field trial analysis on test servers).
Although much of WebRTC stats information can be gathered via server side
components, none have logs as structured for proper post-processing as
the client logs - so we're going the client route for now.

Capture WebRTC stats information for audio and screen sharing via:
  - Audio logCodes: new `stats` extraInfo field
    - `audio_joined`
    - `audio_failure`
    - `sfuaudio_error_retry_through_relay`
    - `sfuaudio_error_try_to_reconnect`
  - Screen share logCodes: new `stats` extraInfo field
    - screenshare_presenter_start_success
    - screenshare_viewer_start_success
    - screenshare_broker_failure

Additionally, add an option to periodically capture WebRTC stats information
for all relevant peers. This is disabled by default since the log can be
verbose (and, consequently, network taxing when using external
logging targets). It can be enabled via `public.stats.logMediaStats` in
settings.yml. The default interval is 30s. The periodic log format is as
follows:
  - logCode: `mediaStats`
  - extraInfo.stats: an aggregated stats object of all peers (equivalent
    to the `Copy` function in the Connection Status modal).
2024-08-27 14:00:26 -03:00
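As a rough sketch of the aggregation step described in the WebRTC stats commit above: `RTCPeerConnection.getStats()` resolves to a Map-like `RTCStatsReport`, which can be flattened into a plain object before it is attached to a log entry's `extraInfo.stats`. The helper name `summarizeStats` and the selection of stat types are assumptions made for illustration, not the client's actual code.

```javascript
// Hypothetical helper: flattens a Map-like stats report into a plain,
// JSON-serializable object grouped by stat type.
function summarizeStats(
  report,
  types = ['inbound-rtp', 'outbound-rtp', 'candidate-pair'],
) {
  const summary = {};
  report.forEach((stat) => {
    if (!types.includes(stat.type)) return; // skip stat types we don't log
    if (!summary[stat.type]) summary[stat.type] = [];
    summary[stat.type].push({ ...stat });
  });
  return summary;
}

// Usage with a stand-in for the report returned by getStats()
const fakeReport = new Map([
  ['a', { id: 'a', type: 'inbound-rtp', packetsLost: 2 }],
  ['b', { id: 'b', type: 'codec', mimeType: 'audio/opus' }],
]);
console.log(JSON.stringify(summarizeStats(fakeReport)));
```

Filtering by type keeps the periodic `mediaStats` entries smaller, which matters given the verbosity concern raised in the commit message.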
prlanzarin
d2dde8a9b1 feat: add experimental support for ICE restart
We currently use full renegotiation for audio, video, and screen sharing
reconnections, which involves re-creating transports and signaling channels
from scratch. While effective in some scenarios, this approach is slow and,
especially with outbound cameras and screen sharing, prone to failures.

To counter that, WebRTC provides a mechanism to restart ICE without needing
to re-create the peer connection. This allows us to avoid full renegotiation
and bypass some server-side signaling limitations. Implementing ICE restart
should make outbound camera/screen sharing reconnections more reliable and
faster.

This commit implements the ICE restart procedure for all WebRTC components,
based on bbb-webrtc-sfu >= v2.15.0-beta.0, which added support for ICE restart
requests. This feature is off by default. To enable it, adjust the following
flags:

- `/etc/bigbluebutton/bbb-webrtc-sfu/production.yml`: `allowIceRestart: true`
- `/etc/bigbluebutton/bbb-html5.yml`: `public.kurento.restartIce`
  * Refer to the inline documentation; this can be enabled on the client side
    per media type.
  * Note: The default max retries for audio is lower than for cameras/screen
    sharing (1 vs 3). This is because the full renegotiation process for audio
    is more reliable, so ICE restart is attempted first, followed by full
    renegotiation if necessary. This approach is less suitable for cameras/
    screen sharing, where longer retry periods for ICE restart make sense
    since full renegotiation there is... iffy.
2024-08-23 09:59:51 -03:00
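The retry policy described in the ICE restart commit above (try ICE restart first, fall back to full renegotiation once the retry budget is spent) could be sketched as below. The retry counts (1 for audio, 3 for cameras/screen sharing) come from the commit message; the function name and the media type keys are hypothetical.

```javascript
// Assumed defaults, mirroring the numbers quoted in the commit message.
const MAX_ICE_RESTARTS = { audio: 1, video: 3, screenshare: 3 };

// Hypothetical policy helper: picks the next reconnection step.
// In a browser, 'ice-restart' would map to RTCPeerConnection.restartIce()
// or createOffer({ iceRestart: true }); this sketch only models the decision.
function nextReconnectStep(mediaType, iceRestartAttempts) {
  const max = MAX_ICE_RESTARTS[mediaType] ?? 0;
  return iceRestartAttempts < max ? 'ice-restart' : 'full-renegotiation';
}

console.log(nextReconnectStep('audio', 0));
console.log(nextReconnectStep('audio', 1));
```

Unknown media types fall straight through to full renegotiation, which matches the feature being off by default.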
prlanzarin
b8a1b881c5 fix(audio): clear connection timeout on autoplay failures
If the autoplay block is triggered in listen only, the connection timer
keeps ticking even if the user correctly accepts the audio play prompt.
That causes an audio re-connect once the timeout expires.

Clear the connection timer if the audio bridge starts with
NotAllowedError as a soft error. For connection purposes, the audio join
procedure worked. The autoplay thing is at the UI/UX level, not WebRTC.
2023-08-09 11:09:27 -03:00
prlanzarin
8feb934169 feat(audio): add experimental transparent listen only mode
This is an initial, experimental implementation of the feature proposed in
https://github.com/bigbluebutton/bigbluebutton/issues/14021.

The intention is to phase out the explicit listen only mode with two
overarching goals:
  - Reduce UX friction and increase familiarity: the existence of a separate
    listen only mode is a source of confusion for the majority of users.
  - Reduce average server-side CPU usage while also making it possible to
    have full audio-only meetings.

The proof-of-concept works based on the assumption that a "many
concurrent active talkers" scenario is both rare and not useful. With
that in mind, this includes two server-side triggers:
 - On microphone inactivity (currently mute action that is sustained for
   4 seconds, configurable): FreeSWITCH channels are held (which translates
   to much lower CPU usage, virtually 0%). Receiving channels are switched,
   server side, to a listening mode (SFU, mediasoup).
   * This required an extension to mediasoup to allow re-assigning producers
     to already established consumers. No re-negotiation is done.
 - On microphone activity (currently unmute action, immediate):
   FreeSWITCH channels are unheld, listening mode is deactivated and the
   mute state is updated accordingly (in this order).

This is *off by default*. It needs to be enabled in two places:
  - `/etc/bigbluebutton/bbb-webrtc-sfu/production.yml` ->
    `transparentListenOnly: true`
  - End users:
    * Server wide: `/etc/bigbluebutton/bbb-html5.yml` ->
      `public.media.transparentListenOnly: true`
    * Per user: `userdata-bbb_transparent_listen_only=true`
2023-08-07 19:43:18 -03:00
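The inactivity trigger from the transparent listen only commit above boils down to "hold the channel once mute has been sustained for a configurable window". A minimal sketch of that check follows; the 4 second default comes from the commit message, while the function name and its parameters are hypothetical and say nothing about how the server actually tracks mute state.

```javascript
// Hypothetical check: should a muted user's channel be put on hold?
// holdAfterMs defaults to the 4 seconds quoted in the commit message.
function shouldHoldChannel(muted, mutedSinceMs, nowMs, holdAfterMs = 4000) {
  return muted && nowMs - mutedSinceMs >= holdAfterMs;
}

console.log(shouldHoldChannel(true, 0, 4500));
console.log(shouldHoldChannel(true, 0, 1000));
```

Unmuting is immediate per the commit message, so no equivalent window is modeled for the activity trigger.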
prlanzarin
a8e4e876d0 fix(audio): add connection timers for SFU audio
SFU based audio is missing connection timers, which means the join
procedure can go on indefinitely in a couple of scenarios.

Refactor the connection timers added for re-connections in the SFU audio
bridge and make them valid for the first try as well.

Make 1010 errors (connection timeout) retriable when retryThroughRelay
is enabled.
2023-07-31 11:39:52 -03:00
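A connection timer of the kind described in the commit above can be sketched as a race between the join procedure and a timeout. The 1010 code is the connection timeout code named in the commit message; the helper name and the `errorCode` property are assumptions for illustration.

```javascript
// Hypothetical helper: rejects with a 1010-style error if `task`
// does not settle within timeoutMs, and always clears the timer.
function withConnectionTimeout(task, timeoutMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      const error = new Error('connection timeout');
      error.errorCode = 1010;
      reject(error);
    }, timeoutMs);
  });
  return Promise.race([task, timeout]).finally(() => clearTimeout(timer));
}

// Usage: a join attempt that never settles is cut off after 50 ms
withConnectionTimeout(new Promise(() => {}), 50)
  .catch((error) => console.log(error.errorCode));
```

Because the same wrapper applies to any promise, it can cover both the first join attempt and later re-connections.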
prlanzarin
7c3ac51e38 feat(audio): add retryThroughRelay flag for 1007 errors
1007 errors are still a large fraction of our overall audio join error
rate. This usually indicates some sort of firewall block or UDP issues
in carrier networks. I can't figure out why some scenarios won't trickle
down to relay candidates though - I'm leaning to scenarios where STUN
packets with USE-CANDIDATE are being mangled/lost along the way or
something else that borks the (already fragile) conn checks for ICE-lite
implementations.

Add a new feature called retryThroughRelay which triggers a retry with
iceTransportPolicy=relay whenever audio fails to join with a 1007 error.
The goal is to force relay usage to try and bypass 1007 scenarios that
still happen.

Disabled by default.
2023-07-31 11:39:45 -03:00
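The retryThroughRelay behavior from the commit above can be illustrated with a small helper that derives the configuration for the retry. `iceTransportPolicy` is a standard `RTCConfiguration` field; the helper name and its return convention are hypothetical.

```javascript
// Hypothetical helper: returns the RTCConfiguration for a relay-only retry,
// or null when no relay retry applies.
function configForRelayRetry(baseConfig, errorCode, retryThroughRelay) {
  if (retryThroughRelay && errorCode === 1007) {
    // Force TURN relay candidates only for the next attempt
    return { ...baseConfig, iceTransportPolicy: 'relay' };
  }
  return null;
}

const retryConfig = configForRelayRetry({ iceServers: [] }, 1007, true);
console.log(retryConfig.iceTransportPolicy);
```

Returning a copy keeps the original configuration untouched, so later joins are not silently pinned to relay.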
prlanzarin
54b6578b03 fix(audio): forcefully disable stereo when using Vosk transcription
The current Vosk CC provider does not support stereo mic streams
(pending investigation as to why).

This commit makes sure stereo is forcefully disabled via SDP munging
only when transcription is active and using Vosk. Having it disabled
in the server side (FreeSWITCH) is not enough because the stereo parameter
is client mandated and replicated by FS on its answer. So we need to
make sure it's always disabled for the time being.
SFU audio does munging server side (and stereo is always off), so no changes
needed there.

The rest of the providers (except WebSpeech) need to be validated against
stereo audio as well.
This is also intended to be temporary - ideally this needs to be fixed in
mod_audio_fork/Vosk/wherever this is breaking.
2023-04-25 10:10:39 -03:00
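The SDP munging mentioned in the commit above might look roughly like the sketch below, which flips the Opus `stereo` and `sprop-stereo` format parameters to 0 in every `a=fmtp` line. The function name is hypothetical, and the actual client code may munge the SDP differently.

```javascript
// Hypothetical helper: forces stereo off in all fmtp lines of an SDP blob.
function disableStereo(sdp) {
  return sdp
    .split('\r\n')
    .map((line) => {
      if (!line.startsWith('a=fmtp:')) return line;
      // The pattern also matches inside "sprop-stereo=1", so both the
      // stereo and sprop-stereo parameters end up set to 0.
      return line.replace(/stereo=1/g, 'stereo=0');
    })
    .join('\r\n');
}

const offer = [
  'a=rtpmap:111 opus/48000/2',
  'a=fmtp:111 minptime=10;useinbandfec=1;stereo=1;sprop-stereo=1',
  '',
].join('\r\n');
console.log(disableStereo(offer));
```

Per the commit message, this would only be applied while transcription is active and using Vosk.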
prlanzarin
b17ba35238 fix(audio): decouple remote media setup (play) from state callback
Audio state callback and remote media setup both depend on FS's state
(comes through Meteor) and the ICE state (local, peer connection). The
caveat: FS's state can come delayed on reconnection scenarios because
Meteor's websocket generally takes significantly longer to re-connect than
the peer connection, which means the ICE state gets completed way before FS
is flagged as ready.
The practical issue: while outbound audio (client -> FS) will work, inbound
audio (FS -> client) won't _just because it wasn't played_ (even though
data is coming through).

This commit decouples the remote media setup step from the state
through:
  - Setup remote media when ICE state is completed
  - Run the state callback only after FS is flagged as ready. This
    should keep the UI states consistent across client-server.
    Keep in mind the assumption that if FS is ready, ICE is completed by
    consequence.
2023-04-11 16:02:20 -03:00
prlanzarin
be6a23a003 feat: add option to force/extend gathering window in SFU components
There's an edge case in finicky networks where ALG-like firewalls
tamper with USE-CANDIDATE STUN packets and, consequently, bork ICE-lite
connectivity establishment. The odd part is that client-side gathering
seems to complete if intermediate STUN bindings work (before the final
USE-CANDIDATE), which may cause the peer not to generate relay
candidates == connectivity fails.

This adds the `public.kurento.gatheringTimeout` option to forcefully extend
the candidate gathering window in peers that act as offerers. The
behavior is as follows: if the flag is set (ms), the peer will wait
for either the gathering completed stage or, _at most_,
public.kurento.gatheringTimeout ms before proceeding with calls chained
to setLocalDescription.

This option is disabled by default and intentionally omitted from the
base settings.yml file so as not to encourage its use. Don't use it unless
you know what you're doing :).
2023-04-05 13:22:38 -03:00
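The gathering window described in the commit above (wait for gathering to complete or, at most, a fixed number of milliseconds) can be sketched as follows. `iceGatheringState` and the `icegatheringstatechange` event are standard `RTCPeerConnection` members; the helper name and its resolved values are hypothetical.

```javascript
// Hypothetical helper: resolves when ICE gathering completes or after
// timeoutMs, whichever comes first. `pc` is any object exposing
// iceGatheringState plus add/removeEventListener, such as an RTCPeerConnection.
function waitForGathering(pc, timeoutMs) {
  return new Promise((resolve) => {
    if (pc.iceGatheringState === 'complete') {
      resolve('complete');
      return;
    }
    let timer;
    const onChange = () => {
      if (pc.iceGatheringState !== 'complete') return;
      clearTimeout(timer);
      pc.removeEventListener('icegatheringstatechange', onChange);
      resolve('complete');
    };
    timer = setTimeout(() => {
      pc.removeEventListener('icegatheringstatechange', onChange);
      resolve('timeout');
    }, timeoutMs);
    pc.addEventListener('icegatheringstatechange', onChange);
  });
}

// Usage with a stand-in peer that never finishes gathering
const stuckPeer = {
  iceGatheringState: 'gathering',
  addEventListener() {},
  removeEventListener() {},
};
waitForGathering(stuckPeer, 20).then((result) => console.log(result));
```

An offerer would await this right after `setLocalDescription` and only then send the local description, so that any relay candidates gathered within the window are included.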
Anton Georgiev
2d742b654c
Merge pull request #15919 from prlanzarin/u26/fix/cam-reconn-issues
fix(webcam): intermittent client crashes when sharing camera (2.6)
2022-12-23 09:12:54 -05:00
Joao Victor
f1007fb7b6 Use the new config option from #15413 - A centralized way of defining which storage to use (Session or Local) 2022-11-03 17:57:54 -03:00
prlanzarin
d839b457d9 fix(audio): check for session availability on exitAudio
Mostly benign, but exitAudio/forceExitAudio was throwing an unhandled
error when called on sessions with no active audio because the
underlying bridge methods did not check whether there was an active
session to stop beforehand.
2022-10-27 16:30:11 +00:00
Joao Victor
6781602420 improvement: store audio setup 2022-10-03 11:03:14 -03:00
prlanzarin
0f24e5634d fix(audio): bypass overconstrained errors in SFU-based audio 2022-09-15 20:42:43 +00:00
prlanzarin
b3eebbb926 fix(audio): retry gUM without pre-set deviceIds on OverconstrainedError(s)
There are some situations where previously set deviceIds (
local/session storage) may become stale. This causes an unexpected
behavior where audio is temporarily borked until the user clears their
local storage.
This issue has been seen more recently on Safari endpoints when switching
back-and-forth breakout rooms in environments running under iframes.
Also seen randomly on endpoints with virtual input devices.

This centralizes audio gUM calling into a single method that retries the
gUM procedure without pre-set deviceIds only if the initial call fails
with an OverconstrainedError - hopefully circumventing the issue.
2022-09-15 19:25:30 +00:00
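The retry described in the commit above might be sketched like this. To keep the sketch runnable outside a browser, `gum` is a stand-in parameter for `navigator.mediaDevices.getUserMedia`; the wrapper name is hypothetical and the real centralized method may differ.

```javascript
// Hypothetical wrapper: retries gUM once without the pre-set deviceId
// when the first call fails with an OverconstrainedError.
async function getAudioStream(gum, constraints) {
  try {
    return await gum(constraints);
  } catch (error) {
    const { audio } = constraints;
    const hasDeviceId = audio && typeof audio === 'object' && 'deviceId' in audio;
    if (error.name !== 'OverconstrainedError' || !hasDeviceId) throw error;
    // Drop the possibly stale deviceId, keep any other audio constraints
    const { deviceId, ...rest } = audio;
    const relaxedAudio = Object.keys(rest).length > 0 ? rest : true;
    return gum({ ...constraints, audio: relaxedAudio });
  }
}

// Usage with a fake gUM that rejects any deviceId constraint
const fakeGum = async (c) => {
  if (c.audio && c.audio.deviceId) {
    const error = new Error('stale device');
    error.name = 'OverconstrainedError';
    throw error;
  }
  return 'stream';
};
getAudioStream(fakeGum, { audio: { deviceId: { exact: 'gone' } } })
  .then((stream) => console.log(stream));
```

Any other error type is rethrown untouched, so permission failures such as NotAllowedError are not masked by the retry.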
prlanzarin
bf802ced4c fix(audio): check if backup stream exists before trying to clean it up 2022-08-25 17:14:41 +00:00
prlanzarin
36bce51363 refactor(audio): remove unused imports from sip.js bridge 2022-08-24 13:28:32 +00:00
prlanzarin
89e814d570 fix(audio): centralize device change code, add rollbacks, surface errors
There's no rollback procedure in case a device switch fails right now,
nor do the code entrypoints that call the switching procedures wait
for resolution or failure before marking the new device as chosen. That
may cause inconsistent states in a couple of ways:
  - No rollback: switch fails, audio is still on but no actual
    microphone input is being transmitted
  - Not waiting for resolutions: inconsistent chosen devices on failures
Device switching errors are also not surfaced to the end user.

This commit:
  - Adds device rollback and proper resolution/failure response
    awaits to try and make the state a bit more consistent.
  - Centralizes the input device switching code to be reused between
    different bridges
  - Centralizes device ID state management in audio-manager to try and
    keep them a bit more consistent across the board
  - Surfaces device switching failures to the end user
  - Guarantees device IDs are set to the session storage on all
    appropriate scenarios
2022-08-24 13:28:27 +00:00
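The rollback behavior listed in the commit above can be illustrated with a short sketch: the new device is only marked as chosen after the switch resolves, and a failed switch re-applies the previous device before surfacing the error. `applyDevice` is a hypothetical stand-in for whatever bridge call swaps the input device, and `state` is a hypothetical holder for the chosen device ID.

```javascript
// Hypothetical sketch of a device switch with rollback.
async function switchInputDevice(state, newDeviceId, applyDevice) {
  const previous = state.inputDeviceId;
  try {
    await applyDevice(newDeviceId);
    state.inputDeviceId = newDeviceId; // mark as chosen only after success
  } catch (error) {
    await applyDevice(previous); // roll back to the last known-good device
    throw error; // surface the failure to the caller (and, from there, the UI)
  }
  return state.inputDeviceId;
}

// Usage with a fake bridge call that fails for one device
const state = { inputDeviceId: 'mic-1' };
const applyDevice = async (id) => {
  if (id === 'broken-mic') throw new Error('switch failed');
};
switchInputDevice(state, 'broken-mic', applyDevice)
  .catch((error) => console.log(error.message, state.inputDeviceId));
```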
prlanzarin
0e162f1cda feat: configurable DSCP marking for WebRTC media
RTCRtpSender exposes DSCP marking via `networkPriority` in the encodings
configuration dictionaries. That should allow us to control
QoS priorities for different media streams, eg audio with higher network
priority than video. The only browser that implements that right
now is Chromium.

To use this, set the public.app.media.networkPriorities configuration in
settings.yml. Audio, camera and screenshare priorities can be controlled
separately. For further info on the possible values, see:
  - https://www.w3.org/TR/webrtc-priority/
  - https://datatracker.ietf.org/doc/html/rfc8837#section-5
2022-08-15 21:24:05 +00:00
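Setting `networkPriority` as described in the commit above goes through the sender's `getParameters()`/`setParameters()` pair. Below is a minimal sketch, assuming `sender` is an `RTCRtpSender` (or anything with the same two methods); the helper name is hypothetical. Per the WebRTC priority specification linked in the commit, valid values are `very-low`, `low`, `medium` and `high`.

```javascript
// Hypothetical helper: applies one networkPriority to all encodings
// of an RTCRtpSender-like object.
async function applyNetworkPriority(sender, networkPriority) {
  const params = sender.getParameters();
  (params.encodings || []).forEach((encoding) => {
    encoding.networkPriority = networkPriority;
  });
  await sender.setParameters(params);
  return params;
}

// Usage with a stand-in sender
const fakeSender = {
  params: { encodings: [{ active: true }] },
  getParameters() { return this.params; },
  async setParameters(p) { this.params = p; },
};
applyNetworkPriority(fakeSender, 'high')
  .then((params) => console.log(params.encodings[0].networkPriority));
```

Browsers that do not implement the field are expected to ignore it, which fits the commit's note that only Chromium supports this right now.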
Paulo Lanzarin
a156db25e4
Merge pull request #15358 from frankemax/fix-audio-infinite-joining
fix(audio): prevent race condition when joining audio
2022-07-15 15:24:14 -03:00
prlanzarin
45049cbd65 refactor: swap kurento-utils for new peer wrapper in screen sharing and audio 2022-07-15 14:00:12 +00:00
Max Franke
70bf69182a fix(audio): prevent race condition when joining audio
Sometimes the handler that listens for the state change in the callState is
not updated correctly.
In these rare cases, the state of the callstate changes directly to in_conference,
not taking the expected path: call_started -> in_echo_test -> in_conference
2022-07-11 14:09:43 -03:00
Paulo Lanzarin
240bb9e1cb
Merge pull request #15292 from prlanzarin/u26/fix/audio-undef-broker-onstop
fix(audio): check if broker exists before trying to stop
2022-06-29 15:42:05 -03:00
prlanzarin
f026c397d9 fix(audio): check if broker exists before trying to stop
There are scenarios where the full audio broker (SFU) stop procedure
may be called multiple times in a very short time span - eg a concurrent
stop + connection failure; a timeout in the transfer procedure + a
reconnect attempt, [...]. When that happens, calls to exitAudio may throw
errors if the broker was already released - and that's not the expected
behavior.
2022-06-29 17:44:52 +00:00
prlanzarin
602238b84e refactor(audio): remove caller ID from fullaudio bridge start request
The callerId is assembled server-side as of bbb-webrtc-sfu
v2.9.0-alpha.3 based on the work done in commit
d940bff541b6fe3c4976428ca471457bc67ac97e.
2022-06-28 20:33:36 +00:00
prlanzarin
e93176238a feat(audio): add sipjsAllowMdns option to control mDNS filtering in SIP.js
FreeSWITCH has mDNS resolution capabilities as of 1.10.7. Having the filtering
configurable in the client allows us to field trial whether we should keep that
on or off. The default is still to filter them out because FreeSWITCH does not
resolve mDNS candidates by default (ice_resolve_candidate in switch.conf.xml).
2022-05-06 13:38:44 +00:00
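The mDNS filtering controlled by the option in the commit above can be sketched as a check on the address field of an ICE candidate line. The function names are hypothetical; only the `sipjsAllowMdns` flag name comes from the commit message.

```javascript
// Hypothetical helper: true if an ICE candidate line carries an mDNS
// (.local) address. Candidate lines follow the layout:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function isMdnsCandidate(candidateLine) {
  const fields = candidateLine.trim().split(/\s+/);
  const address = fields[4] || '';
  return address.toLowerCase().endsWith('.local');
}

// Hypothetical gate: mDNS candidates are dropped unless explicitly allowed.
function shouldSignalCandidate(candidateLine, sipjsAllowMdns) {
  return sipjsAllowMdns || !isMdnsCandidate(candidateLine);
}

const mdns = 'candidate:1 1 udp 2113937151 3f2a1c9e-8d7b-4e6a.local 54321 typ host';
console.log(shouldSignalCandidate(mdns, false));
console.log(shouldSignalCandidate(mdns, true));
```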
prlanzarin
f5a2c4c8e7 fix(audio): fix change output device error log
this.user.callerIdName doesn't exist; the error was logged in its raw form (wrong)
2022-05-03 14:51:30 +00:00
prlanzarin
6a0e0a87c2 fix(audio): abide to signalCandidates configuration flag 2022-05-02 13:49:47 +00:00
prlanzarin
ccc95583ee refactor(audio): restore trickle candidate filtering in new audio bridge
+ better error handling, log messages for that code
2022-04-25 16:45:18 +00:00
prlanzarin
1decc5d343 fix(audio): respect public.media.listenOnlyOffering in new audio bridge 2022-04-25 16:22:49 +00:00
prlanzarin
459e1a9514 refactor(audio): remove old listen only bridge (kurento.js)
- Remove the old listen only bridge (kurento.js), superseded by the equivalent
  and equally stable (AS FAR AS LISTEN ONLY IS CONCERNED) sfu-audio-bridge
  - Rename FullAudioBridge.js -> sfu-audio-bridge.js
    * A more generic name that better represents the capabilities and
      the nature of the bridge
    * The bridge name identifier in configuration is still the same
      ('fullaudio')
  - Remove the FreeSWITCH listen only fallback
  - Temporarily disable the "trickle ICE" pair gathering feature used
    in SIP.js (which was always experimental, nonstandard and disabled
    by default)
  - Updates to settings.yml keys in places where relevant
2022-04-20 20:46:32 +00:00
prlanzarin
6fd6a52d47 fix(audio): prevent uncaught rejections in the experimental audio bridge startup 2022-04-20 17:40:06 +00:00
prlanzarin
1e80d050b7 refactor(audio): generic use of sfu audio broker to cover mic and listen only 2022-04-20 17:26:52 +00:00
prlanzarin
d125b34117 refactor(audio): address linter warning in FullAudioBridge.js 2022-04-19 19:18:04 +00:00
prlanzarin
3f03a94d29 fix(audio): use correct media server in listen only via fullaudio bridge 2022-04-19 19:16:22 +00:00
prlanzarin
f4ba6dd9a2 refactor(audio): use preloaded audio stream if provided
Avoids a surplus gUM with local echo test et al
2022-04-11 22:29:20 +00:00
prlanzarin
0d85905c83 fix(audio): centralize default in/out device id definitions, make them an empty string
"default" is not a universally valid default value for deviceIds which was causing issues with Firefox and Safari in some specific scenarios where exact deviceId constraints were being used
2022-04-11 19:23:32 +00:00
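One way to read the change in the commit above: with an empty string as the default device ID, an `exact` deviceId constraint is only built when a real ID is known. The constant and function names below are hypothetical and only illustrate that idea.

```javascript
// Assumed default, per the commit message: an empty string, not "default".
const DEFAULT_INPUT_DEVICE_ID = '';

// Hypothetical helper: only constrains by deviceId when one is set.
function buildAudioConstraints(deviceId = DEFAULT_INPUT_DEVICE_ID) {
  if (!deviceId) return { audio: true }; // let the browser pick a device
  return { audio: { deviceId: { exact: deviceId } } };
}

console.log(JSON.stringify(buildAudioConstraints()));
console.log(JSON.stringify(buildAudioConstraints('abc123')));
```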
Anton Georgiev
351b7126c7 Merge branch 'v2.5.x-release' of github.com:bigbluebutton/bigbluebutton into develop 2022-03-23 15:11:47 +00:00
Mario Junior
5778306626
Merge pull request #9283 from znerol-forks/feature/develop/sip-user-agent
Send browser UA string in SIP UA, also add BBB server and client version
2022-03-22 11:18:06 -03:00
prlanzarin
13c6b12f89 fix(audio): clean up sfu broker, guarantee peer exists in getLocalStream 2022-03-17 11:20:19 -03:00
prlanzarin
d9c329df27 refactor(fullaudio): remove server-provided RPC parameters 2022-03-10 14:59:43 -03:00
prlanzarin
b9f9043d9c feat(fullaudio): handle forceRelayOnFirefox flag 2022-03-10 14:31:42 -03:00
prlanzarin
d04d7c92dc feat(fullaudio): handle audio filter/constraint changes in new bridge 2022-02-01 17:19:56 -03:00
prlanzarin
ed89f6e4a5 feat(fullaudio): implement input/output device change in new bridge 2022-02-01 17:19:50 -03:00
prlanzarin
e667f7aecb refactor(fullaudio): make some of the SIP.js input/output code re-usable
To be used by other bridges
2022-01-31 16:30:38 -03:00
prlanzarin
599a5556b5 refactor(listen-only): use this.bridgeName for logging 2022-01-26 11:03:27 -03:00
prlanzarin
cb84e34833 feat(fullaudio): implement echo test in new full audio bridge
Partially addresses https://github.com/bigbluebutton/bigbluebutton/issues/14191
2022-01-26 11:03:27 -03:00
prlanzarin
f4e6e6c4f4 refactor(fullaudio): make call transfer code reusable
Allows state tracking and transfer execution to be re-used by other audio bridges
2022-01-26 11:03:24 -03:00
Ramón Souza
f6e65f58c5 merge 2.4 into develop and resolve conflicts - partial 2022-01-12 16:40:45 +00:00