The current Vosk CC provider does not support stereo mic streams
(pending investigation as to why).
This commits makes sure stereo is forcefully disabled via SDP munging
only when transcription is active and using Vosk. Having it disabled
in the server side (FreeSWITCH) is not enough because the stereo parameter
is client mandated and replicated by FS on its answer. So we need to
make sure it's always disabled for the time being.
SFU audio does munging server side (and stereo is always off), so no changes
needed there.
The rest of the providers (except WebSpeech) need to be validated against
stereo audio as well.
This is also intended to be temporary - ideally this needs to be fixed in
mod_audio_fork/Vosk/wherever this is breaking.
Audio state callback and remote media setup both depend on FS's state
(comes through Meteor) and the ICE state (local, peer connection). The
caveat: FS's state can come delayed on reconnection scenarios because
Meteor's websocket generally takes significantly longer to re-connect than
the peer connection, which means the ICE state gets completed way before FS
is flagged as ready.
The practical issue: while outbound audio (client -> FS) will work, inbound
audio (FS -> client) won't _just because it wasn't played_ (even though
data is coming through).
This commit decouples the remote media setup step from the state
through:
- Setup remote media when ICE state is completed
- Run the state callback only after FS is flagged as ready. This
should maintain the UI states consistent across client-server.
Keep in mind the assumption that if FS is ready, ICE is completed by
consequence.
There's an edge case in finnicky networks where ALG-like firewalls
tamper with USE-CANDIDATE STUN packets and, consequently, bork ICE-lite
connectivity establishment. The odd part is that client-side gathering
seems to complete if intermediate STUN bindings work (before the final
USE-CANDIDATE), which may cause the peer not to generate relay
candidates == connectivity fails.
This adds the `public.kurento.gatheringTimeout` option to forcefully extend
the candidate gathering window in peers that act as offerers. The
behavior is as follows: if the flag is set (ms), the peer will wait
either the gathering completed stage or, _at most_,
public.kurento.gatheringTimeout ms before proceeding with calls chained
to setLocalDescription.
This option is disabled by default and intentionally ommited from the
base settings.yml file as to not encourage its use. Don't use it unless
you know what you're doing :).
Mostly benign, but exitAudio/forceExitAudio was throwing an unhandled
error when called on sessions with no active audio because the
underlying bridge methods did not check whether there was an active
session to stop beforehand.
There are some situations where previously set deviceIds (
local/session storage) may become stale. This causes an unexpected
behavior where audio is temporarily borked until the user clears their
local storage.
This issue has been seen more recently on Safari endpoints when switching
back-and-forth breakout rooms in environments running under iframes.
Also seen randomly on endpoints with virtual input devices.
This centralizes audio gUM calling into a single method that retries the
gUM procedure without pre-set deviceIds only if the initial call fails
due with an OverconstrainedError - hopefully circumventing the issue.
There's no rollback procedure in case a device switch fails right now,
nor does the code entrypoints that call the switching procedures wait
for resolution or failure before marking the new device as chosen. That
may cause inconsistent states in a couple of ways:
- No rollback: switch fails, audio is still on but no actual
microphone input is being transmitted
- Not waiting for resolutions: inconsistent chosen devices on failures
Device switching errors are also not surfaced to the end user
This commit:
- Adds device rollback and proper resolution/failure response
awaits to try and make the state a bit more consistent.
- Centralizes the input device switching code to be reused between
different bridges
- Centralizes device ID state management in audio-manager to try and
mantain them a bit more consistent across the board
- Surface device switching failures to the end user
- Guarantee device IDs are set to the session storage on all
appropriate scenarios
RTCRTPSender exposes DSCP marking via `networkPriority` in the encodings
configuration dictionaries. That should allow us to control
QoS priorities for different media streams, eg audio with higher network
priority than video. The only browser that implements that right
now is Chromium.
To use this, the public.app.media.networkPriorities configuration in
settings.yml. Audio, camera and screenshare priorities can be controlled
separately. For further info on the possible values, see:
- https://www.w3.org/TR/webrtc-priority/
- https://datatracker.ietf.org/doc/html/rfc8837#section-5
Sometimes the handler that listens for the state change in the callState is
not updated correctly.
In these rare cases, the state of the callstate changes directly to in_conference,
not taking the expected path: call_started -> in_echo_test -> in_conference
There are scenarios where the full audio broker (SFU) stop procedure
may be called multiple times in a very short timestamp - eg a concurrent
stop + connection failure; a timeout in the transfer procedure + a
reconnect attempt, [...]. When that happens, calls to exitAudio may throw
errors if the broker was already released - and that's not the expected
behavior.
FreeSWITCH has mDNS resolution capabilities as of 1.10.7. Having the filtering
configurable in the client allows us to field trial whether we should keep that
on or off. The default is still to filter them out because FreeSWITCH does not
resolve mDNS candidates by default (ice_resolve_candidate in switch.conf.xml).
- Remove the old listen only bridge (kurento.js), superseded by the equivalent
and equally stable (AS FAR AS LISTEN ONLY IS CONCERNED) sfu-audio-bridge
- Rename FullAudioBridge.js -> sfu-audio-bridge.js
* A more generic name that better represents the capabilities and
the nature of the bridge
* The bridge name identifier in configuration is still the same
('fullaudio')
- Remove the FreeSWITCH listen only fallback
- Temporarily disable the "trickle ICE" pair gathering feature used
in SIP.js (which was always experimental, nonstandard and disabled
by default)
- Updates to settings.yml keys in places where relevant
"default" is not an universally valid default value for deviceIds which was causing issues with Firefox and Safari in some specific scenarios where exact deviceId constraints were being used
- forceRelayOnFirefox: whether TURN/relay usage should be forced to work
around Firefox's lack of support for regular nomination when dealing with
ICE-litee peers (e.g.: mediasoup).
* See: https://bugzilla.mozilla.org/show_bug.cgi?id=1034964
- iOS endpoints are ignored from the trigger because _all_ iOS browsers
are either native WebKit or WKWebView based (so they shouldn't be affected)