There's a race condition that may crash the client whenever
video-provider's unmount procedure runs while its signaling websocket
is undefined. The socket's callback handlers are re-assigned without
checking that the socket exists, causing an unhandled TypeError.
The socket may be undefined in a couple of scenarios, e.g.: unmounting
before it was successfully set up, or unmounting while a reconnect is
in progress.
Check whether the socket exists before accessing it in video-provider's
componentWillUnmount routine.
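A minimal sketch of the guard (handler names are illustrative, not
necessarily the component's):

```js
// Only touch the signaling socket if it actually exists: it may never
// have been created, or a reconnect may have torn it down already.
function detachSignalingSocketHandlers(ws) {
  if (ws == null) return;
  ws.onmessage = () => {};
  ws.onopen = () => {};
  ws.onclose = () => {};
  ws.onerror = () => {};
}

// componentWillUnmount() { detachSignalingSocketHandlers(this.ws); /* ... */ }
```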
The original debounce implementation (lodash) preserved the caller's
context; radash's didn't, so the debounced calls were failing and
nobody noticed.
The new debounce implementation, based on a native function, seems to
preserve the caller's context, but as a safety measure this commit
binds the method to its appropriate scope.
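A sketch of the binding, assuming a generic `debounce(fn, delayMs)`
utility and an illustrative handler name:

```js
class VideoProvider extends React.Component {
  constructor(props) {
    super(props);
    // Bind first so the debounced wrapper always invokes the handler
    // with the component as `this`, whatever the debounce internals do.
    this.onWsClose = this.onWsClose.bind(this);
    this.debouncedOnWsClose = debounce(this.onWsClose, 500);
  }

  onWsClose() {
    // `this` is guaranteed to be the component instance here.
  }
}
```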
There's a scenario where remote streams won't be attached again if the
sharer experiences a Meteor/client disconnection.
The disconnection temporarily empties some necessary user data, which
causes the corresponding video-list-item to be unmounted while the peer
persists for a little longer.
If the sharer re-connects fast enough, video-list-item will re-mount,
but will 1) miss the current stream state (i.e., be stuck in loading)
and 2) fail to re-attach the streams since the peer was already flagged
as attached.
Ensure remote camera streams are always attached and shown by:
- always propagating the current stream state on attachment
- refactoring the attachment pre-requisites away from a static boolean
  to a required-data + diff check (based on target and current attached
  streams); see the sketch below
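A sketch of the diff-based check (`propagateStreamState` is a
hypothetical stand-in for the state propagation above):

```js
// Required data + diff: attach only when both ends exist AND the target
// stream differs from what is currently attached. A static "attached"
// boolean can't survive a quick unmount/remount cycle.
const shouldAttachStream = (videoElement, targetStream) => {
  if (!videoElement || !targetStream) return false;
  return videoElement.srcObject !== targetStream;
};

const attachStream = (videoElement, targetStream, currentState) => {
  if (!shouldAttachStream(videoElement, targetStream)) return;
  videoElement.srcObject = targetStream;
  propagateStreamState(currentState); // hypothetical: pushes current state on attach
};
```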
There's a very rare scenario where the client may crash if a video
publisher is released before the source stream has had its inactivation
listener callback set up.
This adds a type check to the inactivation handler before trying to
clean it up.
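Roughly (assuming the listener lives on the stream's `oninactive`
slot):

```js
// The publisher may be released before the inactivation listener was
// ever assigned, so type-check it before cleaning it up.
const cleanupInactivationHandler = (stream) => {
  if (stream && typeof stream.oninactive === 'function') {
    stream.oninactive = null;
  }
};
```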
There's an edge case on finicky networks where ALG-like firewalls
tamper with USE-CANDIDATE STUN packets and, consequently, bork ICE-lite
connectivity establishment. The odd part is that client-side gathering
seems to complete if intermediate STUN bindings work (before the final
USE-CANDIDATE), which may cause the peer not to generate relay
candidates - i.e., connectivity fails.
This adds the `public.kurento.gatheringTimeout` option to forcefully
extend the candidate gathering window in peers that act as offerers.
The behavior is as follows: if the flag is set (in ms), the peer will
wait for either the gathering-complete stage or, _at most_,
public.kurento.gatheringTimeout ms before proceeding with the calls
chained to setLocalDescription.
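A sketch of the gathering window (variable names are illustrative):

```js
// Resolve on the gathering-complete stage or after gatheringTimeout ms,
// whichever comes first.
const waitForGathering = (pc, gatheringTimeout) => new Promise((resolve) => {
  if (pc.iceGatheringState === 'complete') return resolve();
  const timer = setTimeout(resolve, gatheringTimeout);
  pc.addEventListener('icegatheringstatechange', () => {
    if (pc.iceGatheringState === 'complete') {
      clearTimeout(timer);
      resolve();
    }
  });
});

// Offerer flow: setLocalDescription kicks gathering off; hold the
// chained calls until the window closes.
// await pc.setLocalDescription(offer);
// if (gatheringTimeout) await waitForGathering(pc, gatheringTimeout);
```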
This option is disabled by default and intentionally omitted from the
base settings.yml file so as not to encourage its use. Don't use it
unless you know what you're doing :).
There are still a bunch of edge cases and issues with reconnection
scenarios for video:
- Signaling socket refuses to reconnect once maxRetries is exhausted
- Race conditions on local stream attachment: local camera wouldn't be
correctly rendered _if_ the attached stream existed _without_ video
tracks yet
- Video tracks leak on local streams when replacing them (virtual bgs)
- Completely ignoring Meteor state when trying to reconnect cameras
- Streams aren't proactively stopped when the signaling socket dies
- Outbound request queues aren't isolated per stream, nor are they
  flushed when a newer peer with the same ID is created (see the sketch
  after this list)
- Server-originated negotiation errors won't trigger a local peer
  cleanup, thus leaving dangling peers that take way too long to
  reconnect
This commit fixes or improves all of the aforementioned issues, plus:
- Remove unused arguments in the peer (client->SFU) 'start' request
- Prevent crashes when trying to render video-list-items without user
data (which might happen on re-connections)
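For the queue isolation point, something along these lines (a sketch,
not the shipped code):

```js
// Outbound request queues keyed per stream ID; a newer peer re-using an
// ID starts from a clean queue instead of inheriting stale requests.
const outboundQueues = new Map();

const enqueueRequest = (streamId, request) => {
  if (!outboundQueues.has(streamId)) outboundQueues.set(streamId, []);
  outboundQueues.get(streamId).push(request);
};

const flushOnPeerRecreation = (streamId) => {
  outboundQueues.set(streamId, []);
};
```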
video-provider's current ping-pong is as good as nothing in 2.5+.
Before 2.5, we were counting on Meteor's state (and consequently the
component's mount state) to act as a "heartbeat" as far as the socket
is concerned. The ping-pong served only to sustain traffic for finicky,
traffic-dependent firewalls.
Since 2.5, the component's state is _kind of_ detached from Meteor's,
which means it won't unmount when Meteor disconnects. That causes the
video-provider websocket to lose its borrowed heartbeat and leads to a
bunch of reconnection inconsistencies, the worst of them being a stuck,
useless signaling socket that causes cameras not to work until a client
refresh.
This commit does the following:
- Implements actual heartbeat checks to trigger signaling socket
  reconnects when necessary, all within the scope of video-provider
  (sketched below)
- Removes the borked, eons-old 'offline'/'online' event handlers: they
  were causing unnecessary camera drops AND causing video-provider to
  generate a stuck signaling socket
- Properly catches WebSocket.send errors
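A sketch of the heartbeat idea (interval/grace values and the ping/pong
message shape are illustrative):

```js
const PING_INTERVAL_MS = 15000;
const PONG_GRACE_MS = 10000;

const startHeartbeat = (ws, onDeadSocket) => {
  let lastPongAt = Date.now();
  ws.addEventListener('message', ({ data }) => {
    if (typeof data === 'string' && data.includes('pong')) lastPongAt = Date.now();
  });

  const timer = setInterval(() => {
    // No pong for a full interval plus the grace period: declare the
    // socket dead and let the reconnect routine take over.
    if (Date.now() - lastPongAt > PING_INTERVAL_MS + PONG_GRACE_MS) {
      clearInterval(timer);
      onDeadSocket();
      return;
    }
    try {
      ws.send(JSON.stringify({ id: 'ping' }));
    } catch (error) {
      // WebSocket.send throws synchronously on dead sockets; the check
      // above will catch up with it on the next tick.
    }
  }, PING_INTERVAL_MS);

  return () => clearInterval(timer);
};
```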
Firefox has a buggy ICE implementation and needs WebRTC media traffic
to be routed through a TURN server to work reliably with mediasoup.
Use the information fetched by the STUN API to determine whether the
operator has configured a TURN server. If there is one, force Firefox
to use it.
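A sketch of the decision (the browser detection flag is an assumption):

```js
// Force relay on Firefox only when the deployment actually has a TURN
// server configured; otherwise keep the default policy.
const buildConnectionConfig = (iceServers, isFirefox) => {
  const hasTurnServer = iceServers.some(
    (server) => (server.urls || '').toString().includes('turn:'),
  );

  return {
    iceServers,
    iceTransportPolicy: (isFirefox && hasTurnServer) ? 'relay' : 'all',
  };
};
```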
Closes #16164
Under some specific scenarios, the virtual background restore code
might kick in _before_ the preloaded MediaStream is actually assigned
to the peer instance. That causes a crash because the peer attachment
code (where the vbg restore is triggered) tries to access the device ID
of said MediaStream - and it is undefined in the first place because it
was only being set after the initial offer is generated, which is an
async procedure.
This commit guarantees a preloaded stream is set before the peer is
flagged as created - so it's accessible by the attachment code.
It also checks whether there's a MediaStream available when trying to
restore VBGs to prevent undefined behaviors.
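The restore-side guard, roughly (`applyVirtualBackground` is a
hypothetical helper):

```js
// Without a MediaStream (and thus a device ID) there is nothing to
// restore against, so bail out instead of crashing.
const restoreVirtualBackground = (stream, type, name) => {
  if (!(stream instanceof MediaStream) || stream.getVideoTracks().length === 0) return;
  const { deviceId } = stream.getVideoTracks()[0].getSettings();
  applyVirtualBackground(deviceId, type, name); // hypothetical helper
};
```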
RTCRtpSender exposes DSCP marking via `networkPriority` in the
encodings configuration dictionaries. That should allow us to control
QoS priorities for different media streams, e.g., audio with a higher
network priority than video. The only browser that implements it right
now is Chromium.
To use this, set the public.app.media.networkPriorities configuration
in settings.yml. Audio, camera and screenshare priorities can be
controlled separately. For further info on the possible values, see:
- https://www.w3.org/TR/webrtc-priority/
- https://datatracker.ietf.org/doc/html/rfc8837#section-5
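In practice the knob boils down to something like this per sender (a
sketch; the priority values follow the spec above):

```js
// Apply the configured priority to every encoding on an RTCRtpSender.
// Valid values: 'very-low' | 'low' | 'medium' | 'high'.
const setSenderNetworkPriority = async (sender, networkPriority) => {
  const parameters = sender.getParameters();
  // Encodings may be empty before negotiation completes.
  if (!parameters.encodings || parameters.encodings.length === 0) return;
  parameters.encodings.forEach((encoding) => {
    encoding.networkPriority = networkPriority;
  });
  await sender.setParameters(parameters);
};
```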
The 'inactive' event is fired whenever a stream becomes inactive (i.e.,
it can no longer be used), and there are scenarios where that is
unexpected behavior and must be handled accordingly.
The main example of that is when gUM permissions are revoked by the user
via the browser's permission management panel.
Since MediaStream/track inactive events aren't being handled in such
scenarios, what actually happens is that the camera just freezes with
no indication of why.
This commit handles those scenarios in both video-preview and
video-provider by:
- correctly stopping the camera (provider)
- surfacing a toast (provider) or an error indication (preview); the
  detection is sketched below
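A sketch of the detection (the stop/notify helpers are stand-ins for
the provider/preview handlers above):

```js
// A stream goes inactive once all of its tracks have ended, so watch
// the tracks themselves for unexpected 'ended' events.
const monitorInactivation = (stream, onInactivation) => {
  stream.getTracks().forEach((track) => {
    track.addEventListener('ended', onInactivation, { once: true });
  });
};

// provider: monitorInactivation(cameraStream, () => { stopCamera(); showToast(); });
// preview:  monitorInactivation(previewStream, () => showError());
```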
New features:
- A simplified echo test mode that only does a local loopback (instead of
going to FS and back)
- A volume meter for microphone streams in the AudioSettings view
Those two features are experimental and disabled by default; see the
public.app.media.simplifiedEchoTest and public.app.media.showVolumeMeter
configs.
Collateral changes:
- fix: localize fallback device strings in AudioSettings/DeviceSelector
- Refactor some media stream utils to be re-usable across components
- Refactor AudioSettings to keep the number of gUM calls stable
  * TODO: streams need to be passed through AudioManager to avoid the
    surplus gUM call
- fix(audio): drop ScriptProcessorNode usage (deprecated)
  * It was used by the volume meter for level tracking; use hark
    instead (see the sketch below)
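The hark-based meter looks roughly like this (the dB-to-0..1 scaling is
illustrative):

```js
import hark from 'hark';

// hark reads levels off an AnalyserNode internally, so the meter works
// without the deprecated ScriptProcessorNode.
const startVolumeMeter = (stream, onVolume) => {
  const speechEvents = hark(stream, { interval: 100 });
  speechEvents.on('volume_change', (volumeDb) => {
    // hark reports dB (roughly -100..0); map it to a 0..1 meter value.
    onVolume(Math.min(Math.max((volumeDb + 100) / 100, 0), 1));
  });
  return () => speechEvents.stop();
};
```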
video-provider is a race-condition prone mess, and those callbacks were
missing try-catches, so eventual failures would bubble up as uncaught
errors and not be logged properly.
It's worth mentioning that, due to a number of tangential design
failures in that component, 90% of the errors generated are invisible
false positives as far as the end user is concerned. Thus: expect an
increase in false-positive error logs with this.
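The wrapping looks roughly like this (the logger shape is an
assumption):

```js
// Failures get logged with context instead of bubbling up as anonymous
// uncaught errors.
const safeCallback = (handler, label) => (...args) => {
  try {
    return handler(...args);
  } catch (error) {
    logger.error(`Unhandled error in ${label}: ${error.message}`);
    return undefined;
  }
};

// this.ws.onmessage = safeCallback(this.onWsMessage, 'onWsMessage');
```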
The base peer object reference was moved from the component to the
service for _reasons_.
That caused an issue where the component lifecycle would mess up that
centralized reference dictionary under certain conditions, which could
cause viewing sessions to fail intermittently.
This reverts the location of the base dictionary reference back to
video-provider/component
- forceRelayOnFirefox: whether TURN/relay usage should be forced to
  work around Firefox's lack of support for regular nomination when
  dealing with ICE-lite peers (e.g., mediasoup).
  * See: https://bugzilla.mozilla.org/show_bug.cgi?id=1034964
- iOS endpoints are excluded from the trigger because _all_ iOS
  browsers are either native WebKit or WKWebView based (so they
  shouldn't be affected)
ICE-lite servers (e.g., mediasoup) don't need candidates signaled
out-of-band; neither does KMS in certain scenarios.
Disabling their signaling saves us some ticks in bbb-webrtc-sfu and
some bandwidth all around.
There was a race condition where an (async) peer creation could resolve
after the provider was unmounted.
That would lead to a state inconsistency which could generate all sorts
of cryptic problems.
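A sketch of the fix shape (names are illustrative component internals):

```js
// Re-check the mount flag once the async creation settles; a peer that
// resolves after unmount must be disposed, not registered.
createWebRTCPeer(stream).then((peer) => {
  if (!this._isMounted) {
    peer.dispose();
    return;
  }
  this.webRtcPeers[stream] = peer;
});
```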
We now retrieve updated information about active video peers and
calculate download and upload rates. These rates are the sum of data
transferred over all video peers.
Screenshare stats are not added to the sum yet.
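The sum is conceptually this (a sketch over standard getStats fields):

```js
// Aggregate video bytes over all active peers; rates come from the
// delta between two successive polls.
const collectVideoBytes = async (peerConnections) => {
  let bytesSent = 0;
  let bytesReceived = 0;

  for (const pc of peerConnections) {
    const report = await pc.getStats();
    report.forEach((stat) => {
      if (stat.type === 'outbound-rtp' && stat.kind === 'video') bytesSent += stat.bytesSent;
      if (stat.type === 'inbound-rtp' && stat.kind === 'video') bytesReceived += stat.bytesReceived;
    });
  }

  return { bytesSent, bytesReceived };
};

// rate (kbps) = ((current.bytesSent - previous.bytesSent) * 8) / intervalMs
```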
Let the server generate subscriber offers so that it becomes easier for
us to do payload type normalization between publishers and subscribers.
Also split peer creation for publishers and subscribers into separate
methods for clarity.
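On the client side the subscriber flow reduces to answering (the
message shape is illustrative):

```js
// The server offers; the client only answers, so payload types always
// follow whatever the server normalized them to.
const handleSubscriberOffer = async (pc, offerSdp, sendToServer) => {
  await pc.setRemoteDescription({ type: 'offer', sdp: offerSdp });
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  sendToServer({ id: 'subscriberAnswer', sdp: answer.sdp });
};
```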