Fixes a case where the locale selector don't show up in Chrome when using
'webspeech' provider.
And adds missing fields to the webspeech transcription messages, after the
addition of some new parameters to those messages with the open
transcription server.
This feature was too coupled to the old closed captions' pads.
(e.g. the old closed captions feature should be enabled for this
to work properly)
Some things were hardcoded and others didn't make sense from the
user experience perspective.
Reverts #876d8aa.
Partially reverts #802964f, removes changes to make closed captions'
pads compatible with live-transcription but keeps provider settings.
The current Vosk CC provider does not support stereo mic streams
(pending investigation as to why).
This commits makes sure stereo is forcefully disabled via SDP munging
only when transcription is active and using Vosk. Having it disabled
in the server side (FreeSWITCH) is not enough because the stereo parameter
is client mandated and replicated by FS on its answer. So we need to
make sure it's always disabled for the time being.
SFU audio does munging server side (and stereo is always off), so no changes
needed there.
The rest of the providers (except WebSpeech) need to be validated against
stereo audio as well.
This is also intended to be temporary - ideally this needs to be fixed in
mod_audio_fork/Vosk/wherever this is breaking.
Firefox does not support selecting an audio output device in its default
configuration. This works around this flaw by just displaying default
output instead of no device found.
Fixes#16057
Output device changes aren't working in 2.6's echo test when artifical delay
is on due to the fact that the feedback audio is being played via the WebAudio
context rather the the HTMLMediaElement. Since output device change works
via HTMLMediaElement's setSinkId, it's basically a no-op.
This commit fixes the issue by piping the AudioContext destination
through the main audio element rather than using WebAudio directly for
playback. An additional stub media element (muted) is added to circumvent one
of Chrome's WebAudio issue.
The alternative would be to use AudioContext's setSinkId, but it isn't
supported by Firefox (setSinkId enabled) and Chrome < 110.
This should work with FF (setSinkId enabled) and a wide array of Chromium
versions.
There was an observer being linked to all breakout rooms that the user has
access to. This logic works for attendees, but not for moderators.
Moderators have access to the list of all breakout rooms, so they were set
with an observer to breakout rooms that they didn't participate, which caused
the audio and video modals to appear everytime the breakout rooms were closed.
So, this commit:
- hangs an breakout rooms' observer only on those mods that have joined any
breakout room. This way, mods that didn't participate in any breakout
room won't be disturbed by the audio and video modal.
- adds an extra check to ensure that the observer will only be run in
non-breakout meetings.
If BBB 2.6 is used without headphones, the audio test works differently
than in 2.5. In 2.5 audio traffic is routed to freeswitch and then
returned to the browser. This adds usually some latency which makes it
easy to hear you audio quality. In 2.6 there is a local loopback. As
there is almost no latency, it is either difficult or even impossible to
check your own audio quality as echo cancellation of the browser will
filter out your own signal.
This patch adds a delay node to the audio loopback test, which makes is
easier to check your quality.
There are some situations where previously set deviceIds (
local/session storage) may become stale. This causes an unexpected
behavior where audio is temporarily borked until the user clears their
local storage.
This issue has been seen more recently on Safari endpoints when switching
back-and-forth breakout rooms in environments running under iframes.
Also seen randomly on endpoints with virtual input devices.
This centralizes audio gUM calling into a single method that retries the
gUM procedure without pre-set deviceIds only if the initial call fails
due with an OverconstrainedError - hopefully circumventing the issue.
Extract the deviceId again from the stream to guarantee consistency
between stream DID vs chosen DID. That's necessary in scenarios where,
eg, there's no default/pre-set deviceId ('') and the browser's
default device has been altered by the user (browser default != system's
default).
There's no rollback procedure in case a device switch fails right now,
nor does the code entrypoints that call the switching procedures wait
for resolution or failure before marking the new device as chosen. That
may cause inconsistent states in a couple of ways:
- No rollback: switch fails, audio is still on but no actual
microphone input is being transmitted
- Not waiting for resolutions: inconsistent chosen devices on failures
Device switching errors are also not surfaced to the end user
This commit:
- Adds device rollback and proper resolution/failure response
awaits to try and make the state a bit more consistent.
- Centralizes the input device switching code to be reused between
different bridges
- Centralizes device ID state management in audio-manager to try and
mantain them a bit more consistent across the board
- Surface device switching failures to the end user
- Guarantee device IDs are set to the session storage on all
appropriate scenarios
Mobile endpoints are flaky with the WebSpeechAPI:
- iOS versions that support it are borking our outbound audio when it's
enabled
- Android speech recognition has flaky locale detection and speech
transcription
Additionally: the support check is not checking the WebSpeechAPI
availability properly, so older devices (eg iOS 12) are flagged as
supported even though they aren't.
This commit adds a configuration flag (public.audioCaptions.mobile) to
control transcription availability on mobile. False by default.
Also extends the setSpeechVoices support check and
hasSpeechRecognitionSupport method to prevent false positives.
Adds two new flags to the settings file which change the way the locale
flag is used:
- forceLocale: (true/false) => If true, enforces the transcription
language to be the locale content field and jumps the language
selector
in audio modal.
- defaultSelectLocale: (true/false) => If true, the default selected
value in the dropdown language selector in audio modal will be defined
by the locale content field.
In any case, if the locale flag holds an invalid value, it defaults to
disabled.
Move the language collection to the HTML settings file. This data defines
the available languages available for the speech API.
These language tags are used to filter SpeechSynthesis' API `getVoices`
result. Tags must use BCP 47 format.
https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisVoice/lang
Avoid enable audio transcription if the browser's vendor does not provide
voices data.
This should prevent false positives for browsers such as Chromium and
Brave.
Parse the audio transcript before broadcasting it's content back to the
client and the recording actor. Limiting by 8 words per line and max of
2 lines to avoid CPU intensive operations over this recurring event.
Replace Calibri font family with Verdana to improve character spacing,
add relative sizing to the text content and a background padding.
Add a server-side app for the audio captions feature and record proto-events
for this data.
As it is, only behaves as a pass-through module. The idea is to include all
the business intelligence in this app.
The new local echo view doesn't block the "Join audio" button while
awaiting for getUserMedia permission to be granted/denied. That may
cause unexpected behavior when unattentive users just click "Join audio"
without granting or denying gUM.
This commit accounts for gUM resolution when deciding whether to block
the "Join audio" button. It also includes an extra "isConnecting" check
to it to avoid spam-clicking issues.