Commit Graph

24 Commits

Author SHA1 Message Date
Arthurk12
3b871e5ca2 fix(captions): "not supported" in chrome
Fixes a case where the locale selector don't show up in Chrome when using
'webspeech' provider.
And adds missing fields to the webspeech transcription messages, after the
addition of some new parameters to those messages with the open
transcription server.
2023-04-25 10:25:20 -03:00
prlanzarin
54b6578b03 fix(audio): forcefully disable stereo when using Vosk transcription
The current Vosk CC provider does not support stereo mic streams
(pending investigation as to why).

This commits makes sure stereo is forcefully disabled via SDP munging
only when transcription is active and using Vosk. Having it disabled
in the server side (FreeSWITCH) is not enough because the stereo parameter
is client mandated and replicated by FS on its answer. So we need to
make sure it's always disabled for the time being.
SFU audio does munging server side (and stereo is always off), so no changes
needed there.

The rest of the providers (except WebSpeech) need to be validated against
stereo audio as well.
This is also intended to be temporary - ideally this needs to be fixed in
mod_audio_fork/Vosk/wherever this is breaking.
2023-04-25 10:10:39 -03:00
Lucas Fialho Zawacki
fee6ff026a feat(captions): Use setUserSpeechLocale as an akka event and catch it in the transcription manager 2023-04-25 09:54:34 -03:00
Lucas Fialho Zawacki
6979432c36 feat(transcription): Server side open source transcriptions 2023-04-24 18:23:34 -03:00
Ramón Souza
d181eba1c2 replace lodash.throttle with native function 2023-03-02 10:25:08 -03:00
Ramón Souza
0a622eff32 replace lodash throttle with standalone package 2023-03-01 15:13:29 -03:00
Ramón Souza
dd710aa96f replace lodash uniq and uniqBy 2023-02-23 10:44:29 -03:00
GuiLeme
f67f530b32 [disabled-transcription] - Renamed audioCaptions to liveTranscription (for disabledFeatures) 2023-01-09 10:47:22 -03:00
GuiLeme
b4afec689e [disabled-transcription] - Created new disabledFeature audioCaptions 2022-12-16 17:04:14 -03:00
prlanzarin
6c8b097eba fix: add option to disable transcription in mobile, extend support check
Mobile endpoints are flaky with the WebSpeechAPI:
  - iOS versions that support it are borking our outbound audio when it's
    enabled
  - Android speech recognition has flaky locale detection and speech
    transcription
Additionally: the support check is not checking the WebSpeechAPI
availability properly, so older devices (eg iOS 12) are flagged as
supported even though they aren't.

This commit adds a configuration flag (public.audioCaptions.mobile) to
control transcription availability on mobile. False by default.
Also extends the setSpeechVoices support check and
hasSpeechRecognitionSupport method to prevent false positives.
2022-07-20 17:20:54 +00:00
Arthurk12
c96b53093c feat(captions): adds locale settings
Adds two new flags to the settings file which change the way the locale
flag is used:

- forceLocale: (true/false) => If true, enforces the transcription
  language to be the locale content field and jumps the language
selector
  in audio modal.
- defaultSelectLocale: (true/false) => If true, the default selected
  value in the dropdown language selector in audio modal will be defined
  by the locale content field.

In any case, if the locale flag holds an invalid value, it defaults to
disabled.
2022-07-20 17:20:53 +00:00
Arthurk12
da9adca229 fix(captions): talking indicator icon
Prevents the speech recognition from being initialized when the closed
captions feature is disabled.
2022-07-20 17:20:52 +00:00
Pedro Beschorner Marin
d553ca65cf feat(captions): use navigator language
If not set to use the default language, try to select the navigator
language as speech default locale.
2022-07-20 17:20:52 +00:00
Pedro Beschorner Marin
116c0d9a49 fix(captions): filter duplicated languages
Avoid multiple instances of the same language at the voices data.
2022-07-20 17:20:52 +00:00
Pedro Beschorner Marin
51eeb092b3 refactor(captions): configurable languages
Move the language collection to the HTML settings file. This data defines
the available languages available for the speech API.

These language tags are used to filter SpeechSynthesis' API `getVoices`
result. Tags must use BCP 47 format.

https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisVoice/lang
2022-07-20 17:20:52 +00:00
Pedro Beschorner Marin
b52c67d7a7 feat(captions): first pass on recording
Add the main server-side adapter for using the legacy closed captions
recording process with the audio captions data.
2022-07-20 17:20:52 +00:00
Pedro Beschorner Marin
fb48e61d6d feat(captions): add talking indicator feedback
Inform other users about who are the current talkers with the speech
recognition enabled.
2022-07-20 17:20:51 +00:00
Pedro Beschorner Marin
df184b542c feat(captions): add unsupported warning
Add a disclaimer for users on browsers that do not provide speech synthesis'
voices.
2022-07-20 17:20:51 +00:00
Pedro Beschorner Marin
d00909751a refactor(captions): change getVoices routine
In some cases, `getVoices` returns an empty array even if the browser's vendor
has full support for speech synthesis. Add a trigger call to initiate the
voices fetching process.

As drafted, `getVoices` can be an asynchronous call and monitoring it
depends on the support of a `voiceschanged` event. Although many of the
main vendors support voices, this event is not (yet) by Safari.

https://wicg.github.io/speech-api/#dom-speechsynthesis-getvoices
https://wicg.github.io/speech-api/#eventdef-speechsynthesis-voiceschanged
https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis/voiceschanged_event
2022-07-20 17:20:50 +00:00
Pedro Beschorner Marin
d6dc66f57e feat(captions): language selector
Replace the checkbox with a selector up with 3 languages: en-US, es-ES and pt-BR.

Add setting option to enable by default with predetermined locale.
2022-07-20 17:20:50 +00:00
Pedro Beschorner Marin
5671bd7d3c fix(captions): check for voices
Avoid enable audio transcription if the browser's vendor does not provide
voices data.

This should prevent false positives for browsers such as Chromium and
Brave.
2022-07-20 17:20:49 +00:00
Pedro Beschorner Marin
75969ec93c feat(captions): audio captions app
Add a server-side app for the audio captions feature and record proto-events
for this data.

As it is, only behaves as a pass-through module. The idea is to include all
the business intelligence in this app.
2022-07-20 17:20:48 +00:00
Pedro Beschorner Marin
0bc730b3e3 refactor(captions): improve recovery
Use the user's talking state to trigger a speech API recovery after long
periods of silence.
2022-07-20 17:20:48 +00:00
Pedro Beschorner Marin
944edf2ccf feat(captions): web speech prototype
Hardcoded pt-BR prototype for closed captions generated by the browser's
WebSpeech API.
2022-07-20 17:20:48 +00:00