This incorporates only the audio desync related changes from #11626
* Add the aresample filter with async option to fill in timestamp gaps
* Use the libopus decoder for opus audio instead of ffmpeg's builtin
decoder
This gives the following advantages over the previous code:
* The ffmpeg input filters are loaded from a filter "script" file instead
of passed on the command line. This fixes some cases of recordings
failing to process because the ffmpeg command line generated for the
audio processing exceeded the max command line length limit. (Although
that only really happens due to BBB bugs...)
* Use absolute positions when trimming audio segments for cuts.
Previously segments were trimmed to the length of the segment, and the
results were concatenated. There's some possibility of accumulated
errors in the segment lengths causing audio desync over time. The new
code incrementally concatenates the segments, and cuts each segment
end based on the absolute time since the start of the meeting, to
avoid error accumulation.
On my server 2.3 alpha, the method metadata_for(meeting_id) gives back {}
(empty Hash). Thus "return if meeting_metadata.nil?" does not occur.
Does @redis.hgetall give {} instead of nil, even though there is a comment in
node_modules/redis/lib/utils.js "hgetall converts its replies to an Object. If
the reply is empty, null is returned"???
Update the list of invalid characters based on what the XML
specification permits and discourages.
Use the ruby string `scrub` method to remove invalid characters that
can't be expressed in the `tr` syntax, like unpaired surrogates and
UTF-8 prefix bytes.
I've moved the workers code into the `lib` subdirectory with other library-ish
code; this puts it into the ruby load path used by most scripts so referencing
files is easier.
I've applied various style cleanups based on the rubocop config present.
The `events` processing step has been integrated as a new worker `EventsWorker`,
there is no longer a separate `events/events.rb` script. I've reworked the
`rap-starter.rb` script to check for the done files in both the events and
recorded status directories.
Found another case where the html5 client was passing through control
characters, in the original presentation name field.
Rather than play whack-a-mole with different fields which may eventually
get poorly sanitized user data, apply the control character filtering
to all properties.
Adjust the character range to do the following:
* Allow horizontal tab (0x09), it's not problematic.
* Disallow control characters in the range 0x1A-0x1F. Probably missed by accident.
FFmpeg has pretty good format autodetection even if the filename has the
'.txt' extension, so just rely on that. It'll even pull subtitles out of
video files - although I expect we'll have size limits so that doesn't
happen.
Sometimes when text is pasted into the whiteboard text tool from
external apps, it'll include control characters that mess up later
recording processing scripts.
Run the same cleanup as already used for chat messages on the whiteboard
text as well.
The cleanup has been adjusted to allow newline and tab characters. They
won't really cause issues in chat, and newlines (at a minimum) are
required for the whiteboard.
This is a workaround for #7356
BigBlueButton can sometimes write events out of order - this particularly
seems to affect the final RecordStatusEvent in a meeting which was ended
while recording was still running. This breaks the recording processing
scripts.
As a workaround, sort the events as they're being written into the events.xml
file. We have the following properties:
* The input data is already mostly sorted
* Items in the wrong position will be no more than a couple spots off from where
they should be
* We should not change the relative order of events with the same timestamp.
The best algorithm to use here is a simple insertion sort. When adding each new
event to the XML structure, it scans backwards through existing events until it
finds the correct position.
For #6035
At some point, BigBlueButton switched from using `"` to `'` in the link tags in
chat messages, which caused the regular expression that was supposed to remove
the `event:` prefix to not match.
I've replaced the error-prone regular expression with an actual HTML parser,
using the "Loofah" HTML transformation/sanitization library based on Nokogiri.
I've removed the code that detected unlinked URLs, since it was broken - and
not needed: current BigBlueButton versions do the link detection in the client.
If someone reprocesses a really old BBB recording with these scripts, URLs in
chat might not be linked in the result. But they wouldn't have been linked in the
client in the original meeting either, so I figure that's ok.
Fixes#6475
In some cases, ffmpeg will be able to read the file, but the video itself
can't be decoded (missing/corrupt stream headers, for example). In this case,
some of the properties on the stream object will be nil.
Make sure that pix_fmt is present in the probed info, since that's a required
property.
Issue #6338
It looks like there was a logic error in the code that was causing it
to break out of the event deletion loop early when deleting events for
the last (or only) segment in a recording. (In this case, last_index
is -1, so i >= last_index is always true).
The trim_events_for call was always succeeding, so the events were
being removed from the event list (meeting:{ID}:recordings key) even
though the events themselves hadn't been deleted in the loop.
I've moved the trim_events_for call to below the event deletion loop
to ensure that if the archive script is interrupted, the events list
will contain all not-yet-deleted events.