If you're inserting at position 0 (and there was no previous deleted text
from that position), you can't use the timestamp from the previous character
position, since there's no previous character. Use the timestamp of the
following character instead.
Current BBB client is generating invalid indexes when characters are
inserted with an IME - the edit indexes are as if the preedit text was
being removed, but the preedit text was never sent in the first place.
For now, just don't crash if there's an edit that would remove text
which is past the end of known text. The result might be broken, but
it won't prevent the rest of the recording from working.
I've had a chance to see how this script behaves with actual live
caption tracks now, and there's room for improvement. In particular,
it often generates cues that overlap - the next one appears before the
previous one disappears. The browser player position handles this
really poorly, and it's nearly unreadable.
The solution is to, if two cues would overlap, merge them into a
single multiline cue that displays for the full time. This is a lot
easier to read.
Some extra code is added to de-overlap any remaining cues (e.g. if
there's a third cue that would also overlap). This will reduce the
time that the earlier cue gets shown below my preferred minimum, but
not really much we can do about that if people are talking/typing
quickly.
The code can easily be tweaked to set a different number of maximum
lines per cue if desired.
When the next line break after a wrap induced due to line length was a
hard line break, the hard line break would override the soft line break,
resulting in over-long lines, and the last word would be repeated on the
last line.
Move the hard break code into an else branch so it's only applied if we
haven't already done a word wrap on this line.