SpeechTranscriber time indexes - detect pauses?

I'm experimenting with the new SpeechTranscriber in macOS/iOS 26, transcribing speech from a prerecorded mp4 file. Speed and quality are amazing!

I've told the transcriber to include time indexes. Each run is always exactly one word, which can be very useful. When I look at the indexes, the end of one run is always identical to the start of the next run, even if there's a pause.

I'd like to identify pauses, perhaps to generate something like phrases for subtitling. Because each run ends where the next one begins, I can't do this other than by using punctuation, which might be rather rough.

Any suggestions on detecting pauses, or getting that kind of metadata from the transcriber?

Here's a short sample, showing each run with the start, end, and characters in the run:

105.9 --> 107.04  I
107.04 --> 107.16  think
107.16 --> 108.0  more
108.0 --> 108.42  lighting
108.42 --> 108.6  is
108.6 --> 108.72  definitely
108.72 --> 109.2  needed,
109.2 --> 109.92  downtown.
109.98 --> 110.4  My
110.4 --> 110.52  only
110.52 --> 110.7  question
110.7 --> 111.06  is,
111.06 --> 111.48  poll
111.48 --> 111.78  five,
111.78 --> 111.84  that
111.84 --> 112.08  you're
112.08 --> 112.38  increasing
112.38 --> 112.5  the
112.5 --> 113.34  50,000?
113.4 --> 113.58  Where
113.58 --> 113.88  exactly
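
For what it's worth, the sample does show two small gaps: the end and start times differ by 0.06 s after "downtown." and again after "50,000?". A rough sketch of phrase grouping could combine that gap with the punctuation fallback. This is only an illustration under assumptions: `TimedWord` and `groupIntoPhrases` are hypothetical names, not part of the Speech framework, the 0.05 s threshold is a guess based on this one sample, and converting the transcriber's runs into start/end seconds is left out.

```swift
import Foundation

/// Hypothetical container for one transcriber run: the word plus its
/// start and end times in seconds (not a Speech framework type).
struct TimedWord {
    let text: String
    let start: Double
    let end: Double
}

/// Groups words into phrases. A new phrase starts when the gap before a word
/// exceeds `minGap` seconds, or when the previous word ends in ".", "?" or "!".
func groupIntoPhrases(_ words: [TimedWord], minGap: Double = 0.05) -> [[TimedWord]] {
    var phrases: [[TimedWord]] = []
    var current: [TimedWord] = []
    for word in words {
        if let last = current.last {
            let gap = word.start - last.end
            let endsSentence = last.text.last.map { ".?!".contains($0) } ?? false
            if gap > minGap || endsSentence {
                phrases.append(current)
                current = []
            }
        }
        current.append(word)
    }
    if !current.isEmpty { phrases.append(current) }
    return phrases
}

// Four runs copied from the sample above.
let sample = [
    TimedWord(text: "needed,", start: 108.72, end: 109.2),
    TimedWord(text: "downtown.", start: 109.2, end: 109.92),
    TimedWord(text: "My", start: 109.98, end: 110.4),
    TimedWord(text: "only", start: 110.4, end: 110.52),
]
let phrases = groupIntoPhrases(sample)
for phrase in phrases {
    // One line per phrase, words joined with spaces.
    print(phrase.map(\.text).joined(separator: " "))
}
```

With only 0.06 s to go on, the gap test is fragile, so the punctuation check does most of the work here. Real pause lengths would still need to come from the transcriber or from analysing the audio separately.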