TV Transcript Shared Segment Extractor

Paste two long transcripts (e.g. 30–60 minute TV shows) and extract the chunks of text that appear in both. The tool identifies matching word runs, chains them together across small gaps caused by transcription noise, and reports the clean shared segments. It then recursively searches the remaining text for additional shared segments, so it can find multiple duplicated chunks even when they appear in a different order in each transcript.
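The search for matching word runs can be sketched in Python roughly as follows. This is a minimal illustration, not the tool's actual code: it assumes the standard-library difflib module, a hypothetical function name shared_runs, and a hypothetical min_run cutoff, and it uses a repeat-and-mask loop in place of true recursion. Masking each matched run before searching again is what lets it pick up chunks that occur in a different order.

```python
from difflib import SequenceMatcher


def shared_runs(words_a, words_b, min_run=3):
    """Return sorted (start_a, start_b, length) runs of identical words.

    min_run is a hypothetical cutoff for this sketch, not a setting of the tool.
    """
    a, b = list(words_a), list(words_b)
    runs = []
    while True:
        matcher = SequenceMatcher(None, a, b, autojunk=False)
        m = matcher.find_longest_match(0, len(a), 0, len(b))
        if m.size < min_run:
            break
        runs.append((m.a, m.b, m.size))
        # Mask the matched words with unique placeholders so the next pass
        # looks only at the remaining text, wherever it sits.
        for i in range(m.size):
            a[m.a + i] = ("used-a", m.a + i)
            b[m.b + i] = ("used-b", m.b + i)
    return sorted(runs)


t1 = "alpha beta gamma X delta epsilon zeta".split()
t2 = "delta epsilon zeta Y alpha beta gamma".split()
print(shared_runs(t1, t2))  # [(0, 4, 3), (4, 0, 3)]
```

Rebuilding the matcher on every pass is slow for hour-long transcripts; a real implementation would likely index the words once.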

Transcript 1

Transcript 2

Max gap (words)

Merge two matching runs into one segment if the non-matching gap between them is at most this many words in each transcript.
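One way to read the merge rule, sketched in Python. The function name chain_runs and the tuple layout are assumptions for illustration, and the sketch assumes the gap limit must hold in both transcripts at once; the tool's own rule may differ.

```python
def chain_runs(runs, max_gap=5):
    """Chain (start_a, start_b, length) runs into merged segments.

    Each segment is (a_start, a_end, b_start, b_end, matched_words),
    with exclusive end indices. Hypothetical layout for this sketch.
    """
    segments = []
    for a, b, size in sorted(runs):
        if segments:
            a0, a1, b0, b1, matched = segments[-1]
            gap_a, gap_b = a - a1, b - b1
            # Merge only when the run continues forward in both transcripts
            # and the unmatched gap is small enough on each side.
            if 0 <= gap_a <= max_gap and 0 <= gap_b <= max_gap:
                segments[-1] = (a0, a + size, b0, b + size, matched + size)
                continue
        segments.append((a, a + size, b, b + size, size))
    return segments


runs = [(0, 0, 4), (6, 5, 3), (20, 30, 5)]
print(chain_runs(runs, max_gap=2))  # [(0, 9, 0, 8, 7), (20, 25, 30, 35, 5)]
```

With max_gap=2 the first two runs merge (gaps of 2 and 1 words), while the third starts too far away and opens a new segment.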

Min segment length (words)

Drop merged segments shorter than this (measured as the average of the two sides).
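As a small worked example of the length rule, here is a Python sketch. The names average_length and drop_short and the (a_start, a_end, b_start, b_end) layout are hypothetical; only the averaging of the two sides comes from the description above.

```python
def average_length(segment):
    """Average word count of a segment across the two transcripts."""
    a_start, a_end, b_start, b_end = segment  # exclusive ends (assumed layout)
    return ((a_end - a_start) + (b_end - b_start)) / 2


def drop_short(segments, min_length=20):
    """Keep only segments whose average length reaches min_length."""
    return [s for s in segments if average_length(s) >= min_length]


segments = [(0, 30, 100, 126), (50, 60, 200, 212)]
print(drop_short(segments, min_length=20))  # [(0, 30, 100, 126)]
```

The first segment spans 30 and 26 words (average 28) and is kept; the second spans 10 and 12 words (average 11) and is dropped.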

Min purity

Matched words ÷ total segment words. Keeps segments where at least this fraction is a true match.
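The purity ratio can be illustrated with a short Python sketch. The function name purity is hypothetical, and the sketch assumes that "total segment words" means the average of the segment's length in the two transcripts; the tool may count the total differently.

```python
def purity(matched_words, len_a, len_b):
    """Matched words divided by total segment words.

    Assumption for this sketch: the total is the average of the
    segment's word count in transcript 1 and transcript 2.
    """
    total = (len_a + len_b) / 2
    return matched_words / total if total else 0.0


# 45 matched words in a segment spanning 50 words on each side.
print(purity(45, 50, 50))  # 0.9
```

Under this reading, a segment with 45 matched words across 50-word spans passes a minimum purity of 0.8 but fails one of 0.95.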

Show full transcripts with segments highlighted