Einar Egilsson

Subtitle Fixer


I was watching a movie on my computer the other day with subtitles I had gotten off the internet, I think from http://opensubtitles.com or something like that. The only problem was that they were a bit out of sync with the picture, about 2 seconds too late. With a good media player such as VLC you can add an offset to the subtitles every time you watch the movie, but I figured I could whip up a small script to fix the file once and then have correct subtitles every time I watched the movie. The subtitles were in the .srt format, which is basically just a very simple text file with timing information and the text. A typical screen looks something like this:

00:20:24,345 --> 00:20:25,200
Hello there.
How are you?
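
To make the format a bit more concrete, here is a minimal sketch of how the timing line of such a screen could be turned into milliseconds. It is written for Python 3, and the names TIMING, to_ms and parse_timing are made up purely for illustration.

```python
import re

# Matches a timing line such as "00:20:24,345 --> 00:20:25,200".
TIMING = re.compile(
    r'(\d\d):(\d\d):(\d\d),(\d\d\d) --> (\d\d):(\d\d):(\d\d),(\d\d\d)')


def to_ms(hours, minutes, seconds, millis):
    """Convert an hour/minute/second/millisecond timestamp to milliseconds."""
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_timing(line):
    """Return (start_ms, end_ms) for a timing line, or None for any other line."""
    match = TIMING.match(line)
    if match is None:
        return None
    nrs = [int(nr) for nr in match.groups()]
    return to_ms(*nrs[:4]), to_ms(*nrs[4:])


print(parse_timing('00:20:24,345 --> 00:20:25,200'))  # (1224345, 1225200)
print(parse_timing('Hello there.'))                   # None
```

Lines that carry the subtitle text itself do not match the pattern, so they can be passed through untouched.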

So I created a little script, suboffset.py, which takes the name of a subtitle file and the offset to add in milliseconds, modifies the file contents and writes them back to the same file. Alternatively, you can specify - as the filename and the script will read from stdin and write to stdout, so you can pipe the output to another file. The original script was about 10 lines, but to make it a little more useful I turned it into a module so that other scripts can use it, added the stdin option and wrote some comments. You can view the script below or download the actual script here.

#!/usr/bin/python
"""
Script to offset the time in subtitle files in the .srt format.
Script takes in the filename and the offset (in milliseconds) to add
or subtract from the subtitles. It then writes the new subtitles to
the same file. Alternately you can specify the filename as '-' and
then the script will read input from stdin and write output to stdout.
"""

__version__ = '1.0'
__author__ = 'Einar Egilsson'
__date__ = 'March 20th 2008'
__url__ = 'http://einaregilsson.com/subtitle-fixer/'

import sys, re, datetime

MILLISECOND = 1
SECOND = 1000 * MILLISECOND
MINUTE = 60 * SECOND
HOUR = 60 * MINUTE

def offset_time(time, offset):
    """
    Takes in list with [hour, minute, second, millisecond] and
    returns it with offset milliseconds added and normalized
    """
    ms = sum(map(int.__mul__, time, [HOUR, MINUTE, SECOND, MILLISECOND]))
    ms += offset
    return [ms / HOUR, ms % HOUR / MINUTE, ms % MINUTE / SECOND, ms % SECOND]

def fix_subtitles(lines, offset, output):
    """
    Takes in list (lines) with all the lines from the subtitle file,
    adds offset milliseconds to it and writes the file to output.
    """
    for line in lines:
        pattern = r'(\d\d):(\d\d):(\d\d),(\d\d\d) --> (\d\d):(\d\d):(\d\d),(\d\d\d)'
        match = re.match(pattern, line)
        if match:
            nrs = [int(nr) for nr in match.groups(0)]
            start = offset_time(nrs[:4], offset)
            end = offset_time(nrs[4:], offset)
            output.write('%02d:%02d:%02d,%03d' % tuple(start))
            output.write(' --> ')
            output.write('%02d:%02d:%02d,%03d\n' % tuple(end))
        else:
            output.write(line)

def print_header():
    print 'Subtitle Fixer v%s' % __version__
    print 'Author: %s' % __author__
    print __url__
    print ''

if __name__ == '__main__':
    if len(sys.argv) != 3:
        print_header()
        print 'Usage: suboffset.py <filename> <offset-in-milliseconds>'
        print 'Use - for filename to read from stdin and print to stdout'
        sys.exit(1)

    offset = int(sys.argv[2])
    file = None
    if sys.argv[1] == '-':
        #Read from stdin and print to stdout
        fix_subtitles(sys.stdin.readlines(), offset, sys.stdout)
    else:
        #read from file and write to same file
        file = open(sys.argv[1], 'r')
        lines = file.readlines()
        file.close()
        file = open(sys.argv[1], 'w')
        fix_subtitles(lines, offset, file)
        file.close()
        print 'Finished adding %s milliseconds to %s' % (offset, sys.argv[1])

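The script above is written for Python 2, where print is a statement and / on two integers gives an integer. For anyone on Python 3, here is a rough sketch of the same idea, not a drop-in replacement. The names TIMESTAMP, shift_timestamp and shift_subtitles are hypothetical, and clamping times at zero is an added assumption that the script above does not make. In-memory streams stand in for stdin and stdout.

```python
import io
import re

# One hh:mm:ss,mmm timestamp; a timing line contains two of them.
TIMESTAMP = re.compile(r'(\d\d):(\d\d):(\d\d),(\d\d\d)')


def shift_timestamp(match, offset):
    """Shift one matched timestamp by offset milliseconds and re-format it."""
    hours, minutes, seconds, millis = (int(nr) for nr in match.groups())
    ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis + offset
    ms = max(ms, 0)  # assumption: clamp at zero instead of going negative
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return '%02d:%02d:%02d,%03d' % (hours, minutes, seconds, millis)


def shift_subtitles(lines, offset, output):
    """Write every line to output, shifting the timestamps on timing lines."""
    for line in lines:
        if ' --> ' in line:
            line = TIMESTAMP.sub(lambda m: shift_timestamp(m, offset), line)
        output.write(line)


# StringIO objects play the role of stdin and stdout for this example.
source = io.StringIO(
    '00:20:24,345 --> 00:20:25,200\nHello there.\nHow are you?\n')
result = io.StringIO()
shift_subtitles(source, -2000, result)
print(result.getvalue())
```

A negative offset makes the subtitles show up earlier, which is what subtitles that arrive about 2 seconds too late need.
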