Subtitle Fixer
Posted: Last updated:I was watching a movie on my computer the other day and I had gotten the subtitles for it off the internet, I think from http://opensubtitles.com or something like that. The only problem was that they were a bit out of sync with the picture, about 2 seconds too late. Using a good media player, such as VLC you can add an offset to the subtitles every time you watch the movie but I figured I could probably whip up a small script to do it for me so I could just do it once and then have the subtitles correct every time I watched the movie. The subtitles were in the .srt format, which is basically just a very simple text file with time information and the text, a typical screen is something like
00:20:24,345 --> 00:20:25,200 Hello there. How are you?
So I created a little script, suboffset.py, which takes in the name of a subtitle file and the offset to add to the file in milliseconds, modifies the file contents and writes it back out to the same file. Alternately you can specify - as the filename and then the script will read the file from stdin and write the output to stdout so you can pipe it together to write to another file. The original script was about 10 lines, but just to make it a little more useful I made it a module so you could get access to the script from other scripts and added the stdin option and some comments. You can view the script below or download the actual script here.
#!/usr/bin/python
"""
Script to offset the time in subtitle files in the .srt format. Script
takes in the filename and the offset (in milliseconds) to add or
subtract from the subtitles. It then writes the new subtitles to the
same file. Alternately you can specify the filename as '-' and then
the script will read input from stdin and write output to stdout.
"""
__version__ = '1.0'
__author__ = 'Einar Egilsson'
__date__ = 'March 20th 2008'
__url__ = 'http://einaregilsson.com/subtitle-fixer/'
import sys, re, datetime
MILLISECOND = 1
SECOND = 1000 * MILLISECOND
MINUTE = 60 * SECOND
HOUR = 60 * MINUTE
def offset_time(time, offset):
""" Takes in list with [hour, minute, second, millisecond] and returns
it with offset milliseconds added and normalized
"""
ms = sum(map(int.__mul__, time, [HOUR, MINUTE, SECOND, MILLISECOND]))
ms += offset
return [ms / HOUR, ms % HOUR / MINUTE, ms % MINUTE / SECOND, ms % SECOND]
def fix_subtitles(lines, offset, output):
""" Takes in list (lines) with all the lines from the subtitle file, adds
offset milliseconds to it and writes the file to output.
"""
for line in lines:
pattern = r'(\d\d):(\d\d):(\d\d),(\d\d\d) --> (\d\d):(\d\d):(\d\d),(\d\d\d)'
match = re.match(pattern, line)
if match:
nrs = [int(nr) for nr in match.groups(0)]
start = offset_time(nrs[:4], offset)
end = offset_time(nrs[4:], offset)
output.write('%02d:%02d:%02d,%03d' % tuple(start))
output.write(' --> ')
output.write('%02d:%02d:%02d,%03d\n' % tuple(end))
else:
output.write(line)
def print_header():
print 'Subtitle Fixer v%s' % __version__
print 'Author: %s' % __author__
print __url__
print ''
if __name__ == '__main__':
if len(sys.argv) != 3:
print_header()
print 'Usage: suboffset.py <filename> <offset-in-milliseconds>'
print 'Use - for filename to read from stdin and print to stdout'
sys.exit(1)
offset = int(sys.argv[2])
file = None
if sys.argv[1] == '-': #Read from stdin and print to stdout
fix_subtitles(sys.stdin.readlines(), offset, sys.stdout)
else: #read from file and write to same file
file = open(sys.argv[1], 'r')
lines = file.readlines()
file.close()
file = open(sys.argv[1], 'w')
fix_subtitles(lines, offset, file)
file.close()
print 'Finished adding %s milliseconds to %s' % (offset, sys.argv[1])