Project Description
Converts subtitles from DVDs and PGS (Blu-ray .sup) files into Advanced SubStation Alpha and SRT text formats using OCR (optical character recognition).

I created this app to extract subtitles from DVDs (non-encrypted, stored on a hard drive) and convert them to Advanced SubStation Alpha format. It can also convert sup (PGS) and sub/idx files to the same format. I wrote it because I hate the blocky, too-high-on-the-screen look of regular DVD subtitles and wanted to re-encode my DVD collection as h264/aac/assa in an mkv container.

It's a wizard-style app that lets you pick program chains, angles, audio and subtitle tracks from a DVD folder and creates mpg, d2v and bin files for each (bin is my own data format, similar to sub and idx combined). DGIndex is used to help line up the subtitles with the video, since DVD programs often have discontinuities that throw off sync.

The OCR is fairly basic: exact pattern matching of characters for DVDs, with some fuzzy logic added for high-definition subtitles. The starting OCR database is pretty good, though, so most DVDs should require manual matching of only a few characters. Some characters, such as l, I, 1 and o, must be manually matched for every DVD because they produce a lot of false positives.
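
To illustrate what exact pattern matching means here, the following is a minimal Python sketch, not the app's actual code. It assumes each character has already been cut out of the subtitle image as a small bitmap (represented as rows of '#' and '.'), and that the OCR database is a plain lookup table from bitmap to text. The names GlyphDatabase, learn, lookup and recognize are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# A glyph is a bitmap stored as a tuple of row strings ('#' = ink, '.' = background).
Glyph = Tuple[str, ...]


class GlyphDatabase:
    """Hypothetical OCR database: maps an exact bitmap to the text it stands for."""

    def __init__(self) -> None:
        self._known: Dict[Glyph, str] = {}

    def learn(self, glyph: Sequence[str], text: str) -> None:
        # Store a manually matched glyph so later occurrences are recognized.
        self._known[tuple(glyph)] = text

    def lookup(self, glyph: Sequence[str]) -> Optional[str]:
        # Exact matching: a single differing pixel means no match.
        return self._known.get(tuple(glyph))


def recognize(glyphs: Sequence[Sequence[str]], db: GlyphDatabase) -> Tuple[str, List[int]]:
    """Return the recognized text and the indexes of glyphs needing manual matching."""
    chars: List[str] = []
    unmatched: List[int] = []
    for index, glyph in enumerate(glyphs):
        text = db.lookup(glyph)
        if text is None:
            chars.append("?")  # placeholder until the user matches this glyph
            unmatched.append(index)
        else:
            chars.append(text)
    return "".join(chars), unmatched
```

In a scheme like this, every glyph the user matches by hand is stored once and then recognized automatically for the rest of the disc, which is why a good starting database leaves only a few characters to match.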

The line and word layout functions are fairly sophisticated and should give good results unless the text is very unusual (vertical or upside-down text works poorly).
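
As a rough idea of what word layout involves, here is a deliberately simple Python sketch that splits one line of character boxes into words wherever the horizontal gap exceeds a threshold. This is an assumed, simplified approach for illustration only; the app's real layout functions are more involved, and the names Box, group_into_words and gap_threshold are hypothetical.

```python
from typing import List, Tuple

# One character's horizontal extent on a text line: (left, right) pixel columns.
Box = Tuple[int, int]


def group_into_words(boxes: List[Box], gap_threshold: int) -> List[List[Box]]:
    """Group character boxes into words; a gap wider than gap_threshold starts a new word."""
    words: List[List[Box]] = []
    for box in sorted(boxes):  # process characters left to right
        if words and box[0] - words[-1][-1][1] <= gap_threshold:
            words[-1].append(box)  # close to the previous character: same word
        else:
            words.append([box])  # wide gap (or first character): new word
    return words
```

A left-to-right gap test like this only makes sense for horizontal text, which hints at why vertical or upside-down subtitles are hard to lay out.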

Last edited Dec 9, 2011 at 6:47 PM by crmeadowcroft, version 4