OMG OMG OMG! The programmer in me has just thought of a VERY COOL feature you could add!

Call it a visual tool for super fast comparison. (OCR can only get so good, and if you want to verify perfect subs, this is a good way to do it.) The goal should always be perfect OCR on the first sweep, but visually checking the subs afterwards is just to verify, and the quicker you can do that the better. Please let me know what you think, because I think it would be AWESOME! I am sure it would actually be pretty fun to program. I am drawing an illustration in Photoshop now.

EDIT: OK, to illustrate my idea. Right now you use the arrow key to go down line by line, reading the text and then looking at the image to compare and see that they are the same. Now, that is not exactly quick: the brain has to think more, it has to remember more, and your eyes have to move and focus on more than one area. Below is my idea:

Basically, use an OpenGL or DirectX library that can overlay text, or any library that looks like it will work to overlay text with transparency, and size the text to roughly overlay the SUB image with something like 50-60% transparency.

![]()

The letters don't have to line up perfectly; anywhere close will let you tell with just a glance whether the sub and the text match visually. (Basically you read the sub line ONLY once, and your brain looks for discrepancies as you do it, versus reading two or three times, moving your eyes between locations, and having to remember and hope you remember correctly.) I think for somebody who visually checks the OCR for their subs, this would probably speed up the process by 200%+.

See how easy it is to see that they match:

Here is one that passed the OCR, but is incorrect:

Here is another one that passed the OCR, but is incorrect (depending on the library used you could even apply a border/stroke to the outside of the letters):

![]() ![]()
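To make the overlay idea a bit more concrete, here is a rough sketch of the blending step in plain Python. It is only an illustration under assumptions: the images are small grayscale grids (lists of rows, values 0-255), the function names `overlay` and `mismatch_ratio` are made up for this example, and a real tool would render the text and do the blending with a graphics library rather than by hand.

```python
def blend_pixel(sub, text, transparency):
    """Blend one grayscale value; `transparency` applies to the text layer."""
    opacity = 1.0 - transparency
    return round(opacity * text + transparency * sub)


def overlay(sub_image, text_image, transparency=0.55):
    """Lay the rendered OCR text over the subtitle bitmap, semi-transparently."""
    same_shape = len(sub_image) == len(text_image) and all(
        len(srow) == len(trow) for srow, trow in zip(sub_image, text_image)
    )
    if not same_shape:
        raise ValueError("both images must have the same size")
    return [
        [blend_pixel(s, t, transparency) for s, t in zip(srow, trow)]
        for srow, trow in zip(sub_image, text_image)
    ]


def mismatch_ratio(sub_image, text_image, threshold=128):
    """Fraction of pixels where only one of the two layers has ink."""
    total = 0
    differing = 0
    for srow, trow in zip(sub_image, text_image):
        for s, t in zip(srow, trow):
            total += 1
            if (s >= threshold) != (t >= threshold):
                differing += 1
    return differing / total if total else 0.0


# Tiny 1x3 example: bright pixels are "ink" in this sketch.
sub_row = [[0, 255, 255]]
text_row = [[255, 255, 0]]
blended = overlay(sub_row, text_row, transparency=0.6)
```

Where both layers have ink the blended pixel stays at full brightness, and where only one layer has ink it comes out dimmer, which is the visual cue the eye picks up at a glance. `mismatch_ratio` is an optional extra that puts a number on the same thing.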
Recognizing from SUP format, I tried both methods and both have significant inaccuracies:

In the pattern comparison mode, the engine totally ignores the differences between the letters 'i' and 'l', and between 'c', 'o' and 'e'. All the letters are assigned the character that was assigned at the first occurrence of one of the letters from the "same" group. The 1st subtitle contains the word "more": the wizard stops at 'o' and I assign it 'o'. When it passes over 'e', it does not ask again for a letter, even though that's the 1st 'e' in the subtitles, and automatically assigns it 'o'. I don't know if that's a result of some auto corrections made by SE, but it still gets wrongly assigned even if I turn off all the auto corrections on the right side. That's it for the character comparison method.

Tesseract seems to work better but has considerable flaws too: some characters are auto-uppercased even if they are lowercase in the source matrix; this especially concerns 's', 'z', 'c' and 'a'. All occurrences of these letters seem to be uppercased, regardless of their case in the original matrix, if they stand as a standalone letter or as the 1st letter of a word. All 's', 'z', 'c' and 'a' are kept lowercase if they are in the middle of a word. The uppercase problem does not even seem repairable by spell checker processing!

Please give me some suggestions to make at least one of the methods functional, so that most words are recognized properly and don't need to be corrected by the spell checker.

This actually works really well! Almost all of the text is right on, and the GUI guides you through smoothly when it needs a fix.

I have a rather strange auto fix though (some kind of error or bug): if you want the SUP that caused this error, give me an email address I can send the file to. (Upon further testing, this weird error only happens if the "Try MS MODI OCR for unknown words" checkbox is checked. If I un-check it, this strange substitution does not happen.)

The only OCR error I get that does not get automatically corrected is the letter "k" being detected as "l<".

"My" ![]()

Is there a way I can add "l<" to be autocorrected to "k"?

Also, a setting in the options panel to disable "Try MS MODI OCR for unknown words" by default would be handy; then I wouldn't have to uncheck it for every subtitle I load.

I have also had "d" detected as "ol" pretty often. Then the spell checker doesn't recognize the word, so I edit it manually and change the "ol" to "d". For example, the word "worried" gets detected as "worrieol".
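For what it's worth, the misreads reported in these posts ("l<" for "k", "ol" for "d", and a leading S/Z/C/A that gets uppercased) could in principle be cleaned up in a post-OCR pass. The sketch below is not how Subtitle Edit or Tesseract handle it; it is a standalone illustration, and `KNOWN_WORDS`, `FRAGMENT_FIXES`, `SUSPECT_CAPITALS` and `clean_line` are hypothetical names with a toy word list standing in for a real dictionary.

```python
import re

# Toy word list (assumption); a real pass would load a full dictionary.
KNOWN_WORDS = {"worried", "know", "old", "saw", "a", "cat", "some", "zoo"}

# Misread fragment -> intended letter, taken from the reports above.
FRAGMENT_FIXES = [("l<", "k"), ("ol", "d")]

# Letters reported as wrongly uppercased at the start of a word.
SUSPECT_CAPITALS = set("SZCA")

# "<" is allowed inside a token so that "l<now" is seen as one word.
TOKEN = re.compile(r"[A-Za-z<']+")


def fix_fragments(word):
    """Swap a misread fragment only if that turns an unknown word into a known one."""
    if word.lower() in KNOWN_WORDS:
        return word  # e.g. "old" is a real word and must not become "d"
    for bad, good in FRAGMENT_FIXES:
        if bad in word:
            candidate = word.replace(bad, good)
            if candidate.lower() in KNOWN_WORDS:
                return candidate
    return word


def fix_leading_capital(word, sentence_start):
    """Lowercase a suspect first letter when the lowercase word is known."""
    if sentence_start or not word:
        return word
    rest_is_lower = word[1:] == word[1:].lower()
    if word[0] in SUSPECT_CAPITALS and rest_is_lower and word.lower() in KNOWN_WORDS:
        return word[0].lower() + word[1:]
    return word


def clean_line(line):
    """Apply both fixes word by word, leaving spacing and punctuation untouched."""
    out = []
    pos = 0
    sentence_start = True
    for match in TOKEN.finditer(line):
        gap = line[pos:match.start()]
        if any(ch in ".!?" for ch in gap):
            sentence_start = True
        out.append(gap)
        word = fix_fragments(match.group())
        out.append(fix_leading_capital(word, sentence_start))
        sentence_start = False
        pos = match.end()
    out.append(line[pos:])
    return "".join(out)


print(clean_line("I Saw A Cat. She was worrieol, you l<now."))
```

Checking each candidate against the word list is what keeps real words like "old" from being mangled, and it also leaves names that are not in the list (say, "Sam") capitalized. The capital fix is still a blunt heuristic: a name that happens to match a dictionary word would be lowercased too, so treat it as a starting point only.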