 joakoPremium join:2000-09-07 /dev/null kudos:5
| OCR Software that works? I have a very simple requirement: watch a source folder of .tiff and .pdf (image) files, run OCR, and save each result as a searchable PDF with the original image in another folder.
So far I have used:
WatchOCR: A series of Linux scripts. Does not process 2,000 files without intervention.
OmniPage 17.0: Does not process 200 files without intervention. Constantly hangs and requires restarting the application.
OmniPage 17.1: Same as 17.0.
OmniPage 18.0: Not sure I can recommend spending $200 for an upgrade.
I would use Adobe Acrobat, but 1) it can't "watch" a folder and 2) the EULA forbids this sort of use. I called Adobe sales and they do have a product that does this: LiveCycle PDF, which "starts at $40,000".
Any suggestions? -- PRescott7-2097 |
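For anyone who wants to script the folder-watching part themselves, here is a minimal Python sketch of the requirement in the first post: poll a source folder for .tiff and .pdf files and hand each new one to an OCR step that writes a searchable PDF into another folder. It is only an illustration under assumptions, not how any product in this thread works. The `run_ocr` callable is hypothetical and stands in for whatever OCR engine you actually use; the function names, the 30-second interval, and the extra `.tif` suffix are also assumptions.

```python
import sys
import time
from pathlib import Path
from typing import Callable, List

# .tiff and .pdf come from the post; .tif is an assumed common variant.
WATCHED_SUFFIXES = {".tif", ".tiff", ".pdf"}


def pending_files(source: Path, dest: Path) -> List[Path]:
    """Return input files in `source` that have no output PDF in `dest` yet."""
    todo = []
    for path in sorted(source.iterdir()):
        if not path.is_file() or path.suffix.lower() not in WATCHED_SUFFIXES:
            continue
        if not (dest / (path.stem + ".pdf")).exists():
            todo.append(path)
    return todo


def process_once(source: Path, dest: Path,
                 run_ocr: Callable[[Path, Path], None]) -> List[Path]:
    """Run one pass over the source folder; return the files that failed."""
    dest.mkdir(parents=True, exist_ok=True)
    failed = []
    for path in pending_files(source, dest):
        target = dest / (path.stem + ".pdf")
        try:
            # Hypothetical OCR step: must write a searchable PDF to `target`.
            run_ocr(path, target)
        except Exception as exc:
            # One bad file should not stop the rest of the batch.
            print(f"OCR failed for {path.name}: {exc}", file=sys.stderr)
            failed.append(path)
    return failed


def watch(source: Path, dest: Path,
          run_ocr: Callable[[Path, Path], None],
          interval: float = 30.0) -> None:
    """Poll forever, processing whatever is new on each pass."""
    while True:
        process_once(source, dest, run_ocr)
        time.sleep(interval)
```

A file that fails is logged and picked up again on the next pass, so a single bad scan does not require intervention. Two inputs with the same base name (for example a.tiff and a.pdf) would map to the same output file, so a real setup would need a naming rule for that case.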
|
 | Is this for an office? The reason I ask is that our copiers are from Toshiba and they come with a piece of software, ReRite, that does almost exactly what you want. Your copier manufacturer may have something similar.
I think ReRite is not developed by Toshiba but by RedMap, an Australian company, so it might be possible to buy it without the copier. |
|
 techjoePremium join:2004-02-20 Warrenville, IL kudos:1 | reply to joako Two to check out: BlueBeam and Nuance -- Baka wa shinanakya naoranai |
|
 | reply to joako I installed the demo of Ademero PDF capture software last week, and it does exactly what you want it to do. I haven't had time to really play with it much, but I did set up a few quick capture folders and tested the conversion to searchable PDF and its indexing. »www.ademero.com/products/content-central/ |
|
 joakoPremium join:2000-09-07 /dev/null kudos:5
| reply to mikkopel
said by mikkopel: Is this for an office? The reason I ask is that our copiers are from Toshiba and they come with a piece of software, ReRite, that does almost exactly what you want. Your copier manufacturer may have something similar.
I think ReRite is not developed by Toshiba but by RedMap, an Australian company, so it might be possible to buy it without the copier.
We're using Lexmark machines, but it seems what they have to offer isn't included for free, and more than likely is in the Adobe LiveCycle price range: »www1.lexmark.com/en_US/solutions···on.shtml -- PRescott7-2097 |
|
 | Ours wasn't free either, and I forget the actual price, but it was nowhere near the $40,000 mark. I think the list price was about $1,000. |
|
 jp10558Premium join:2005-06-24 Willseyville, NY | reply to joako Well, $200 to see if 18.0 works better is a much better starting place than $40,000 to see if Adobe works better IMO... |
|
 tekmunkiTekmunkiPremium join:2001-12-06 Lake City, FL | reply to joako said by joako  WatchOCR: A series of Linux scripts. Does not process 2,000 files without intervention
I'm about to add a WatchOCR box as a scanning destination for a shared copier. What 'intervention' did it need? -- TekMunki
"There are 10 types of people in this world, those who understand binary and those who don't."
www.tekmunki.com |
|
 joakoPremium join:2000-09-07 /dev/null kudos:5
| said by tekmunki: said by joako: WatchOCR: A series of Linux scripts. Does not process 2,000 files without intervention
I'm about to add a WatchOCR box as a scanning destination for a shared copier- What 'intervention' did it need?
First, compared with ABBYY or OmniPage, WatchOCR loses hands down. With WatchOCR I would get many unrecognizable words, while the others have a near-100% recognition rate. When I searched for common terms in my document management system, the number of hits was lower for documents processed with WatchOCR.
Second, rotation didn't seem to work properly. Maybe this is related to the hanging processes? I would see many messages about detected orientation, but the output files weren't recognized.
WatchOCR has a setting for max processes; I had it set to 4. After a while, running ps aux would show processes that had been running for over 12 hours with low CPU use. I would have to kill these processes for the OCR run to continue.
I installed WatchOCR on Ubuntu 10.10, not using the LiveCD. Maybe the LiveCD doesn't have these issues, but I needed FTP server support.
ABBYY FineReader Corporate is $600 on their website, but you can buy the same SKU for $270: »www.provantage.com/abbyy-softwar···Y038.htm I'm using the trial right now; you can call their sales dept. and request a 15-day trial. -- PRescott7-2097 |
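The manual cleanup described in this post (checking ps for processes that have run for over 12 hours with low CPU use, then killing them) could be automated along these lines. This is a rough Python sketch under assumptions, not part of WatchOCR: the process name passed in is hypothetical, the 1% CPU threshold is a guess at what "low CPU use" means, and it assumes a `ps` that supports the `etimes` (elapsed seconds) output field, which older versions may lack.

```python
import os
import signal
import subprocess
from typing import Iterable, List

MAX_AGE_SECONDS = 12 * 60 * 60  # "over 12 hours", from the post
MAX_CPU_PERCENT = 1.0           # assumed cutoff for "low CPU use"


def find_stale(ps_lines: Iterable[str], name: str,
               max_age: int = MAX_AGE_SECONDS,
               max_cpu: float = MAX_CPU_PERCENT) -> List[int]:
    """Return PIDs of `name` processes that are old and nearly idle.

    Each line is expected as: PID ELAPSED_SECONDS CPU_PERCENT COMMAND
    """
    stale = []
    for line in ps_lines:
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        pid, elapsed, cpu, command = parts
        if command.strip() != name:
            continue
        if int(elapsed) > max_age and float(cpu) < max_cpu:
            stale.append(int(pid))
    return stale


def kill_stale(name: str) -> List[int]:
    """List processes with ps, then send SIGTERM to the stale ones."""
    result = subprocess.run(
        # Separate -o options with "=" suppress the header line.
        ["ps", "-e", "-o", "pid=", "-o", "etimes=", "-o", "pcpu=", "-o", "comm="],
        capture_output=True, text=True, check=True,
        env={**os.environ, "LC_ALL": "C"},  # keep "." as the decimal separator
    )
    pids = find_stale(result.stdout.splitlines(), name)
    for pid in pids:
        os.kill(pid, signal.SIGTERM)
    return pids
```

Something like `kill_stale("ocrworker")` run from cron every hour would do what was being done by hand, where "ocrworker" is a placeholder for whatever the OCR process is actually called on your box. Note that `comm` holds only the short command name, which Linux truncates to 15 characters.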
|