

joako
Premium
join:2000-09-07
/dev/null
kudos:5
Reviews:
·Comcast

OCR Software that works?

I have a very simple requirement: watch a source folder of .tiff and .pdf (image) files, run OCR on each one, and save the result in another folder as a searchable PDF that keeps the original image.
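For anyone who ends up scripting this, here is a minimal sketch of the watch-folder part in Python. It is not taken from any of the products in this thread. The `ocr` argument is a hypothetical hook that you would point at whatever OCR engine you settle on, and the `failed` folder is an assumption of mine so that one bad file cannot stall the batch.

```python
import shutil
from pathlib import Path
from typing import Callable, List

# Extensions from the requirement above.
WATCHED_SUFFIXES = {".tiff", ".pdf"}


def process_pending(source: Path, dest: Path, failed: Path,
                    ocr: Callable[[Path, Path], None]) -> List[Path]:
    """Make one pass over `source` and return the PDFs written to `dest`."""
    dest.mkdir(parents=True, exist_ok=True)
    failed.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(source.iterdir()):
        if not src.is_file() or src.suffix.lower() not in WATCHED_SUFFIXES:
            continue
        out = dest / (src.stem + ".pdf")
        if out.exists():
            continue  # already converted on an earlier pass
        try:
            ocr(src, out)  # hypothetical hook: wrap your OCR engine here
        except Exception:
            # Drop any partial output and set the input aside,
            # so the remaining files still get processed.
            out.unlink(missing_ok=True)
            shutil.move(str(src), str(failed / src.name))
            continue
        written.append(out)
    return written
```

Calling `process_pending` in a loop with a `time.sleep` between passes gives a crude watcher. Note that two inputs with the same stem (say `a.tiff` and `a.pdf`) would map to the same output name, so a real setup would need a naming rule for that case.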

So far I have used:

WatchOCR: A series of Linux scripts. Does not process 2,000 files without intervention
OmniPage 17.0: Does not process 200 files without intervention. Constantly hangs and requires restarting the application
OmniPage 17.1: Same as 17.0
OmniPage 18.0: Not sure I can recommend spending $200 for an upgrade

I would use Adobe Acrobat, but 1) it can't "watch" a folder and 2) the EULA forbids this sort of use. I called Adobe sales, and they do have a product that does this: LiveCycle PDF, which "starts at $40,000".

Any suggestions?
--
PRescott7-2097

mikkopel

join:2002-10-25
Haverstraw, NY

Is this for an office? The reason I ask is that our copiers are from Toshiba, and they come with a piece of software, ReRite, that does almost exactly what you want. Your copier manufacturer may have something similar.

I think ReRite is not developed by Toshiba but by RedMap, an Australian company, so it might be possible to buy it without the copier.



techjoe
Premium
join:2004-02-20
Warrenville, IL
kudos:1

reply to joako
Two to check out: BlueBeam and Nuance
--
Baka wa shinanakya naoranai


mark1239

join:2000-07-13
Brandon, FL

reply to joako
I installed the demo of Ademero PDF capture software last week, and it does exactly what you want it to do. I haven't had time to really play with it much, but I did set up a few quick capture folders and tested the conversion to searchable PDF and its indexing.
»www.ademero.com/products/content-central/



joako
Premium
join:2000-09-07
/dev/null
kudos:5
Reviews:
·Comcast

reply to mikkopel

said by mikkopel:

Is this for an office? The reason I ask is that our copiers are from Toshiba, and they come with a piece of software, ReRite, that does almost exactly what you want. Your copier manufacturer may have something similar.

I think ReRite is not developed by Toshiba but by RedMap, an Australian company, so it might be possible to buy it without the copier.

We're using Lexmark machines, but it seems what they have to offer isn't included for free and is more than likely in the Adobe LiveCycle price range: »www1.lexmark.com/en_US/solutions···on.shtml
--
PRescott7-2097

mikkopel

join:2002-10-25
Haverstraw, NY

said by joako:

We're using Lexmark machines, but it seems what they have to offer isn't included for free and is more than likely in the Adobe LiveCycle price range: »www1.lexmark.com/en_US/solutions···on.shtml

Ours wasn't free either, and I forget the actual price, but it was nowhere near the $40,000 mark. I think the list price was about $1000.

jp10558
Premium
join:2005-06-24
Willseyville, NY

reply to joako
Well, $200 to see if 18.0 works better is a much better starting place than $40,000 to see if Adobe works better IMO...



tekmunki
Tekmunki
Premium
join:2001-12-06
Lake City, FL

reply to joako

said by joako:
WatchOCR: A series of Linux scripts. Does not process 2,000 files without intervention



I'm about to add a WatchOCR box as a scanning destination for a shared copier. What 'intervention' did it need?
--
TekMunki

"There are 10 types of people in this world, those who understand binary and those who don't."



www.tekmunki.com


joako
Premium
join:2000-09-07
/dev/null
kudos:5
Reviews:
·Comcast

said by tekmunki:

said by joako:
WatchOCR: A series of Linux scripts. Does not process 2,000 files without intervention



I'm about to add a WatchOCR box as a scanning destination for a shared copier. What 'intervention' did it need?

First, WatchOCR loses hands down compared with ABBYY or OmniPage. With WatchOCR I would get many unrecognizable words, while the others have a near-100% recognition rate. When I searched for common terms in my document management system, the number of hits was lower for documents processed with WatchOCR.

Second, rotation didn't seem to work properly. Maybe this is related to the hanging processes? I would see many messages about detected orientation, but the output files weren't recognized.

Third, WatchOCR has a setting for max processes, which I had set to 4. After a while, ps aux would show processes that had been running for over 12 hours with low CPU use. I had to kill these processes for the OCR to continue.
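If you script your own pipeline, one way to avoid babysitting hung jobs like these is to run each OCR job with a time limit. This is a rough Python sketch, not how WatchOCR works internally. The command you pass in and the limit you choose are up to you, and the function name is made up for illustration.

```python
import subprocess
from typing import Sequence


def run_with_timeout(cmd: Sequence[str], timeout_s: float) -> bool:
    """Run one job; return True on exit status 0, False if it fails or hangs."""
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this,
        # so the hung job does not linger in the process list.
        return False
    return result.returncode == 0
```

A file whose job returns False can be set aside and retried later. One caveat: only the direct child is killed, so an OCR wrapper script that spawns its own helpers may still leave those behind.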

I installed WatchOCR on Ubuntu 10.10, not using the LiveCD. Maybe the LiveCD doesn't have these issues, but I needed FTP server support.

ABBYY FineReader Corporate is $600 on their website, but you can buy the same SKU for $270: »www.provantage.com/abbyy-softwar···Y038.htm I'm using the trial right now; you can call their sales dept. and request a 15-day trial.
--
PRescott7-2097

Saturday, 25-May 14:27:33 Terms of Use & Privacy | feedback | contact | Hosting by nac.net - DSL,Hosting & Co-lo
over 13.5 years online © 1999-2013 dslreports.com.