Post by Jonathan Berry
Much truth, Vera, thank you.
I went ahead and bought the Fujitsu ScanSnap S510. So far I am mostly
pleased with the performance of the scanner and of the accompanying
software, Adobe Acrobat 8.0 Standard (which I have updated to 8.1.1).
I'm amazed at what you can do with Acrobat Standard, and perhaps I have
so far only scratched the surface.
I am a bit more iffy about the crippled Abbyy which is included in the
package. It does a pretty good job OCRing typed material, and it is
much better at recognizing page formats than what I experienced in my
last foray, around a decade ago,
into the world of OCR (Pagis Pro software that was so nasty that I
uninstalled it within half an hour and resolved never again to buy one
of their products). However, Abbyy still makes too many errors in
recognizing the text of old newspaper columns. The RTF file which it
produces is difficult to edit (the print is too small for the screen
and takes too much tweaking to make readable) and has lots of little
boxes that interfere with text flow. If you have hundreds of pages to
edit, you want to be able to get in quickly and do the necessary
corrections. The Abbyy Scan2Excel was unable to make sense of phone
bills. In many pages of bills, it was able to put the figures in
columns only three times, and on each occasion it did that job
differently. These were all in the same file and bills from the same
telephone company.
There are (almost) no user settings to adjust, and no real-time controls.
Admittedly, OCR is a more difficult task than the jobs which Acrobat
is called upon to do. And the Abbyy software with the ScanSnap is
based upon FineReader 7, while the current version is 9. The full
version 9 alone would cost more than the scanner with software.
Still, I don't have to be pleased with every bargain.
The bottom line is that I end up converting most documents to
text-searchable PDF files using Acrobat (it takes some moments of CPU
time, but it makes the resulting files smaller), and rarely do I use
the full Abbyy OCR.
--
Jonathan Berry
Post by s***@gmail.com
Post by Jonathan Berry
I'm considering buying the Fujitsu ScanSnap S510 which has a, er,
special version of abbyy, plus Adobe 8.
Reviews of this machine are very good, but I also have a fair number
of non-sheet things to scan to OCR. I figure I'll use my flat bed
scanner, but then will the "special" version of Abbyy OCR from those
jpg's, or is it "hard-wired" to only work with the ScanSnap?
Does Abbyy have any programmability? I do some scanning of chess
notation, which can be constrained by straightforward Regular
Expressions. Other OCRs I tried wanted to make words out of things
like Qe5 Na2 and so on, even with the dictionary turned off.
Any other comments are welcome. TIA.
--
Jonathan Berry
Hi Jonathan,
ScanSnap doesn't use common TWAIN drivers, as normal flatbed scanners
do. Instead, it uses its own unique software to scan. That is why ABBYY
FineReader for ScanSnap is "hard-wired" to ScanSnap only, and it
cannot accept images from other scanners.
As for scanning chess notation, you should look at ABBYY FineReader
9.0. It allows you to define user languages based on regular
expressions. You can download a trial version from the ABBYY site and
see if it works for your images: http://finereader.abbyy.com/
The user languages functionality is available only in the retail
version of ABBYY FineReader. The version for ScanSnap doesn't include
it.
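
To illustrate the general idea of constraining recognized text with a
regular expression, here is a minimal Python sketch. It is not ABBYY's
user-language syntax and does not call FineReader; it only shows how
OCR output could be checked afterwards against a pattern for chess
moves. The pattern, the function names, and the sample text are all
assumptions made for illustration, and the pattern covers only common
algebraic notation.

```python
import re

# Assumed pattern for common algebraic chess notation (illustrative only).
SAN_MOVE = re.compile(
    r"^(?:O-O(?:-O)?"                        # castling: O-O or O-O-O
    r"|[KQRBN][a-h]?[1-8]?x?[a-h][1-8]"      # piece move, e.g. Qe5, Nbd7, Rxa1
    r"|(?:[a-h]x)?[a-h][1-8](?:=[QRBN])?)"   # pawn move, e.g. e4, exd5, e8=Q
    r"[+#]?$"                                # optional check or mate marker
)

# Move numbers such as "12." or "12..." are not moves and are skipped.
MOVE_NUMBER = re.compile(r"\d+\.(?:\.\.)?")


def is_plausible_move(token: str) -> bool:
    """Return True if the token looks like a chess move."""
    return SAN_MOVE.match(token) is not None


def flag_suspect_tokens(ocr_text: str) -> list[str]:
    """Return whitespace-separated tokens that match neither pattern."""
    suspects = []
    for token in ocr_text.split():
        if MOVE_NUMBER.fullmatch(token):
            continue
        if not is_plausible_move(token):
            suspects.append(token)
    return suspects


if __name__ == "__main__":
    # Hypothetical OCR output with one misread token ("Oe5" for "Qe5").
    sample = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Oe5 Na2"
    print(flag_suspect_tokens(sample))
```

A check like this only flags tokens for manual correction; a user
language inside the OCR program would instead steer recognition itself.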
Vera
Yes, it is well-known that for special fonts, degraded text material,
and handprint, such bundled software is not a good fit.
If you have anything more than 100 pages a day to scan, and you want
to do special things with those that enhance your productivity, you
need applications that go beyond what such "bundled" apps can provide;
that is not their focus anyway. The way it works for scanner
manufacturers is that they want to give as many people as possible a
reason to buy their product. So they strike special deals with
companies like ABBYY to "bundle" scaled-down/crippled versions of
software to give away with their scanners. For the software
manufacturer, the payoff is brand awareness: you have tried out their
technology and (hopefully) been impressed.
Beyond a certain point, even the Fujitsu ScanSnap begins to hamper
productivity... can you believe that there are document scanners that
cost hundreds of thousands of dollars and even millions of dollars?
Right, there's a range for every (mostly) imaginable need.
ABBYY, for example, makes their "engine" available to developers, and
they also have higher-end retail products with labels like
"professional/enterprise".
Many other manufacturers do similar things, depending on which model
of scanner you buy.
That is how ABBYY and other OCR/ICR manufacturers earn money... :-)
Best Regards,
Milind Joshi
IDEA TECHNOSOFT INC.
http://www.ideatechnosoft.com/ocr_icr.html