Scanner recommendation for decent scans

Post by lostinspac » Fri, 22 Jul 2005 23:17:26


---- Original Message -----
From: <>
Newsgroups: comp.periphs.scanners
Sent: Thursday, July 21, 2005 12:05 AM
Subject: REQ: Scanner recommendation for decent scans



Andy,

This inquiry might be better served in:
comp.doc.management
or
comp.ai.doc-analysis.ocr

I have done daily, extensive scanning for over five years with a
bottom-of-the-line scanner (a $50 Canon) and Omnipage 9.0. Many folks say
that 9.0 gives terrible quality; however, I'm most pleased with the results
compared to my previous scanner. (My machine is an Athlon 2.0 with 512k and
USB 2.0; none of this increases the actual scan speed, only the scanner can
do that.)

The first place I suggest you start is in cleaning both sides of the scanner
glass. It's a careful and tedious process to remove all the streaks,
smudges and accumulated plastic. I often repeat the cleaning 3-4 times in
each cleaning session for quality. (Caution: no paper towels, no Windex.)

Any DPI less than 300 will not offer good quality for OCR.

If you're working with either yellowed or aged documents, color is preferable
to black and white. You'll be surprised how much improvement this will make.

Small fonts and fractions will be a problem that you'll never resolve.
Flatbed scanners just don't have enough depth-of-field or magnification
options for small fonts. Many of the docs that I work with use fifths of
seconds repeatedly, and those rarely scan correctly.
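
A rough way to see why small type suffers at a given resolution: a
typographic point is 1/72 inch, so the nominal height of a font in scanner
pixels grows linearly with DPI. The sketch below is only that arithmetic;
font_height_px is a made-up helper name, and the sample sizes are
illustrative rather than taken from any OCR package.

```python
POINTS_PER_INCH = 72  # typographic (DTP) points per inch


def font_height_px(point_size, dpi):
    """Nominal body height of a font, in scanner pixels."""
    return point_size * dpi / POINTS_PER_INCH


# Compare a few font sizes at common scan resolutions.
for pts in (6, 10, 12):
    for dpi in (200, 300, 600):
        print(f"{pts} pt at {dpi} dpi: {font_height_px(pts, dpi):.1f} px")
```

At 300 dpi a 6 pt font is nominally only 25 pixels tall, and the numerator
and denominator of a single-character fraction each get well under half of
that.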

OCR corrections are best made in the OCR software itself rather than in a
spell checker. The OCR packages do offer a "change all".

As for multi-column newspaper OCR, your best results will come from using a
quality copy machine to enlarge the pages. I've had good results doing so
with some 100+ year-old newspapers (four column, 11 x 17) and the machines
at Kinko's.

In summary, I believe you're looking for an automated solution that just
doesn't exist. You either accept crappy results and move on, or do the
manual editing to assure the desired goal. There doesn't seem to be any
in-between.



 
 
 

Post by CSM1 » Fri, 22 Jul 2005 23:18:04


The only suggestion I have is:
If you are happy with the two scans per page, then there are newer and
faster A4 and Letter size scanners for about $100. For OCR, use
Omnipage 14 or Abbyy FineReader for much better results.

The best price for Omnipage 14 is found at:
http://www.yqcomputer.com/


Abbyy FineReader 7.0 Professional Upgrade for any OCR software you currently
own:
http://www.yqcomputer.com/

--
CSM1
http://www.yqcomputer.com/
--

 
 
 

Post by and » Sat, 23 Jul 2005 13:08:22

I did some more experimenting and found out that:

1. Scanning at 600dpi black and white with a higher threshold (so that
more black dots are produced) yields a much higher OCR accuracy.
OCR accuracy may be 99% or more.

2. I created a special OCR training bitmap file with the special
single-character fractions ('1/4', '1/2') spaced far apart so that there
is a lot of white space around each character. Using a large font such as
24 point and repeatedly OCR'ing each character allowed me to train the
OCR software to recognize those fractions.

The extra 4 or 5 minutes per scan actually saves grunt work in
correcting OCR errors and text ordering problems (i.e., where the layout
was OCR'ed incorrectly).
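
To illustrate point 1, here is a minimal sketch of what raising the
black-and-white threshold does. It assumes 8-bit grayscale values (0 =
black, 255 = white) and a tiny made-up patch of pixels; real scanner
drivers and OCR packages may define their threshold control differently.

```python
def binarize(gray_rows, threshold):
    """Convert 0-255 grayscale pixels to 1-bit: 0 = black, 1 = white.

    Pixels darker than `threshold` become black, so a higher threshold
    produces more black dots.
    """
    return [[0 if px < threshold else 1 for px in row] for row in gray_rows]


def black_count(bitmap):
    """Number of black pixels in a 1-bit bitmap."""
    return sum(row.count(0) for row in bitmap)


# Hypothetical patch: a faint stroke (about 140-150) on aged paper (about 200).
patch = [
    [200, 145, 150, 205],
    [198, 140, 148, 210],
    [202, 199, 201, 207],
]

for t in (128, 160):
    print(f"threshold {t}: {black_count(binarize(patch, t))} black pixels")
```

With the lower threshold the faint stroke vanishes entirely; with the
higher one all four stroke pixels survive as black.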

I do like the idea of photo-copying using this workflow:

1. Photocopy each 10" by 12" page, adjust the contrast and output on
8.5 by 11 inch paper

2. Scan at 600 dpi or higher (this is a single scan, which saves the time
previously spent trying to join two scans together)

3. OCR

4. Correct, reformat text, etc.
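
One thing worth checking with this workflow is how much resolution the
reduction costs. The sketch below assumes the journal pages are 10 by 12
inches, that the copier scales uniformly to fit the sheet, and that copier
margins are ignored; fit_scale is just an illustrative helper.

```python
def fit_scale(src_w, src_h, dst_w, dst_h):
    """Largest uniform scale that fits the source page on the target sheet."""
    return min(dst_w / src_w, dst_h / src_h)


SCAN_DPI = 600

# Page and sheet sizes in inches (copier margins ignored).
scale = fit_scale(10, 12, 8.5, 11)

# Resolution of the final scan expressed relative to the original page.
effective_dpi = SCAN_DPI * scale

# Pixel dimensions of a full 8.5 x 11 inch scan at SCAN_DPI.
width_px = round(8.5 * SCAN_DPI)
height_px = round(11 * SCAN_DPI)

print(f"reduction: {scale:.0%}")
print(f"effective resolution on the original: {effective_dpi:.0f} dpi")
print(f"scan size: {width_px} x {height_px} pixels")
```

Under those assumptions the copy is an 85% reduction, so a 600 dpi scan of
the copy is roughly equivalent to 510 dpi on the original page, still well
above 300 dpi.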

Advantages:

1. Handling the odd-sized, aged paper journals is easier since each
page is 'scanned' by the copier once instead of twice with an 8.5
by 11 scanner.

2. Copier can do a better job of contrast, threshold, etc. than the
scanner.

3. Copies could be sheet fed into scanner (requires me buying a new
scanner)

4. Full scan, OCR, output cycle could be automated with sheet feeder.






 
 
 

Post by Don » Sat, 23 Jul 2005 19:18:32

On Thu, 21 Jul 2005 14:17:26 GMT, "lostinspace"



In my experience *dry* microfiber cloth works the best! However, not
all microfiber cloths are made the same! Many that call themselves
microfiber, ain't! A true microfiber cloth has an almost *** y
feeling to it when used on glass. It's worthwhile getting two so when
one is in the wash the other one can be used.

Before that, I tried all sorts of liquids from various lens cleaning
liquids to *** and, yes, even Windex! Nevertheless, whatever the
liquid, there always seems to be a residue. I always used lens paper
because all other paper can cause scratches.

I still managed to scratch the glass, though, but it was due to a
grain of sand which got caught in the paper. So before any cleaning
it's worthwhile using a blow brush to get rid of big particles first.

Finally, the best way to check if the scanner glass is clean is to
open the lid and scan "nothing" in a darkened room. As the light
passes under the empty glass, look at it at a very shallow angle.
Actually, I get down so my eyes are level with the glass. It's
amazing what can be seen like that.

After the scan is done, examine it under maximum magnification (e.g.
in Photoshop) after increasing the brightness until black turns to
light gray. Any glass imperfections, scratches, debris or streaks will
just jump out at you!
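
The brightening step can be sketched in a few lines. This models
"brightness" as a simple gain that moves the median (background) level of
the empty scan up to light gray; Photoshop's own brightness control may
work differently, and the pixel values below are made up for illustration.

```python
from statistics import median


def lift_blacks(gray_rows, target=200):
    """Scale 0-255 pixel values so the background level lands at `target`.

    The background is taken to be the median pixel value of the scan.
    Results are clipped to 255.
    """
    flat = [px for row in gray_rows for px in row]
    background = max(median(flat), 1)  # avoid dividing by zero on pure black
    gain = target / background
    return [[min(255, round(px * gain)) for px in row] for row in gray_rows]


# Hypothetical lid-open scan: near-black background with one faint speck.
scan = [
    [10, 10, 11, 10],
    [10, 16, 10, 10],
    [11, 10, 10, 10],
]

for row in lift_blacks(scan):
    print(row)
```

A speck that was only 6 levels above the background in the raw scan ends
up clipped to white against a light gray field, which is why the defects
become so obvious.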

For compulsive scanner cleaners it's truly a horror to look at! ;o)

Don.