Since this is going to be done by several people, some of whom have no
scanner at all, we are going to have to find a common format. That
format must be a regular text file. Andy Mace will then accumulate all
the input from the volunteers and edit/format/index the data so it will
look good when it is uploaded to the internet site.
Text is also the most economical format in terms of the space it
requires. So it looks like that is what we will be converting it to,
regardless of how (scanner, OCR, typing, etc.).
Joe
Conn wrote:
>
> R.D. Waid wrote:
>
> > Joe,
> > I don't think anyone has suggested scanning the text and using an OCR
> > software package (optical character reader) to convert them to a word
> > processor-compatible format or even ASCII text. Scanning to produce
> > graphic images of each page would take a staggering amount of memory
> > and disk space. I have used OCR (Textbridge from Xerox) software at
> > work to convert scanned (graphical) text to editable ASCII or MSWord
> > documents with pretty good results; the material still needs to be
> > proofread to catch errors from scanning glitches, but at least the
> > file size would be manageable. Pictures, etc. would have to be scanned
> > and attached as .gifs or .jpegs to take up minimum space.
>
> The latest version of Textbridge (and Omnipage too, I think) will not only
> scan the words, but the photos, and reassemble the whole mess into an
> MSWord file that is a very close replica of the original document, right
> down to columns, font sizes and styles.
>
> This saves a ton of steps, and creates a very faithful reproduction in one
> easy step.
>
> -- Conn (conn@wctc.net)
>
> History shows again and again,
> how nature points out the folly of man.
--
"If you can't excel with talent, triumph with effort."
-- Dave Weinbaum in National Enquirer