Finally! It took a lot longer to edit and convert 420 TIFFs than I thought it would. You know how it is – life gets in the way, your editing software gets cranky and refuses to acknowledge certain files exist. The usual. I’m hopeful that future issues won’t take nearly as long. According to Margot Comstock, this was their largest issue and “the one that did us in.”
Here’s the first of my Softalk scans. At 300 dpi, using Adobe’s ClearScan option in Acrobat for OCR, the final file size is 222,789,632 bytes. And the quality? Really not so bad – much better than I thought it would be, and at a significant savings in file size. Here’s how it breaks down:
The original uncompressed, raw TIFFs at 600 dpi in color: 41,269,611,740 bytes. Yes, that’s 41 GB! Fills up the hard drive in a hurry…
Adobe Searchable Image OCR at 600 dpi: 1,473,820,425 bytes (1.47 GB)
Adobe ClearScan OCR at 600 dpi: 890,141,883 bytes (890 MB)
Adobe Searchable Image OCR at 300 dpi: 413,442,993 bytes (413 MB)
Adobe ClearScan OCR at 300 dpi: 222,789,632 bytes (223 MB)
Acrobat also offers an option called “Searchable Image (Exact)” which doesn’t do any compression at all and is therefore not really appropriate for this project.
If I could do compression percentages in my head, I’d be able to tell you that’s … well, a bunch of percentages anyway. Here’s the 223 MB version. I’m going to let you guys decide. Is this version acceptable to keep the size down (relatively speaking of course)? Should I offer both high and low resolution versions? Have your say in the comments, or drop me an email.
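For anyone who does want the percentages, a few lines of Python will do the arithmetic. This is only a sketch: the byte counts are the ones listed above, while the names `percent_saved` and `compression_ratio` are made up for illustration, and everything is measured against the raw 600 dpi TIFFs.

```python
# Sizes in bytes, copied from the breakdown above.
ORIGINAL_TIFFS = 41_269_611_740  # raw, uncompressed 600 dpi color TIFFs

PDF_SIZES = {
    "Searchable Image, 600 dpi": 1_473_820_425,
    "ClearScan, 600 dpi": 890_141_883,
    "Searchable Image, 300 dpi": 413_442_993,
    "ClearScan, 300 dpi": 222_789_632,
}


def percent_saved(original: int, compressed: int) -> float:
    """Share of the original size that was eliminated, as a percentage."""
    return (1 - compressed / original) * 100


def compression_ratio(original: int, compressed: int) -> float:
    """How many times smaller the compressed file is than the original."""
    return original / compressed


if __name__ == "__main__":
    for label, size in PDF_SIZES.items():
        saved = percent_saved(ORIGINAL_TIFFS, size)
        ratio = compression_ratio(ORIGINAL_TIFFS, size)
        print(f"{label}: {saved:.2f}% smaller (about {ratio:.0f}:1)")
```

Every version comes out more than 96% smaller than the raw TIFFs, and the 223 MB ClearScan file is roughly 185 times smaller than the originals.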

