“Imagine a banana. Or anything curved. Actually, don’t, cause it’s not curved or like a banana. Forget the banana!”
The Doctor, describing conceptual space, in “Doctor Who: Space and Time”
I’ve repeatedly written posts about the joy of reading books digitally and how to quickly scan paper books. Given that I spent last weekend at home and took care of the last books I had stored in my old room, I’d like to give a short update.
The books I scanned varied in size, color vs. b/w, number of pages, etc. I usually scan color books in color and at the best quality. In rare cases, I use grayscale (if it is really a b/w book with a lot of pictures), and for most paperbacks/novels I use b/w (except for the cover).
These are the settings for my ScanSnap S1500M:
(no compression setting)
However, what makes the scans worthwhile happens after the scan.
Using Acrobat (I still have Adobe Acrobat 9 Pro), I first run Reduce File Size (with “retain existing”, but saved as a new file) and then OCR.
Saving the documents as new files lets you easily keep the original scans in best quality, in case you ever need them. The effect of reducing the file size is incredible: just look at the sizes of the original scans, after Reduce File Size, and after OCR (OCR adds a little).
(61 files, from 12.71 GB to 1.2 GB to 1.6 GB)
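To put those numbers in perspective, here is a quick back-of-the-envelope calculation in Python. It only uses the three totals and the file count from above; the helper name `reduction_percent` is made up for this sketch, and I'm assuming 1 GB = 1000 MB for the per-book averages.

```python
def reduction_percent(before_gb: float, after_gb: float) -> float:
    """Return how much smaller after_gb is than before_gb, in percent."""
    return (1 - after_gb / before_gb) * 100


# Totals taken from the post: 61 files, 12.71 GB -> 1.2 GB -> 1.6 GB
FILES = 61
ORIGINAL_GB = 12.71
REDUCED_GB = 1.2
WITH_OCR_GB = 1.6

for label, size_gb in [("reduced", REDUCED_GB), ("reduced + OCR", WITH_OCR_GB)]:
    saved = reduction_percent(ORIGINAL_GB, size_gb)
    # Assumption: 1 GB = 1000 MB, so this is only a rough per-book average
    per_book_mb = size_gb * 1000 / FILES
    print(f"{label}: {saved:.1f}% smaller, about {per_book_mb:.0f} MB per book")
```

So even with the OCR text layer added, the collection ends up at roughly an eighth of its original size.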
After OCR is done (remember to select the correct language), you can read the books in “well-enough” quality and easily copy and paste interesting passages. A few books (about 1 in 20) have some OCR problems, and perhaps 1 in 50 is unusable for OCR.
But most of the time it works brilliantly.
I still have my books without them taking up any (relevant) space.
Happy reading 🙂