Tips for Scanning lots of Photos

The Golden Rule

My philosophy was that if I were scanning some pictures, I ought to be confident enough in the quality of my scan that I'd be comfortable throwing out the physical prints once I was finished. Obviously I didn't throw them out (there'd be zero point to that), but the principle still stands. Scanning is a pain, so if you're going to do it, you might as well do it right.

1. Set up your station - Do it right the first time

Taking out your scanner, setting it up, plugging it into your computer, and wrestling with a cheap Amazon USB-C to USB-A converter that often fails can make things tricky. So if you are sitting down to scan photos, I recommend a couch, a laptop, and a nice coffee table.

2. Maximize scans per minute

Run this operation in shifts. Spend 20 minutes scanning like a madman.

The Epson V600 came with software that advertised automatic image detection, meaning you could put four photos on the scanner bed, with a tiny bit of white space between them, and click "scan." In theory, you would get four individual photo files output. However, the software wasn't very accurate and the process was slow - it would take over a minute to scan just four photos. That was my biggest pain point. Why use a "feature" that requires a preview scan, and then an individual scan for each separate photo, taking 8x as long? I could just scan them manually in less time.

Without the automatic cropping mode enabled, the machine could make a single scan of 4 photos in about 12 seconds. Given how boring scanning is, and how long it takes to get set up and actually start scanning (almost like setting up a new printer each time you want to use it), I decided my goal was to maximize this made-up metric:

  • (# of photos digitized / minute)

That is, the number of individual photos that were safely digitized on my computer, with one important caveat: this didn't mean perfectly cropped. But that was irrelevant to the ultimate goal. So I would fit as many photos as I could on the scanner, make a single scan, and repeat. My plan was to generate as many scan files as possible in the least amount of time, and then figure out how to automatically crop them later. Even if I couldn't figure out a fast way to crop them, my ultimate goal was still accomplished: the photos were digitized (albeit not cropped). My theory was that there HAD to be an online tool that would crop them on the software side, rather than on the hardware side like the physical Epson machine. But I was wrong!
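To make the metric concrete, here is a small sketch that plugs in the timings mentioned above. The function name is made up for illustration, and the 60-second figure is an assumption: the auto-detect pass took "over a minute", so 60 seconds is its best case. Only scanner time is counted, not the time spent placing photos on the bed.

```python
def photos_per_minute(photos_per_scan, seconds_per_scan):
    """Photos digitized per minute, counting scanner time only."""
    return photos_per_scan * 60 / seconds_per_scan


# Auto-detect mode: 4 photos in "over a minute". 60 s is an assumed best case.
auto_detect = photos_per_minute(4, 60)

# Plain single-pass scan: 4 photos in about 12 seconds.
single_pass = photos_per_minute(4, 12)

print(f"auto-detect: {auto_detect:.0f} photos/min")
print(f"single pass: {single_pass:.0f} photos/min")
print(f"speed-up:    {single_pass / auto_detect:.0f}x")
```

With these round numbers the single-pass scan comes out about 5x faster, and the gap widens the longer the auto-detect pass actually runs.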

    Cropping with Python OpenCV

    I studied computer science in college and had some experience with OpenCV, a computer vision library with Python bindings. At the time I was on winter break and had just finished taking Machine Learning, so I was absolutely sure I was not going to crop a single photo by hand; it had to be possible to do it automatically. OpenCV lets computers "see", provided you tell it what to look for. I eventually taped together a Python script that looked for distinct shapes against a white background (the scanner bed). I'll have more details later but it's not very interesting.

    AutoCropper is Fast

    I calculated I was able to make about 3 scans a minute, allowing enough time to carefully place the images with enough separation. That came out to about 9 to 12 individual photos cropped and digitized per minute, going back to the original equation. Let's say I sat down and scanned for 30 minutes:

  • 4 photos per scan
  • 3 scans per minute
  • In 30 minutes: 360 perfectly cropped separate photos

I still remember the first time I ran my script via the command line and saw a flood of print statements announcing an image detected and its crop, and then checking the output folder. It was awesome. In 30 minutes of scanning, I had about 90 separate scans, each with 3 to 4 photos. I'd run the 90 raw scan files through my program, and output around 300 separate photos!
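The arithmetic behind those session totals is simple enough to check in a few lines. This is just a worked example using the figures above; the function name is made up for illustration.

```python
def photos_in_session(photos_per_scan, scans_per_minute, minutes):
    """Total individual photos digitized in one scanning session."""
    return photos_per_scan * scans_per_minute * minutes


scans = 3 * 30                        # 3 scans a minute for 30 minutes
best = photos_in_session(4, 3, 30)    # every scan holds 4 photos
low = photos_in_session(3, 3, 30)     # every scan holds only 3 photos

print(f"{scans} scans, between {low} and {best} photos")
```

The roughly 300 photos from a real session fall between the two bounds, which fits a mix of 3-photo and 4-photo scans.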