The homebuilt automatic micro fiche scanner
Why scan micro fiches?
When working with vintage computer systems, you absolutely need a lot of ancient documentation. Luckily people are scanning these old docs like crazy; the big archive site "bitsavers.org" is well filled ... especially with DEC stuff.
DEC distributed documentation primarily on paper. Even most of the micro fiche documents are print-outs which have been transferred to film. So if you find a micro fiche, its content is likely already digitized somewhere. But there are exceptions.
When you repair PDP-11s you need to run the "XXDP" diagnostic programs. These programs come with almost no user interface, and their error printouts are cryptic. Documentation is only found in the MACRO-11 assembler listings of those diagnostics. And as far as I know, these listings were distributed only as micro fiches.
After I fixed a pile of PDP-11/34s without proper XXDP documentation, I asked around for XXDP program listings. They already existed in several collections around me, but only as micro fiches. I could've read them on a classic micro fiche reader, but I decided to start a digitizing project (to give something back to bitsavers.org).
I planned for a volume of 400 fiches, containing about 50,000 document pages ("frames"). Scanning these proved to be impossible at first:
- You cannot scan fiches on a flatbed scanner. Even at 4800 dpi, the resolution is not sufficient.
- You cannot give the fiches to a commercial scan service: the costs will kill you.
- Interestingly, the nearby University of Göttingen operates public micro fiche scanners! But you have to position every single document frame on a fiche by hand. And scanning one frame takes almost a minute.
- You cannot buy an automatic micro fiche scanner: they are very, very expensive.
Because of all these difficulties, I decided to build my own scanner. Participating in the "DEC micro fiche underground" forum showed that this would fill a gap. And I got a lot of support from the guys of my computer club C-C-G.
My very own scanner
See it here at work:
To be usable, the micro fiche scanner should fulfill these criteria:
- Image quality: the scan resolution should be higher than the resolution of the film itself, otherwise information may be lost.
- fully automated movement of the fiche, so all document page frames can be scanned unattended. The only manual operation should be the changing of the fiches.
- The resulting document should have the format and the quality to be recognizable by OCR.
Components of the scanning rig are:
- A modified AGFA GEVAERT COPEX LP4 micro fiche reader. Not all readers show a good picture, but this one is fine.
- The screen of the Agfa reader is photographed with a good digital camera. I used a Canon EOS 500D DSLR with about 15 MPixel resolution.
- The DSLR lens is a 100mm lens with fixed focal length ("CANON MACRO LENS EF 100mm 1:2,8 USM"). Distance to the screen is about 2 meters. Use a telephoto-range lens, otherwise the screen images may get warped. And don't use a zoom! These have too many glass elements inside, which hurts picture quality.
- The fiche carrier of the Agfa reader is moved by an "ISEL" industrial CNC x/y positioner with stepper motors (Thanks, Thomas!). The positioner is controlled over RS232 using a proprietary protocol.
- A PC controls the positioner, triggers the DSLR, reads back the images and archives them. Any model with one RS232 and two USB 2.0 ports is usable.
- Pictures from the DSLR are transferred to the PC over a USB cable with CANON's "EOSUtility" software.
- The DSLR is triggered with a USB relay connected to the remote trigger cable input.
- The DSLR has an external power supply.
- The central component is the specialized control program, which calibrates and moves the CNC positioner and operates the DSLR.
- The raw photographed images must be processed by a chain of filters to yield OCR-ready black-and-white pages in a PDF document.
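The last step, turning raw screen photos into OCR-ready pages, could start with something like the following. This is only a minimal sketch and not the filter chain actually used in this project: it assumes the photo is already available as rows of 8-bit grayscale values, and the function names `find_screen_box`, `crop` and `binarize` as well as both threshold values are made up for illustration. It locates the bright screen area inside its dark surroundings, crops to it, and applies a fixed global threshold.

```python
from typing import List, Tuple

Image = List[List[int]]          # rows of 8-bit grayscale values, 0 = black
Box = Tuple[int, int, int, int]  # top, bottom, left, right (bottom/right exclusive)


def find_screen_box(img: Image, dark_level: int = 40) -> Box:
    """Bounding box of everything brighter than the dark border.

    dark_level is an assumed value; it has to be tuned to the real photos.
    """
    bright_rows = [y for y, row in enumerate(img) if max(row) > dark_level]
    bright_cols = [x for x in range(len(img[0]))
                   if max(row[x] for row in img) > dark_level]
    if not bright_rows or not bright_cols:
        raise ValueError("no bright screen area found")
    return bright_rows[0], bright_rows[-1] + 1, bright_cols[0], bright_cols[-1] + 1


def crop(img: Image, box: Box) -> Image:
    """Cut the image down to the given box."""
    top, bottom, left, right = box
    return [row[left:right] for row in img[top:bottom]]


def binarize(img: Image, threshold: int = 128) -> Image:
    """Fixed global threshold: 1 for bright pixels, 0 for dark ones."""
    return [[1 if value >= threshold else 0 for value in row] for row in img]


if __name__ == "__main__":
    # Tiny synthetic "photo": a dark border around a 2x3 screen area.
    photo = [
        [5, 5, 5, 5, 5, 5],
        [5, 200, 90, 220, 5, 5],
        [5, 210, 60, 230, 5, 5],
        [5, 5, 5, 5, 5, 5],
    ]
    page = binarize(crop(photo, find_screen_box(photo)))
    for row in page:
        print(row)
```

A real chain would probably need more stages (for example deskewing and a threshold that adapts to uneven illumination), but crop-then-threshold shows why a clean dark surround in the photo matters.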
What I learned
I learned a lot while getting the assembly to work.
* The mechanical tolerances of the carrier mover cannot be made tight enough. So an overshoot is built into every carrier movement.
* Automatic location of the frames on a fiche requires manual calibration: a translation between the stepper coordinate system and the logical fiche grid system must be calculated. The software lets you move the fiche carrier with the cursor keys. For calibration, 4 frames on the fiche must be positioned exactly, and then the grid position of each of those reference frames must be entered.
* The scanning room must be dark, otherwise the image contrast is too poor.
* On the images, the screen area must be surrounded by a uniform black border, which can be cropped off automatically. Therefore a bezel must be attached to the reader to make it appear wider, and the visible parts of the reader must be painted black.
* The sharpness of scanned images is limited by the grain of the film and the reader's diffusing screen. Beyond that, sharpness depends on 4 things:
- The focus of the projected fiche image on the reader screen must be checked and adjusted after each fiche.
- After the focus of the DSLR is adjusted to the screen, disable the auto-focus. The smallest ISO value must be used (ISO 100), otherwise color noise will appear. The 100mm lens has its best quality at a middle aperture of f/10.
- The fiche must be placed absolutely flat in the carrier. I used tape strips as alignment marks; the fiche projection gets blurry if even one side of the fiche lies on a strip and not between the strips.
- The DSLR settings result in a shutter speed of 3 seconds. Moving the heavy CNC positioner causes vibrations in the whole assembly, so after each carrier movement a delay of 5 seconds is used to let things come to rest.
* The controller software must also organize the filing of the resulting images. Directories must be created and meta-information must be gathered. The data on the title strip must be saved for each fiche.
* Much of the final speed and quality depends on the user interface of the controller software. Typing in the info from the fiche title strips in particular was more difficult than expected: the codes are cryptic, the room is dark, and the font size may be very small.
* The "Isel" CNC positioner makes a very loud and annoying noise. It must be isolated from the floor, else other people in the same house will complain.
* Using a DSLR as the scanning element puts quite some stress on the camera. An EOS 500D is rated for 70,000 exposures, and in fact mine died after 40,000 scans ... just within spec. A used 500D may cost 200€ and may have 50,000 exposures left, so for 1 € you get 250 exposures ... about the size of a fully occupied fiche.
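The calibration step from the list above (mapping logical fiche grid positions to stepper coordinates from 4 reference frames) can be sketched as a least-squares affine fit. This is one plausible way to do it, not necessarily what the actual control program does: the function names and all coordinates in the demo are hypothetical, and an affine model is an assumption that covers offset, scale and a slightly rotated fiche, but no other distortions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve3(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("reference frames must not lie on one line")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_grid_to_stepper(grid: Sequence[Point],
                        stepper: Sequence[Point]) -> List[List[float]]:
    """Fit x = a*col + b*row + c (and the same for y) to the reference frames.

    grid:    (col, row) of each reference frame on the fiche
    stepper: stepper position at which that frame was centered by hand
    """
    rows = [(col, row, 1.0) for col, row in grid]
    # Normal equations (R^T R) p = R^T t, solved once per stepper axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in range(2):
        atb = [sum(r[i] * s[axis] for r, s in zip(rows, stepper))
               for i in range(3)]
        coeffs.append(solve3(ata, atb))
    return coeffs


def frame_position(coeffs: Sequence[Sequence[float]],
                   col: float, row: float) -> Point:
    """Stepper coordinates for the frame at grid position (col, row)."""
    (ax, bx, cx), (ay, by, cy) = coeffs
    return ax * col + bx * row + cx, ay * col + by * row + cy


if __name__ == "__main__":
    # Hypothetical calibration: the four corner frames of a 16 x 13 grid.
    grid_refs = [(0, 0), (15, 0), (0, 12), (15, 12)]
    stepper_refs = [(1000.0, 2000.0), (7000.0, 1955.0),
                    (1060.0, 8000.0), (7060.0, 7955.0)]
    transform = fit_grid_to_stepper(grid_refs, stepper_refs)
    print(frame_position(transform, 7, 5))
```

Three reference frames would already determine an affine mapping; with four, small errors made while positioning the reference frames by hand are averaged out.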
All in all I operated the scanner for 8 weeks. I scanned 428 fiches with 53545 pages, so the typical fiche has 125 frames and is filled to 60%. The raw images add up to 227 GB. Scan speed is 15 seconds per frame. I work in a home office and could digitize about 10 fiches per day in parallel with my regular work.
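For anyone planning a similar project, the figures above can be turned into a quick back-of-the-envelope calculation. The sketch below only recombines the numbers from this page; the function name `scan_statistics` is made up, and it assumes 1 GB = 1000 MB. Of the 15 seconds per frame, 5 seconds are settle delay and 3 seconds are exposure, so about 7 seconds remain for everything else (presumably carrier movement and image transfer, but that split is my assumption).

```python
from typing import Dict


def scan_statistics(frames: int, fiches: int, seconds_per_frame: float,
                    raw_gb: float, fill_ratio: float,
                    fiches_per_day: float) -> Dict[str, float]:
    """Derive per-fiche and total figures from the raw project numbers."""
    frames_per_fiche = frames / fiches
    return {
        "frames_per_fiche": frames_per_fiche,
        # Frame capacity implied by the stated fill ratio.
        "fiche_capacity": frames_per_fiche / fill_ratio,
        "minutes_per_fiche": frames_per_fiche * seconds_per_frame / 60,
        "total_scan_hours": frames * seconds_per_frame / 3600,
        "scan_hours_per_day":
            fiches_per_day * frames_per_fiche * seconds_per_frame / 3600,
        "mb_per_frame": raw_gb * 1000 / frames,  # assumes 1 GB = 1000 MB
    }


# Seconds per frame not covered by settle delay (5 s) and exposure (3 s).
OTHER_SECONDS_PER_FRAME = 15 - 5 - 3

if __name__ == "__main__":
    stats = scan_statistics(frames=53545, fiches=428, seconds_per_frame=15,
                            raw_gb=227, fill_ratio=0.60, fiches_per_day=10)
    for name, value in stats.items():
        print(f"{name}: {value:.1f}")
    print(f"other seconds per frame: {OTHER_SECONDS_PER_FRAME}")
```

With these inputs the pure scanning time comes to roughly 223 hours, or about 31 minutes per fiche and a bit over 5 hours of machine time per day at 10 fiches per day; each raw image is a little over 4 MB.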
And I'll think twice before I scan another batch of fiches!