Friday, May 20: Bandersnatches
SCAN THIS PAGE
by Steven Steinbock
Earlier this week, Melodie shared with us the experience of her bold leap forward into the twenty-first century with a Kindle. I don’t own an e-reader. I probably will at some point. For now I like my books the old-fashioned way.
There are some huge advantages to e-books. For lovers of the short story, in particular, e-readers seem to be a marriage made in techno-heaven. As I think anyone who has looked at the Kindle (or other electronic) editions of EQMM or AHMM – or their sister science fiction magazines – will attest, the Dell magazines have translated very well to the e-book medium. Reading short stories and carrying a Kindle or Nook are both well-suited to busy commuters looking for a respite of intellectual escape.
Technology has led to another interesting development in publishing: the sudden availability of long out-of-print books, articles, and stories. E-books and Print-on-Demand have given authors a welcome opportunity to make their own works available again. They’ve also made it easier for readers to track down the old hard-to-find classics.
One of the drawbacks, however, has been the appearance of a new kind of typographic error, known by many modern archivists as scannos. A scanno happens when a scanned document is read by Optical Character Recognition (OCR) software and various words or characters are incorrectly converted.
OCR programs are pretty impressive when you think about it. They look at a digital image of a printed page and analyze the black marks on it, distinguishing lines, words, and letters, comparing the shapes to the letters in their memory, and then checking the resulting words against the words in their dictionary. The technology has come a long way since the handheld scanners of almost twenty years ago. (I used to have a Logitech Scanman which, given lots of time and a steady hand, could scan and convert a document with thirty-percent accuracy.) But even with these advances, computers sometimes read text incorrectly, with jarring results.
For instance, it wouldn’t be uncommon to find the word “scan” misread as “sean” or the word “jarring” jumbled up as “jamng.”
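For the curious, here is a minimal Python sketch of the dictionary-checking idea described above. It is not how any real OCR program works internally. The confusion table, the toy word list, and the function names are all made up for illustration, built from the kinds of misreads I see most often.

```python
# Hypothetical confusion table: (what the OCR produced, what was probably printed).
CONFUSIONS = [
    ("m", "rri"),  # "jarring" -> "jamng"
    ("m", "in"),   # "tiny" -> "tmy"
    ("h", "li"),   # "like" -> "hke"
    ("1", "i"),    # "thing" -> "th1ng"
    ("1", "l"),
    ("e", "c"),    # "scan" -> "sean"
]

# Toy word list standing in for a real dictionary (an assumption for this sketch).
WORDS = {"scan", "jarring", "like", "tiny", "thing", "the", "page", "caning"}


def candidates(token):
    """Return word-list entries reachable by undoing one confusion at one position."""
    found = set()
    for wrong, right in CONFUSIONS:
        start = token.find(wrong)
        while start != -1:
            fixed = token[:start] + right + token[start + len(wrong):]
            if fixed in WORDS:
                found.add(fixed)
            start = token.find(wrong, start + 1)
    return sorted(found)


def flag_scannos(text):
    """Map each token missing from the word list to its suggested corrections."""
    report = {}
    for token in text.lower().split():
        token = token.strip('.,;:!?"')
        if token and token not in WORDS:
            report[token] = candidates(token)
    return report


print(flag_scannos("the jamng page"))
```

A check like this flags "jamng" and suggests "jarring," but it says nothing about a misread that happens to be a real word. "Caning" is in the word list, so it passes untouched even when the page said "carrying." A human proofreader still has to catch those.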
I’ve encountered some reprints printed by hack-publishers with little or no quality control. It’s embarrassing. By contrast, I happen to know that Crippen and Landru Publishers, which gets most of its material from pulp magazines and old issues of EQMM and AHMM, makes a painstaking effort to catch and correct every scanning error in its books. (C&L may have let some scannos slip through, but if so, I haven’t caught them.)
I’ve been scanning a large number of short stories from a variety of sources, and I can testify that the editing process is tedious. Some errors, while amusing at first, become tiresome after the fortieth or fiftieth time they occur in a document conversion.
Punctuation marks change in bizarre ways. Periods acquiesce into nothingness, while quotation marks evolve into double-apostrophes. Accidental ink marks and imperfections in the paper result in all manner of dots, dashes, slashes, and hyphens. “I” and “l” (lower case “L”) get switched around, and often both are replaced with “1” (numeral one). Here is a sample of some of the scannos I’ll encounter in a typical editing session:
fireworks becomes Artworks
in becomes an
thing becomes th1ng
like becomes hke
tiny becomes tmy
carrying becomes caning
In one bizarre metamorphosis,
fireflies became leeches
Quotation marks (“), as I said, frequently transform into double-apostrophes. But sometimes the transformation takes an uglier form:
“Come was read as become
“Not became Aint
“Fire transformed into barrel
notice.” became novice’s
deserved.” became deservedly
I think one of the strangest scannos I encountered was when the word
The
was rendered as
”1”11e
Have you run into any wild scannos? If so, share them below.
A good OCR program will attempt to match words against a dictionary, but of course it doesn’t consider context, which can result in correctly spelled but incorrect words.
Steve, some time back you recommended The Old Man in the Corner and I managed to find two of the three books in Gutenberg’s files. Unfortunately, their Australian branch scanned one of the books and ‘scannos’ appeared, perhaps not as often as once per chapter, but still annoying when it interrupts the flow of the story.
Leigh’s example is a good one. Any flaw in the text that interrupts the flow for the reader is regrettable. The sad thing is that these flaws can be avoided if only someone would actually read the text before putting it out in the world.
Last year Houghton Mifflin began developing software that would help fix scannos. I wonder if they ever finished. The better (faster and cheaper) solution would have been to give a red pen to a seventh grader.
Hmmmm know I understand all the errors that occur in my dictations since they use voice recognition. :p