Fantastic handheld OCR scanner
Written: Aug 07 '05 (Updated Aug 07 '05)
Pros: Great accuracy, easy PDA connectivity, good desktop interface. Lots of storage.
Cons: Primitive cable connection. Harder to use on thick books. Main menu layout. Dim display.
The Bottom Line: I'll be showing this gadget to everybody in academia. Goodbye, retyping long paragraphs and quotations!
oldoligarch's Full Review: C-Technologies C-Pen 800C Handheld Scanner
Why I bought a C-Pen 800C
I am a graduate student and an adjunct professor. I am on the road a lot and I have been looking for a portable solution for my information-gathering needs. Quite often, I come across a valuable snippet of information, a paragraph or two that I would like to retain and cite later in a lecture or a paper. The quotation may even be tangential to my research at the moment, but I want to save it because I've found that if you don't record the citation and source on the spot, the chances of using the material are slim. Writing it out on a piece of paper is cumbersome, and photocopying isn't always convenient or available. And even if it is, I keep everything on the computer because I don't have the time or space to store and hunt for thousands of individual pieces of paper. A direct-to-digital solution therefore seemed best.
Options I dismissed
I first looked at handheld image scanners like the Docupen. I didn't like that unit because the resolution is too low (200x200 dpi at best, in hi-res mode), and even their own sample images reveal that the rollers aren't great and the scans tend to have a fun-house mirror distortion to them: stretched in some parts, compressed in others. Moreover, I research historical theology and philosophy. The books I use are sometimes older and larger. The narrow 8-inch width of the Docupen would mean multiple scans per page -- a nightmare. On top of that, in the high-res mode, the Docupen only holds about 50 images -- maybe enough for a busy day in the library, maybe not. Certainly not enough for a week's trip somewhere.
Turns out, when I did the math, my Canon digital camera takes higher resolution snapshots than the Docupen takes scans. If you have a 4-megapixel or greater camera, try this: Frame the page of the book in macro mode, take a black and white shot, and examine the resolution you get. Then do a calculation of how many pixels per inch of page you've got there. It's probably more than you think. Digital cameras have come a long way. Holding the camera steady is another matter. Without a tripod, it's hard to get a clear, wiggle-free shot while holding the book flat. And getting the lighting right is persnickety, so the digital camera wasn't a great solution either. And with either camera or Docupen, once I got the scans home, I'd have to retype the info or OCR it, if possible. So neither option was great, even though I am fairly well-versed in all these kinds of image-to-text conversions.
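For anyone who wants to run the pixels-per-inch calculation suggested above, here is a minimal sketch. The sensor size (2272 x 1704, roughly 3.9 megapixels) and the 6 x 9 inch page are assumed figures for illustration only, not measurements from my camera or books, and the function name is made up.

```python
def effective_dpi(px_long, px_short, page_long_in, page_short_in):
    """Pixels per inch of page when the page just fills the camera frame."""
    # Whichever axis is more tightly packed limits the resolution you really get.
    return min(px_long / page_long_in, px_short / page_short_in)

# Assumed figures: a 2272 x 1704 sensor and a page 9 inches tall by 6 inches wide.
dpi = effective_dpi(2272, 1704, 9.0, 6.0)
print(f"{dpi:.0f} pixels per inch")  # compare with the Docupen's 200 dpi
```

With those assumed numbers the camera comes out ahead of 200 dpi, but the real figure depends on how tightly you frame the page.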
Deciding on the C-Pen
I examined the remaining technologies, and a handheld OCR device seemed best. (OCR = optical character recognition.) I found ample reviews of the three major brands (C-Pen, Iris, Wizcom) on epinions.com, amazon.com, and other sites (ZDNet, etc.). I'm not going to recapitulate those comparisons here. You can look 'em up yourself. C-Pen had the best buzz, and I liked the feature set, so I decided to ask for an 800C for my birthday, which my darling wife provided.
I especially liked the fact that OCR happens in real time, and the data is stored as plain text, which radically reduces the storage requirements of the device compared to anything that captures images. The 8 MB of storage on the C-Pen 800C is a massive amount of storage for plain text (but more on that below) -- certainly more than I might ever scan in a day, or even a week, at the library.
Out of the box
I was a little worried by the occasional reports that C-Pen was hard to use. But these reports seemed to make up less than 20% of the reviews I had read, so I decided to go for it. On any tech site, I always mentally allot at least 10% of bad reviews to the technologically naïve or those who couldn't be bothered to read the manual. ;-)
I encountered a minor hitch immediately. You are supposed to charge the C-Pen for two hours before the first use. CPenUSA.com, the US distributor from whom I purchased my model, had shipped me the Europe-only charger -- i.e., the wall plug had two round prongs on it for the 240V AC service you find, for example, in Germany. Turns out, this is how the unit ships to everyone, since the product is made in Sweden. CPenUSA just throws in a simple adapter for US customers, which they forgot to do in my case. I called customer service. A friendly rep answered immediately, quickly figured out what I was missing, and arranged to send me the proper adapter in the next day's mail. She was gracious, the service was good, and the part and shipping were free since it was their mistake.
In the meantime, I went down to Radio Shack and picked up a simple slip-on plug adapter for $4 (cat no. 273-1406D). The AC-to-DC converter block that ships with C-Pen is rated for 100VAC - 240VAC input voltage, so if you are ever in the same situation or lose your adapter, know that you don't need a fancy rig to make it work on US power. Problem solved.
I plugged in the C-Pen, hit the reset button, and let it charge until it said "battery full" after about two hours. I could hardly wait to try it.
My First Scan
I'm a bit perplexed at the difficulties that some other reviewers had. My first scans were all easy and accurate. From the online reviews, and the manual, I knew two things: (1) hold the C-Pen at a more vertical angle than a normal pen, and (2) scan at a fairly rapid pace. I clicked C Note, grabbed a nearby book, and scanned three lines. Perfect on the first try. There is a short pause at the end of each line as it performs OCR, but if you don't stare at the screen between scans, there's only a minimal wait (1-2 secs) before you can scan again. Also, I noticed that the better you scan the text, the quicker it does OCR. So once you get the touch, it's really easy to scan a few lines of text in one quick go.
I scanned the back of the Radio Shack receipt -- tiny text on crinkly paper. It missed a few characters, but did fine there too. I grabbed a passel of photocopies of a book I had made yesterday, and scanned some text from there. Again, near perfect. I was really impressed. Within 15 minutes of picking up the C-Pen, I was scanning with it like a charm. Just guessing, I'd say I got 90-95% accuracy immediately, on my first attempts. Obviously, you get better after using it a day or two.
So of course I decided to push the envelope immediately. First, I tried a few paragraphs with foreign language terms interspersed. If the characters are regular Latin characters, it does just fine. (In my case, there were a few quotations in German in the theology text I was scanning, which OCRed with just an error or two, all while operating in English mode, not German mode.) If you have some unusual punctuation, the C-Pen doesn't know what to do with it. For example, the dieresis mark in "naïve" comes out as an apostrophe-i. The macron over an eta in transliterated ancient Greek comes out as an f or t or some such attempt to render the macron as a character with a bar through the middle. Greek text itself, obviously, chokes the C-Pen, although it will try its best to make something out of it. The C-Pen 800C is only designed to recognize the Latin character set and a few extensions depending on what language you select. The 600MX model comes with greater central European / Cyrillic language support.
Pushing further, I took out the killer: a 100-year-old book set in a diminutive 10-point font, crowded onto the page. The text is written in German, with liberally interspersed Hebrew, Greek and Latin phrases. The pages of this book are quite yellowed. It doesn't lie flat easily because it is 1000 pages long. The pages are wavy in the typical book way when laid open. This book hurts my eyes to read. I clicked the C-Pen into German language mode, and went at it. It was hard to scan across the closely-spaced lines. (I had not yet installed the line-guide sticker included with the C-Pen as an aid. This handy idea helps. Basically, it's a decal that makes a 2mm stripe in the center of the pen tip to keep your scanning centered on the text.) In the segments that were all German, C-Pen did well, picking up umlauts and eszetts with no problem. It was a bit tricky to keep the scanner going continuously as I traced across the whole page. If the auto-trigger comes off the page even for a moment, the pen assumes you have finished scanning a line, and pauses to do OCR, thus breaking off the line you're working on right there. C-Pen then beeps to tell you that the rest of your scanning after the interruption wasn't recognized. A slight incline helps to keep the trigger down, and also a firm grasp. But too much incline and you make the scans too oblique, which impedes good, fast OCR. So it is a bit hard to scan a fat book, but I haven't spent any time getting the knack of it. A few reviewers have said that one's accuracy increases once you get a feel for scanning thick books, and this sounds reasonable based on my limited experience. Since I've only had the pen for a few days, time will tell.
The combination of the lack of contrast (due to the yellowed paper), the wavy pages, the difficult older font and the occasional foreign-language word dropped the accuracy, I'd say, to more like 80%. Remember, this result is within 30 minutes out-of-box and in a worst-case scenario. I was still plenty impressed.
To be a bit more analytic, I scanned the following paragraph about Heidegger from a medium-sized new book. It took me perhaps two minutes, tops. There are 623 characters in it, not counting spaces. 11 characters are either misrecognized or dropped by C-Pen. 6 characters are absent due to bad technique: my scans were interrupted mid-line twice, dropping 3 characters each time. 5 characters are OCR flaws. C-Pen missed characters in "itself" and "Thus" and "things" and "that". It also added an "I" before the penultimate word. It scanned punctuation correctly.
With 11 errors in 623 characters, that makes (623 - 11)/623 = 98.23% accuracy. To give you an idea of what this means in practice, I'm including the text. It's rather abstruse, but I picked it randomly from what was at hand. The notation (GP 163-64/116) is a citation and was recognized perfectly although it's just a string of characters. Here is the text as scanned:
It can, of course, be objected that the Greeks do not consider the totality of being to be created. For while what Aristotle called artifacts are indeed made, the Greeks thought that nature itself, the cosmos, was eternal and inoriginate. Still, Heidegger responds, in all making we presuppose a material which, as the raw material of the making, is itse 'f not made. It is that "from which" what is made is taken. Th ius even tl igs within the horizon of production. The very meaning of "matter" springs up within an understanding of Be , Matter is thal which things are made and that which offers resistance to the production (GP 163-64/116). The horizon of production then seems to have a universal scope and appears to apply to the totality of I Being.
A Second Example, Two Days Later
For another real-life example, I scanned this paragraph about the German philosopher Gadamer from some photocopies. I proceeded slowly and carefully. The text contains 1073 characters including spaces. As you can see, C-Pen missed a single space and added a wavy bracket where there was a stray mark on the page. It read "re-learn" as "re-leam", which I didn't even notice until I spell-checked this article. Everything else was flawless. With 3 errors in 1073 characters, that is (1073 - 3)/1073 = 99.7% accuracy.
Note that C-Pen does not pick up italics, underlined titles, etc.:
As several chapters in this collection show, Gadamer has refused to participate in a forced "science or art" choice: he has insisted on interpreting all knowledge, if to varying degrees, as achieved in cooperation between observation and interpretation, science and art. In The Enigma ofHealth, he speaks a language of medicine as science and art, where these elements work together to produce medical understanding. For him, there is no problem about how to incorporate experiential knowledge - how to fuse "art" and "science" - because they work together, dialogically, even though the mix will vary from one medical situation to another. Indeed, for Gadamer the "most fundamental" question is "what contribution science makes to the art of medicine" (1996, 129, my italics). The word order is instructive: this is a language we members of science-imbued societies, patients and doctors alike, have to re-leam in order to reclaim understandings that both patient and physician bring to } clinical consultations, which now need such complex strategies to legitimate them.
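For readers who want to check the arithmetic on the two samples above, here is a small sketch. It uses only the character and error counts stated in this review; the function name is made up for illustration. Note that the first count excludes spaces and the second includes them, so the two percentages are not measured on quite the same basis.

```python
def accuracy(total_chars, errors):
    """Fraction of characters that came through the scan correctly."""
    return (total_chars - errors) / total_chars

# (total characters, wrong or missing characters) as counted in the review
samples = {
    "Heidegger paragraph": (623, 11),
    "Gadamer paragraph": (1073, 3),
}

for name, (total, errors) in samples.items():
    print(f"{name}: {accuracy(total, errors):.2%}")
```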
IR-beam to PDA: Fast, Easy, Flawless
I read a lot of messages about C-Pen working with older PDAs only with difficulty. The problem seems to be confined to Pocket PC / Windows CE OSes before 2002 and relates to IR protocols (IrOBEX). I have a Dell Axim X50v which runs Pocket PC 2003. Transferring scans to the PDA was as easy as it gets.
After I scanned my first few notes, I saved them as files in the Notes directory on the pen. I clicked Beam and pointed the C-Pen to my Axim. The Axim recognized the incoming beam and asked if I wanted to accept the file transfer. While the Axim waits, it keeps downloading the file from the C-Pen into a buffer. Upon my confirmation, it saved the file as a text note in the root directory of the on-Axim storage memory. Transfer of a 2K text note took a second or two. It was already beamed across by the time I tapped accept to confirm the transfer. The file name remained the same as what I assigned when I saved the note on the C-Pen.
I've beamed a dozen files now, and it's a piece of cake each time. The Axim did not need any prompting to look for an incoming IR beam, nor did I need any special software (e.g., Peacemaker) to navigate two different IR protocols. Pocket PC 2003 seems to be IrOBEX compatible, which is all you need for C-Pen to work.
Navigating the Interface
The jog-dial on the butt end of the C-Pen makes it easy enough to navigate the on-screen interface. My MP3 player has the same style button so I'm already very used to it. I am glad I got the 800C with a larger screen. Even at the brightest setting, the LCD output is a bit dim, so the larger graphics are helpful.
The default ordering of the icons is a bit annoying, since I don't ever plan on using the address, calendar, dictionary or messaging functions. That's what a PDA is for. But to get to Settings and Info, I have to skip over all these. Also, clicking up from the top menu item doesn't flip you over to the bottom of the menu; you always have to click all the way down through the items. A minor, cosmetic issue, but still annoying. There is third-party software to reorder the icons on the main screen at Paul Cookson's website: http://www.cookson.demon.co.uk/App_Organiser/app_organiser.html. There is also a spate of developmental software at C-Pen Plus: http://www.benlo.com/cpen/software.html. I have yet to try any of these.
C-Direct
I installed the software that comes with the C-Pen so I could scan directly into any app on my desktop Windows PC. To connect the pen, one uses the custom cable that is a power supply & serial port cable in one. This cabling technology is a bit dated. I also don't like the fact it is custom. All that means is that if the cable is lost or damaged, I'll pay up the wazoo on eBay to find one three years from now. Since USB can trickle charge a device and isn't nearly as clunky an interface, I don't know why they don't update this feature. But I've got a free serial port on my PC, so no worries here. Even if you don't have a serial port on your computer (many modern laptops don't), all you need is a serial-to-USB adapter, priced $20-$40, available anywhere.
The software install was easy enough. Not very elegant -- it took two reboots before it was finally done. I run Windows 98SE. The installation made two icons on my desktop: C-Direct and C-Pen Explorer.
I tested the C-Direct function on three apps: MS Word 97, Eudora Pro, and Scholar's Aid (an academic note-taking program, like EndNote). All three apps accepted text directly from the pen without any problem. I managed to hang MS Word 97 on the first try because I terminated the C-Direct program in the system tray without telling it to disconnect from the pen, which you are supposed to do first.
The software also allows you to access the C-Pen as a peripheral device via Explorer. This makes it easy to drag & drop the numerous small files you may have collected in your notes directory. The C-Pen appears in your Explorer window much in the same way as the My Pocket PC device icon that appears when you dock your PDA in Windows. The C-Pen interface is actually a bit better than the My Pocket PC interface, since you don't have to convert files upon transfer (like you do with Pocket Word docs, Pocket Excel spreadsheets, etc.) and you can edit the text files directly on the pen itself in a Notepad-like program provided with the software. This saves dragging the file over to the desktop, editing it, and dragging it back to C-Pen to make minor changes.
Things I Haven't Read in Other Reviews:
1. The Deal with the One Free Dictionary
In a word: The free dictionary is for the translation/thesaurus feature only, not for dictionary-assisted OCR. If that's opaque to you, read on.
Before I bought the C-Pen 800C, I read the above-mentioned review sites, and also the manual, which was available online. The most confusing thing I encountered was all the talk about the one free dictionary you get. The C-Pen comes with your choice of one free dictionary, and you can buy and install others for about $10 each. The C-Pen offers two applications which are dictionary-based. (1) If you scan a word, C-Pen will immediately look up its definition for you or act as a thesaurus. (2) You can use C-Pen as a primitive, Babel-fish-like, word-for-word translator. For example, if you have the Spanish-English dictionary installed, you can scan a text in Spanish and C-Pen will show you an instant word-for-word translation in English on its screen. Admittedly, a cool feature.
But my big question was this: Most high-end OCR programs use look-up dictionaries to confirm their guesses when characters are ambiguous. Is the one free dictionary the same dictionary used for dictionary-assisted OCR, or is it just for the thesaurus / translation feature? Because I routinely scan texts in English, French and German, I did not want to have to buy and install a bulky dictionary for each language, and to constantly swap them in and out of memory because all three cannot fit together in the 8MB of memory you get on the pen.
The good news: For OCR purposes, the C-Pen 800C comes with English, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish and Swedish OCR-assisting look-up dictionaries already in ROM. These don't take up your 8MB of RAM storage memory. You can switch the native language for dictionary-assisted OCR with the click of a button. The one free dictionary you get as a bonus is only for the translation / thesaurus feature, not for dictionary-assisted OCR. Discovering this was a huge joy for me. Since I don't have any use for the translation / thesaurus feature, I'll probably delete the Webster's English dictionary that comes pre-loaded on US models, to free up memory (see below).
Also note what I said above: When scanning mixed-language texts, C-Pen did a very good job on German words embedded in otherwise English text. The only difference between scanning German when set to English vs. scanning German when set to German is an increased accuracy in characters special to German: umlauted vowels and the eszett. The same probably holds true for grave accents and circumflexes in French.
2. A note on actual vs. usable memory
The C-Pen ships with a default dictionary that is installed but not activated (at least for US consumers), which means that the C-Pen has 4.55MB of available storage out of the box, not 8MB. I haven't gotten around to figuring out how to reclaim my extra 3.5MB. As said above, the nine dictionaries used to assist OCR are a separate matter and don't count against your 8MB.
I also will relate the bitter experience of one reviewer who found out the hard way that C-Pen needs 1MB or so for temporary storage of scanned images. Apparently, he tried to install two purchased dictionaries (3.5MB each) on the pen, figuring he'd have 1MB left for text storage. Not so. He couldn't scan because the pen needs room for temp files. So you only really have something like 6.5MB of space to use of that 8MB. Even so, 1MB of plain text is hundreds of pages -- about 240 pages by my sloppy reckoning unless they use Unicode, in which case 120 pages.
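Here is a rough sketch of that sloppy reckoning, for anyone who wants to plug in their own numbers. The 4,400 characters per page is my assumption (a fairly dense printed page), chosen only because it lands near the 240-page figure; I don't know how the pen actually encodes text, so one and two bytes per character are both shown.

```python
BYTES_PER_MB = 1024 * 1024

def pages_per_mb(bytes_per_char, chars_per_page=4400):
    """Approximate pages of plain text that fit in 1MB (assumed page density)."""
    return BYTES_PER_MB / (bytes_per_char * chars_per_page)

for label, width in (("1 byte per character", 1), ("2 bytes per character", 2)):
    print(f"{label}: about {pages_per_mb(width):.0f} pages per MB")
```

Multiply by whatever free memory your pen reports (4.55MB on mine, out of the box) to get a ballpark capacity.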
3. C-Write / On-pen Editing
Say you scan a paragraph, notice a mistaken character and want to fix it before saving your note file. Easy. One click gives you an Edit option. You can navigate to a place in the text and add or delete characters. Navigation is easy. Character entry is hard. There are two means: (1) Navigate through a scrolling bar of capital letters, small letters, numbers and symbols (slower but easier), or (2) Use the C-Write feature, which is basically an implementation of letter-by-letter handwriting recognition. C-Write works the same way as with most PDAs: you learn to trace out a basic set of characters, somewhat specialized compared to normal handwriting, and the C-Pen makes a guess at what you've written. The nice thing is that option (1), the scroll bar, also shows you how to trace the character you want so you can learn how to use C-Write as you go.
Note: It took me a while to figure out two things. First, rotating the pen horizontally makes your letters come out crooked. Keep the tip straight up and down, i.e. its long edge parallel to the left margin of the page. Second, the vowel e is hard to make distinct from the vowel o with a slash through it (a Scandinavian character). To get e, make sure your circular arc doesn't intersect the straight line in the middle. A bit tedious, but I haven't found a handwriting-recognition interface that isn't.
Conclusion
I love it. Works great right out of the box. Scans well in multiple languages. Interfaces easily with my PDA. Works directly with the Windows apps I've tried. I'm very happy with this product.
Recommended: Yes
Amount Paid (US$): 210
Epinions.com ID: oldoligarch
Reviews written: 12
Trusted by: 0 members