MetaChat





24 May 2007

Forget the Monkeys and Typewriters. There's CAPTCHA. "A few simple keystrokes may soon turn blather into books. Researchers at Carnegie Mellon University have discovered a way to enlist people across the globe to help digitize books every time they solve the simple distorted word puzzles commonly used to register at Web sites or buy things online."
Wow. That's pretty darn cool. (And I was really in the mood to go "meh.")
posted by treepour 24 May | 15:28
Err, perhaps I'm missing something. For a CAPTCHA to work, you have to know what the correct text is. If you already know what the correct text is, why bother with this project?
posted by matthewr 24 May | 15:37
From the article: If enough users decipher the CAPTCHAs in the same way, the computer will recognize that as the correct answer.
posted by box 24 May | 15:39
Ah, the "reCAPTCHA" website provides the answer (unlike the Excite article, which as it stands doesn't make much sense).

But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
posted by matthewr 24 May | 15:44
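For anyone who wants to see the quoted scheme in concrete terms, here is a rough Python sketch of it: a known "control" word gates the user, and guesses for the unknown word are tallied until enough people agree. Everything named here (`UnknownWord`, `submit`, `accepted_reading`, and the `min_votes` / `min_share` thresholds) is made up for illustration; the reCAPTCHA site does not say how many agreeing answers it needs or how it scores them.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UnknownWord:
    """A scanned word OCR could not read; collects human guesses (hypothetical type)."""
    image_id: str
    votes: Counter = field(default_factory=Counter)


def normalize(text: str) -> str:
    """Ignore case and surrounding whitespace when comparing answers."""
    return text.strip().lower()


def submit(control_answer: str, typed_control: str,
           unknown: UnknownWord, typed_unknown: str) -> bool:
    """Check the control word; record the unknown-word guess only if the user passed."""
    if normalize(typed_control) != normalize(control_answer):
        return False  # failed the known word, so the other guess is not trusted
    unknown.votes[normalize(typed_unknown)] += 1
    return True


def accepted_reading(unknown: UnknownWord, min_votes: int = 3,
                     min_share: float = 0.6) -> Optional[str]:
    """Return the agreed reading, or None while confidence is too low.

    min_votes and min_share are assumed thresholds, not published values.
    """
    total = sum(unknown.votes.values())
    if total < min_votes:
        return None
    reading, count = unknown.votes.most_common(1)[0]
    return reading if count / total >= min_share else None


if __name__ == "__main__":
    word = UnknownWord("scan-0001")
    submit("morning", "morning", word, "upon")
    submit("morning", "Morning ", word, "upon")
    submit("morning", "mourning", word, "open")  # fails the control word, ignored
    submit("morning", "morning", word, "upon")
    print(accepted_reading(word))
```

Note that the user still types two words, but only the control word decides whether they get through; the second word is the part that does the digitizing.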
"If enough users decipher the CAPTCHAs in the same way, the computer will recognize that as the correct answer."

@matthewr: This seems to be the root algorithm. They may be assigning a degree of confidence to each of a range of candidates, so that the user must match a highly rated candidate to pass through. Then, after a number of trials, the candidate with the greatest number of responses is assumed best for purposes of adding to the book.
posted by mischief 24 May | 15:48
That'll teach me not to preview.
posted by mischief 24 May | 15:49
Hmm, after a bit of contemplation, I see that since their application uses two words, the user's work is doubled. Sounds to me more like brute force than elegance.
posted by mischief 24 May | 15:54
I had an idea to do this with census records a couple of years ago. The big problem, computation-wise, is slicing up the text properly. I don't have any background in image interpretation so I didn't do anything with it, though.
posted by stilicho 24 May | 23:00
I have to say that this is an awesome idea (if it works).
posted by philomathoholic 24 May | 23:49