Saturday, August 28, 2004

Distributed Proofreading

Once again, Slashdot provides a link to a really cool idea. Going back quite a ways, there's been Project Gutenberg: excellent-quality, public domain e-books, made freely available. Great idea, and great classics. Their website puts the book count at over 12,000.

Back in the bad old days (ca. 1971), people would key these in by hand from paper. The Declaration of Independence was first. Well, that soon gave way to scanning documents, leaving the problem of finding people willing to scan in a book, and then proofread their scans.

So back in 2000, apparently, Charles Franks started up Distributed Proofreaders, with the idea that people could pull down, one page at a time, the scan image and the OCR'ed text, and compare the two, making corrections as needed. You could proofread as many pages as you had the time or interest for, and the whole process could be done by anyone with a web browser.

Slashdot, a day or two ago, reported that DP had released its 5,000th book to Project Gutenberg, and that story got me to take a look at it. Neat stuff. I'm all the way up to a total of 8 pages proofed, which ranks me 8,305th out of 13,402 volunteers.

The whole distributed, using-the-net thing is a wonderful idea, to me, and this is just one more coolness to come out of it. With me and my hobby-du-jour issues, I have no idea how long I'll do this, but it's a pretty cool thing to do.
