My thoughts on Slackware, life and everything

Day: October 4, 2011

Digitizing my paperback books

Part one: what do I want?

I hinted at this topic in a previous post. I have a big collection of (mostly) paperback Science Fiction books, plus some hardcovers. I used to read a lot more in the pre-Internet days; nowadays it is only during my holidays that I get enough time to read whole books in one go. Many of those old paperbacks are 20-30 years old and yellowed.

In this digital age it would be appropriate to have digital versions of my books and save them from crumbling to dust. I am looking forward to Sony’s new e-reader, the PRS-T1, which I want to buy once it is out.

This is a very nice device. It is also a lot cheaper than the previous generation Sony e-reader (the PRS-650) while at the same time adding wireless connectivity. This device needs content once I have it in my possession.

A lot of the “newer” books, and those written by contemporary authors, can be purchased online or downloaded from fan sites where people scan their own collections into EPUB or MOBI e-books. That is all well and good, but on my bookshelves I have many dozens of good books that will probably never see a new life as an e-book. That is very unfortunate… I had a lot of fun reading them and do not want to see them sink into oblivion.

I decided to do something about this. I am going to try and describe (and hopefully implement) how I am going to digitize my book library. Note: at the moment this is all just ideas, “dreams” if you wish, although I have already found quite a bit of information on the Internet that I will be sharing with you. I want it to be more than just a dream.

What does one need to get a paper book converted into an e-book?

  1. the book’s pages need to be scanned
  2. the scanned bitmaps may have to be cleaned up digitally (enhancing the contrast between characters and background, de-skewing or rotating the text blocks, …)
  3. I need an Optical Character Recognition (OCR) program to convert the bitmap images into character text
  4. I need an e-book editor to lay out the bare text that I got from the OCR program – the e-book has to look largely like the original paper version.
  5. I want to use a library program to make my book catalogue available, to myself of course, to my e-reader device, and possibly to other interested parties.
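
To make step 2 a bit more concrete, here is a minimal sketch in Python of what "enhancing the contrast between characters and background" can mean at its very simplest: a global threshold that turns every pixel either fully black or fully white. This is only an illustration and not the software I will end up using. The `binarize` function, the midpoint threshold rule and the tiny sample "scan" are all assumptions made up for this example, and a real scan-cleaning program does a lot more (de-skewing, for one).

```python
def binarize(page, threshold=None):
    """Turn a grayscale page (rows of 0-255 integers) into pure black and white."""
    pixels = [p for row in page for p in row]
    if threshold is None:
        # Assumption for this sketch: split halfway between the darkest
        # pixel (ink) and the lightest pixel (paper).
        threshold = (min(pixels) + max(pixels)) / 2
    # Pixels darker than the threshold become black (0), the rest white (255).
    return [[0 if p < threshold else 255 for p in row] for row in page]


# A tiny made-up "scan": yellowed paper (values around 180-200)
# with faded ink (values around 60-90).
scan = [
    [190, 185, 70, 188],
    [200, 60, 90, 195],
    [180, 182, 85, 191],
]

cleaned = binarize(scan)
for row in cleaned:
    print(row)
```

The yellowed background ends up pure white and the faded ink pure black, which is the kind of input an OCR program generally prefers.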

And I want this to be as low-cost as possible. Any software that I am going to use for this should preferably be Open Source and run on Slackware.

Those are the main topics I will write about. Each of these topics deserves its own separate article. Why is that?

I can already see how this project will confront me with interesting challenges. I am going to write a multi-post story with interlinked articles (this being the first) in order to preserve this hobby project of mine for posterity. Having separate topic articles allows me to split up your feedback as well (heh… I hope I do get some feedback!), so that discussions about, say, scanning techniques will not interfere with talk about what is the best OCR program for Linux.

The articles are not going to be “static” per se. I value your feedback and important new insights will find their way back into the main text.

Let’s see where this ends. It is probably going to take days, or weeks, to write. It depends a bit on Slackware development – if that picks up speed again, I will have less time for this e-book side show. But for the moment, there is silence in the ChangeLog.txt and I have time to spare.

Eric

KDE security fix, Flashplayer 11, random bla

The KDE team has issued a security advisory (CVE-2011-3365) for the KSSL component in KDE 4.6 and 4.7. I have applied the proposed patch to fix the security hole, and updated packages for kdelibs are available from my ktown repository, for both KDE 4.7.1 and KDE 4.6.5 (I intend to keep the latter release around for a while because it works very well with Slackware 13.37).

Direct links to the packages follow, but you can check out any of the available mirrors of course.

KDE 4.6.5:

KDE 4.7.1:

The new KDE 4.7.2, which is right around the corner, will have this fix incorporated.

 

Then there is the Adobe Flash Player.

Finally we have a Linux Flash Player for both 32-bit and 64-bit that is on the same terms as the MS Windows version. Yesterday, Adobe announced the official release of Flash Player 11 for all platforms. Some of you will cheer, others will moan, but nevertheless this is a milestone in 64-bit Linux support. It was (in part) because of the availability of 64-bit Flash for Linux that I started the 64-bit Slackware port in 2008.

I have packages for you here:

Or on any of my package mirrors, like:

 

As for Slackware-current: some people are speculating about huge updates in the near future because there has been such a long silence on the update front. Please do not be too disappointed if the number of updates is not as big as you might hope. Sometimes, there is real life to take care of.

 

And then there was this:

I am pondering another blog post, but the idea has not yet taken its final shape in my mind. What it boils down to is: how should I digitize my rather big library of vintage Science Fiction books? I have many tens, maybe a hundred, books that certainly will never be released in digital format, and I am looking at the tools to make the conversion. Slackware packages for all the (OCR and scan cleaning) software that I think I will need have been compiled, but I hesitate to release them, mainly because I have not yet tested them myself… Ideas are welcome, especially ideas about how to go about the scanning process (I do not want to cut up my books). More to follow!

Cheers, Eric

© 2024 Alien Pastures
