Hi,
I am new to the list, and quite new to Runeberg altogether. I first discovered the site two years ago, when my interest in old Norse history grew and I had bought my own copy of Salmonsens Konversationsleksikon (for a mere 1000 NOK!). Finding that someone was actually taking the time to scan this masterpiece was quite amazing. Runeberg keeps growing, and more and more literature is being added.
I have now come across some material that I feel should be shared with others, and Runeberg is a perfect place to make literature available to everyone. I have a normal A3 scanner so I might not be able to scan the pages perfectly straight, and that brings me to deskew. What deskew tool should one use, and I hope there are some free alternatives out there. Or maybe there is an option like this in Photoshop or another widely used application?
If everything goes as planned, I hope to be able to upload something in a couple of weeks; getting hold of these books was quite a struggle. I am also fortunate that my father's workplace has a license for ABBYY FineReader, so I should be able to supply OCR text for the pages as well.
Anyhow, I appreciate all your hard work.
Greetings,
Simen Bjelke
Simen Bjelke wrote:
> straight, and that brings me to deskew. What deskew tool should one use, and I hope there are some free alternatives out there.
First let me say that deskewing is not absolutely necessary, even though it makes the result look nicer. In many cases we don't deskew, but publish the slightly skewed image. We also publish the OCR text, and the OCR program (I personally use ABBYY FineReader) internally deskews the image for better recognition, but we don't use its output image, only the output text.
Having said that, we all come from different backgrounds. I'm a C/Unix programmer and so are other members of our core team. If you use Microsoft Windows, I don't know much about which programs are available. One very useful shareware/freeware program is IrfanView, which can do most image manipulation operations on single images or entire batches. It can do fine rotation, but it doesn't automatically detect how much a page is skewed, so you have to adjust the rotation manually for every image.
Since you ask for a "free" alternative, I suspect you might be interested in programming and algorithms. The problem of deskewing falls into two parts: skew detection and fine rotation. For fine rotation there are many programs, such as pnmrotate or ImageMagick's convert. For skew detection I have found a subroutine which is part of the ClaraOCR free software OCR engine. While ClaraOCR isn't very useful, this subroutine can be isolated and used for skew detection. Tell me if this sounds interesting.
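To make the skew detection half a bit more concrete, here is a minimal sketch in Python of one common approach, the projection profile method. It is not the ClaraOCR subroutine, just an illustration: it tries a range of candidate angles, counter-rotates the black pixels by each one, and keeps the angle at which the pixels pile up most sharply into horizontal rows. The function names, the 0/1 list-of-lists image format, and the default search range of plus or minus 5 degrees in 0.1 degree steps are all assumptions made for this example.

```python
import math

def black_pixels(image):
    """Return (x, y) coordinates of all non-zero pixels in a 2D list of 0/1."""
    return [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]

def profile_sharpness(pixels, angle_deg):
    """Score how strongly the pixels concentrate into horizontal rows
    after counter-rotating them by angle_deg (higher is sharper)."""
    a = math.radians(angle_deg)
    s, c = math.sin(a), math.cos(a)
    rows = {}
    for x, y in pixels:
        # Only the rotated y coordinate matters for a horizontal profile.
        r = int(round(y * c - x * s))
        rows[r] = rows.get(r, 0) + 1
    # Sum of squared row counts: large when a few rows hold most pixels.
    return sum(n * n for n in rows.values())

def estimate_skew(image, max_angle=5.0, step=0.1):
    """Estimate the skew angle in degrees by brute-force search.

    A positive result means text lines slope downward to the right
    in image coordinates (y growing downward).
    """
    pixels = black_pixels(image)
    if not pixels:
        return 0.0  # blank page: nothing to measure
    steps = int(round(2 * max_angle / step))
    candidates = [-max_angle + i * step for i in range(steps + 1)]
    return max(candidates, key=lambda ang: profile_sharpness(pixels, ang))
```

The estimated angle can then be handed to a rotation tool such as pnmrotate or ImageMagick's convert for the fine rotation step. Check each tool's sign convention before batch processing, since some rotate clockwise and others counter-clockwise for a positive angle. A real implementation would also downsample the page first, because this brute-force search visits every black pixel once per candidate angle.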