Unofficial empeg BBS

#338037 - 06/10/2010 23:38 How to edit a multi-page TIF file.
tanstaafl.
carpal tunnel

Registered: 08/07/1999
Posts: 5539
Loc: Ajijic, Mexico
I have scanned a 170-page document into my computer. I saved it as a PDF file, then used Adobe Acrobat to save the PDF file as a Microsoft Word document (not too successfully: it inserted a blank page between each pair of "real" pages, but that could be fixed easily enough, I guess) and also as a multi-page TIF file.

It is the TIF file I am most interested in. The original document was about a 4th generation copy with pages mis-aligned (I straightened them with the scanner software) and lots of noise and lack of contrast. Some pages were written on. If I could edit each page as a graphic, I could improve the look enormously.

I have only a few graphic editing tools at my disposal: MS Paint, Paint.Net, and Photoshop Elements. Paint.Net is my tool of choice. However, none of these tools will allow me to see and edit any but the first of the 170 pages.

Am I overlooking something obvious here, or do I need different tools?

tanstaafl.

edit: I tried saving the PDF file as a .PNG file, and that saved the document as 170 separate files. That means I can edit each page individually, but is there any way to concatenate them all back together as a single document when the edits are done? I don't relish trying to print 170 separate files on 85 double-sided pages.
_________________________
"There Ain't No Such Thing As A Free Lunch"

#338038 - 07/10/2010 02:59 Re: How to edit a multi-page TIF file. [Re: tanstaafl.]
gbeer
carpal tunnel

Registered: 17/12/2000
Posts: 2665
Loc: Manteca, California
In the case of a PDF file containing pages that are images, the current version of Acrobat Pro will automatically straighten all the pages if you run OCR on them.

Pro will also allow the compiling of many files into one doc.

For something less costly than Acrobat Pro, get copies of Ghostscript and ImageMagick.

GS has a lot of tricks built into it. But you have to be willing to dig into the docs and learn the command line options.

IM has stuff in it to manipulate images and can convert .png to .pdf. I don't remember whether it can convert multiple .png files into one PDF, but I'd bet it can. Again, it's all command line stuff.

edit: I'm pretty sure IM can split the TIFF file into its component images.


Edited by gbeer (07/10/2010 03:01)
_________________________
Glenn

#338044 - 07/10/2010 11:51 Re: How to edit a multi-page TIF file. [Re: gbeer]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
There should be free programs that can distill (Adobe lingo) multiple images into a single PDF file. I don't think that will be a problem at all, so your best bet is probably to deal with the individual PNG files for now, rather than the multi-page TIFF file. You likely won't find many programs other than fax software that can properly deal with multi-page TIFF, and even those will likely only be able to read it, not edit it.

On a Mac, you could likely use the built-in Automator with a couple of simple actions to put all those images into a PDF. ;)

Here's one program:

http://www.pdfill.com/pdf_tools_free.html

And here's another program:

http://www.Derring.com/swiftpdf/


Edited by hybrid8 (07/10/2010 11:56)
_________________________
Bruno
Twisted Melon : Fine Mac OS Software

#338049 - 07/10/2010 12:52 Re: How to edit a multi-page TIF file. [Re: tanstaafl.]
mlord
carpal tunnel

Registered: 29/08/2000
Posts: 14478
Loc: Canada
Originally Posted By: tanstaafl.
I tried saving the PDF file as a .PNG file, and that saved the document as 170 separate files. That means I can edit each page individually, but is there any way to concatenate them all back together as a single document when the edits are done?


A point-and-click way to do it is to open a new LibreOffice document and just insert each PNG into it, one per page. Then save the result in the format of your choice (e.g. .DOC or .PDF), and print it.

With 170 pages, that could take an hour or two, but it's a one-time thing, right?

Otherwise, this could be scripted and completed within 10-15 minutes using commonly available tools on Linux.
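
A minimal sketch of what such a script might look like, assuming ImageMagick's `convert` is installed and the edited pages are named something like `page1.png` through `page170.png`. The file names, the `workbook.pdf` output name, and both helper functions are hypothetical. The pages are sorted numerically first, because a plain alphabetical sort would put `page10.png` ahead of `page2.png`. The sketch only builds the command; it does not run it.

```python
import re


def natural_key(name):
    """Sort key that compares digit runs as numbers, so page2 sorts before page10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def build_pdf_command(png_names, output="workbook.pdf"):
    """Return an ImageMagick argument list: every page in order, then the PDF name."""
    pages = sorted(png_names, key=natural_key)
    return ["convert", *pages, output]


if __name__ == "__main__":
    # Hypothetical names; in practice they could come from Path(".").glob("*.png").
    names = ["page10.png", "page2.png", "page1.png"]
    print(build_pdf_command(names))
```

The resulting list can be handed to `subprocess.run(...)` once the page order looks right.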

Cheers

#338050 - 07/10/2010 12:54 Re: How to edit a multi-page TIF file. [Re: hybrid8]
tanstaafl.
carpal tunnel

Registered: 08/07/1999
Posts: 5539
Loc: Ajijic, Mexico
Originally Posted By: hybrid8
Perfect. Thank you, Bruno.

What are the advantages and disadvantages of PNG vs JPG? One article I read said that a PNG file has the potential to be of better quality than a JPG, but at the cost of larger file size. Yet when I saved the attached screenshot in both JPG and PNG formats, I could not see any difference between the two, even zoomed in until everything was pixelated, and the PNG file is 1/3 the size of the JPG.

tanstaafl.


Attachments
PNG to PDF.jpg

Description: This is the JPG version of the file.

PNG to PDF.png

Description: This is the PNG version of the file.


_________________________
"There Ain't No Such Thing As A Free Lunch"

#338051 - 07/10/2010 13:02 Re: How to edit a multi-page TIF file. [Re: tanstaafl.]
mlord
carpal tunnel

Registered: 29/08/2000
Posts: 14478
Loc: Canada
PNG, like TIFF as it is usually saved, is totally lossless. It's like a zip'd copy of the original image file.

JPG is lossy compression. It deletes information from the original image and compresses the result, giving a much, much smaller file. But it can never recreate all of the original image data.

[EDIT:] When creating a JPG file, one can specify the degree of data loss, aka "quality". Using 100% keeps the loss to a minimum (though ordinary JPG still isn't strictly lossless even then), but that's pretty rare. To get small files, one generally specifies 75-80% quality.
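
To make the lossless/lossy distinction concrete, here is a small sketch using Python's `zlib` module, which implements the same DEFLATE compression that PNG uses. The "lossy" half is only a toy stand-in (it rounds each sample down to a multiple of 16) and is not how JPG actually works; the function names and sample data are made up for illustration.

```python
import zlib


def lossless_roundtrip(pixels: bytes) -> bytes:
    """Compress and decompress with DEFLATE; every byte comes back unchanged."""
    return zlib.decompress(zlib.compress(pixels, 9))


def toy_lossy_roundtrip(pixels: bytes, step: int = 16) -> bytes:
    """Toy stand-in for lossy coding: round each sample down to a multiple of step.

    This is NOT the JPG algorithm; it only shows that discarded detail is gone for good.
    """
    return bytes((value // step) * step for value in pixels)


if __name__ == "__main__":
    pixels = bytes(range(256)) * 4  # hypothetical 8-bit greyscale samples
    print(lossless_roundtrip(pixels) == pixels)
    print(toy_lossy_roundtrip(pixels) == pixels)
```

The first comparison holds for any input; the second fails as soon as any sample is not already a multiple of the step.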

Cheers


Edited by mlord (07/10/2010 13:06)

#338052 - 07/10/2010 13:05 Re: How to edit a multi-page TIF file. [Re: tanstaafl.]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
JPG is primarily used for continuous tone images such as photographs, which will better hide any artifacts from the lossy compression. If you use JPG on flat areas of colour with detailed text, you risk seeing some artifacts over your text.

PNG is most often had in two formats: indexed colour, which supports up to a 256-colour palette, and full 32-bit, which has 24 bits of colour (16.7 million colours, like JPG) and an 8-bit alpha channel (256 levels of transparency). Both formats are non-lossy, meaning the compression they use does not alter the pixel information when they're displayed.

An indexed PNG is great for when you have relatively few colours in your image, including large areas of the same colour, such as interface images like the one you posted. A truecolour PNG (32-bit) is great as a master copy of artwork like photos or paintings, or any time you need smooth transparent gradients.
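
The numbers above are just powers of two, and the difference in raw data per pixel is easy to work out. A quick sketch, with an 800x600 screenshot size picked purely as an example (the sizes are for uncompressed pixel data, before PNG's lossless compression is applied):

```python
def colour_count(bits: int) -> int:
    """Number of distinct values that fit in the given number of bits."""
    return 2 ** bits


def raw_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of the uncompressed pixel data for an image."""
    return width * height * bits_per_pixel // 8


if __name__ == "__main__":
    print(colour_count(8))          # palette entries in an indexed PNG
    print(colour_count(24))         # colours in the 24-bit part of a truecolour PNG
    print(raw_bytes(800, 600, 8))   # indexed: one byte per pixel
    print(raw_bytes(800, 600, 32))  # truecolour with alpha: four bytes per pixel
```

So before compression even starts, the indexed version of the same image carries a quarter of the pixel data.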

Keep your source files and you can generate a PDF multiple times over and compare to see which one you prefer.
_________________________
Bruno
Twisted Melon : Fine Mac OS Software

#338058 - 07/10/2010 16:46 Re: How to edit a multi-page TIF file. [Re: tanstaafl.]
tfabris
carpal tunnel

Registered: 20/12/1999
Posts: 31565
Loc: Seattle, WA
Yeah, you can't use a Windows Screen Shot as an estimator for image quality or disk space savings in file format comparisons.

A screenshot is a special case: lots of large flat areas of the exact same pixel color. For those, you want a file format that will handle those sorts of things well: TIF, RLE, GIF, PNG. Those can datacompress the "unchanging" areas without any data loss, essentially like "zipping" the raw image file. In fact, saving a screen shot as a raw BMP file and then zipping it probably produces the greatest file size savings.

As soon as you start talking about a photograph, that's when things get fuzzy, because a photograph almost never has two pixels next to each other that are exactly the same color. So you can't just datacompress the image; most data compression algorithms see that as purely random, uncompressible data. That's why we have lossy algorithms like JPG. It changes the image each time it compresses it, but preserves the overall general visual appearance of the image. JPG will do poorly with a screen shot of a dialog box: you'll get swimmy, blobby (faint) JPG artifacts around window edges and fonts, and a larger file in the end.
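
A rough way to see this effect without any image tools is to compress two byte strings with `zlib` (the DEFLATE compression used by both ZIP and PNG): one that is a single repeated value, like a flat screenshot background, and one that is pseudo-random noise, a crude stand-in for photographic detail. The helper names and the 100,000-byte size are arbitrary choices for the sketch.

```python
import random
import zlib


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size; smaller means better compression."""
    return len(zlib.compress(data, 9)) / len(data)


def make_flat(size: int, value: int = 200) -> bytes:
    """One repeated grey level, like a large flat area in a screenshot."""
    return bytes([value]) * size


def make_noise(size: int, seed: int = 0) -> bytes:
    """Pseudo-random samples, a crude stand-in for noisy photographic detail."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(size))


if __name__ == "__main__":
    print(f"flat:  {compressed_ratio(make_flat(100_000)):.4f}")
    print(f"noisy: {compressed_ratio(make_noise(100_000)):.4f}")
```

The flat data shrinks to a tiny fraction of its size, while the noise does not shrink at all. Real photographs fall somewhere in between, since neighbouring pixels are similar even when they are not identical.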
_________________________
Tony Fabris

#338092 - 08/10/2010 19:28 Re: How to edit a multi-page TIF file. [Re: tanstaafl.]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
Just split the multi-page TIFF into multiple single-image TIFF files. Download ImageMagick and run "convert multipage.tif single%d.tif". To convert it back, you can probably run "convert single*.tif multipage.tif", but I haven't tried that.
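
One thing to watch with 170 pages: names produced by `%d` do not sort alphabetically in page order (`single10.tif` comes before `single2.tif`), so a wildcard may hand the pages back shuffled. A hedged sketch of one workaround is to zero-pad the page number with `%03d` and list the pages explicitly when rejoining. It assumes ImageMagick numbers the pages from 0; the function names and the `edited.tif` output name are hypothetical, and the code only builds the argument lists rather than running anything.

```python
def split_command(source="multipage.tif", pattern="single%03d.tif"):
    """Argument list asking ImageMagick to write one zero-padded file per page."""
    return ["convert", source, pattern]


def join_command(page_count, pattern="single%03d.tif", output="edited.tif"):
    """Argument list naming every page explicitly, in page order, then the output."""
    pages = [pattern % index for index in range(page_count)]
    return ["convert", *pages, output]


if __name__ == "__main__":
    print(split_command())
    print(join_command(3))
    # Unpadded names, for comparison: alphabetical order is not page order.
    unpadded = ["single%d.tif" % index for index in range(12)]
    print(sorted(unpadded) == unpadded)
```

Writing the result to a new file such as `edited.tif` also keeps the original multi-page TIFF untouched in case something goes wrong.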

Don't convert them to PDF or Word in order to edit them. That's, at best, adding additional crap to the images, and probably resizing the images and losing data.

One of the image viewers in Windows knows how to deal with multi-page TIFFs (I believe it's whichever one opens .tif files by default), and you should be able to use that to print the whole thing once you're done.


Edited by wfaulk (09/10/2010 03:53)
Edit Reason: clarify pdf/doc discommendation
_________________________
Bitt Faulk

#338097 - 08/10/2010 21:05 Re: How to edit a multi-page TIF file. [Re: wfaulk]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
A PDF is a container and can include many different image formats. It's a much better idea to use a PDF than a multi-page TIFF which has very little software support, and certainly no support from any e-reader software or hardware.
_________________________
Bruno
Twisted Melon : Fine Mac OS Software

#338098 - 08/10/2010 22:45 Re: How to edit a multi-page TIF file. [Re: hybrid8]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
PDF is not a container format (unless you just mean that it can contain arbitrary contextless binary data, which it can, not that any PDF viewer will know what to do with it), and it can contain only a few different image formats (sort of; it's really just image compression algorithms). Notably, its only TIFF support is CCITT fax, which only really specifies 1-bit color depth. It also has a few JPEG encodings. Beyond that, it's straight raster images, potentially compressed with RLE, flate, or LZW. (To be fair, a TIFF is likely to also just be LZW-compressed.)

Specifically, most PDF creators give you very little (obvious) control over how an image is going to be imported. It might decide to use one of the lossy JPEG algorithms. Even if it imports it losslessly, it might decide to place it as an image inside a page, which would screw up formatting.

So, yes, a PDF could be used to keep an identical copy of the original TIFF data, but making that happen is problematic. If the intention is to edit the images, you don't want to use a program that might lossily compress the data, then edit it in another program that might lossily compress it again.

As a final rendering, yeah, it's fine, as long as the results look okay to you. Using it as an intermediary image, though, is fraught with pitfalls.
_________________________
Bitt Faulk

#338099 - 09/10/2010 00:28 Re: How to edit a multi-page TIF file. [Re: wfaulk]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
No, it's not a container like a Quicktime or MKV. But it can contain image data that can easily be parsed out by programs that don't necessarily render other parts of a PDF file. Both bitmap/raster and vector based.

Definitely not ideal for an intermediary - at least for raster based images. It works fine as an intermediary to store vector images and text however, though not the ideal solution for text IMO.

I generally keep source files around for everything, but I will often publish (collections of) both raster and vector-based images to PDF, as well as, of course, text-based data, as a final (submission/presentation/distribution) format.

But once Doug is done his edits, storing high-quality images combined in a PDF is an absolutely fine solution for moving all those images around in a single file. It's a much more portable solution to holding the TIFF or JPG data.

Though if the content is truly all text based, the ultimate solution is to run OCR against it to produce a clean text document which can be stored much more efficiently. A nice little program on the Mac, PDFPen, has OCR built in, so when you load up an image-based PDF, it can run OCR on it automatically if you so choose. There are obviously enough solutions available for Windows to make this practical there too, even if run first against the TIFF images.


Edited by hybrid8 (09/10/2010 00:28)
_________________________
Bruno
Twisted Melon : Fine Mac OS Software

#338107 - 09/10/2010 21:16 Re: How to edit a multi-page TIF file. [Re: wfaulk]
tanstaafl.
carpal tunnel

Registered: 08/07/1999
Posts: 5539
Loc: Ajijic, Mexico
Originally Posted By: wfaulk
Don't convert them to PDF or Word in order to edit them.
Bitt, you misunderstand just a little bit. The original poor quality pages were scanned and saved as a PDF file (the default for my scanner). Some time after doing that (with the original source material no longer available), I decided I should try cleaning up the pages, which required that they be in some graphic format. So I experimented, saving the PDF file first as a multi-page TIFF file, then (just to see if I could) as a DOC file, and finally just a few days ago as a PNG file. When I saved it as a PNG, it saved it as 170 separate files, each one editable. When I have them all looking as best I can (which will still be pretty awful) I'll put them back together as a PDF file so I can print them.

OCR won't work in this case, the original is too damaged. Some of the pages were skewed so badly that part of the text was missing and had to be reconstructed, and much of the remaining text is distorted and fragmented with parts of letters very light or missing altogether. There are lots of places where there are dark smudges. It is readable by a human because missing data is provided by the context of the sentence, but OCR would fail. In addition, there are lots of drawings, diagrams, text boxes, etc.

What I am trying to do here is salvage and re-create a used teaching workbook that is now out of print for my Spanish language teacher. Fun times!

The two attached files will give you an idea of what I am up against.

tanstaafl.


Attachments
Before.png
Description: Before...

After.png
Description: After...


_________________________
"There Ain't No Such Thing As A Free Lunch"

#338110 - 09/10/2010 21:57 Re: How to edit a multi-page TIF file. [Re: tanstaafl.]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
Oh. I misparsed your first sentence. I thought you saved the scan as a PDF and as a TIFF, but you converted the PDF into a TIFF and a DOC.

Gotcha.
_________________________
Bitt Faulk
