Extract images from PDF

Terkelsen
Member
Posts: 126
Joined: Thu Sep 08, 2011 5:08 pm

Post by Terkelsen »

Acrobat has a function to export all images from a PDF as JPEG, JPEG 2000, PNG or TIFF. Is this possible using a script with the Acrobat Configurator in Switch?
jan_suhr
Member
Posts: 57
Joined: Fri Nov 04, 2011 1:12 pm
Location: Nyköping, Sweden

Re: Extract images from PDF

Post by jan_suhr »

There is a CLI utility called pdfimages that can extract the images inside a PDF.

It is a simple tool that can be run with the Execute command tool.

Depending on how the images are stored inside the PDF, you can get it to save them as .jpg. Normally, however, it writes the images in a raw format, .ppm, which you can convert to any image format you like by running ImageMagick as a second step.

The program can be downloaded here, it is a package of some PDF utilities and pdfimages is part of the package:
http://www.foolabs.com/xpdf/download.html

Here is a list of options and commands for pdfimages:
http://linuxcommand.org/man_pages/pdfimages1.html
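
As a rough illustration of the two steps described above (pdfimages first, then ImageMagick), here is a minimal Python sketch. It assumes that both `pdfimages` and ImageMagick's `convert` are installed and on the PATH; the function names, the output folder and the PNG target are only examples and not part of either tool. The `-j` option is the pdfimages switch that writes JPEG-encoded images as .jpg instead of .ppm.

```python
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, image_root, keep_jpeg=True):
    """Build the argument list: pdfimages [-j] PDF-file image-root."""
    cmd = ["pdfimages"]
    if keep_jpeg:
        # -j saves DCT (JPEG) encoded images as .jpg instead of .ppm
        cmd.append("-j")
    cmd += [str(pdf_path), str(image_root)]
    return cmd


def convert_command(source, target_ext="png"):
    """Build an ImageMagick call that converts one raw file to target_ext."""
    source = Path(source)
    # Assumption: the classic "convert" binary is available.
    # ImageMagick 7 installs usually name it "magick" instead.
    return ["convert", str(source), str(source.with_suffix("." + target_ext))]


def extract_images(pdf_path, out_dir, target_ext="png"):
    """Hypothetical helper: extract all images, then convert the raw ones."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # pdfimages appends a counter and an extension to this root name
    image_root = out_dir / Path(pdf_path).stem
    subprocess.run(pdfimages_command(pdf_path, image_root), check=True)
    # Convert the raw .ppm/.pbm files and delete them afterwards;
    # files already written as .jpg are left untouched
    for raw in sorted(out_dir.glob("*.p[bp]m")):
        subprocess.run(convert_command(raw, target_ext), check=True)
        raw.unlink()
    return sorted(p for p in out_dir.iterdir() if p.is_file())
```

Calling `extract_images("input.pdf", "images")` should leave the extracted files in the `images` folder. In Switch, the same two command lines could presumably be run with the Execute command tool instead of Python. On Windows, note that `convert` may clash with a system program of the same name, so the full path to ImageMagick may be needed.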


Good luck
Jan Suhr
Color Consult AB
Sweden
Terkelsen
Member
Posts: 126
Joined: Thu Sep 08, 2011 5:08 pm

Re: Extract images from PDF

Post by Terkelsen »

Thanks a lot, Jan. I'll definitely have a closer look at this solution.

Meanwhile I found out that if you let the Acrobat configurator save as XML, it actually creates a folder containing all the images :o . In the Acrobat preferences you can even choose between JPEG, PNG or TIFF, set the resolution, and turn saving of the image folder on or off.

If the PDF has been flattened, the usual image export from Acrobat creates a puzzle of small images. I don't know how pdfimages handles this, but when saving as XML the images are not split into fragments. The downside is that text overlapping the images is included in the images. I solved this by using a PitStop action to remove text elements before saving as XML.