Image Manipulation with ImageMagick
I've spent a lot of time in my column talking about text processing and analysis, with the basic assumption that if you're using the command line, you're focused on text. That's not always true, and if you work with images at all—whether JPEG, PNG, GIF or another format—there's a free-to-download suite of image-related utilities available that offers rather amazing capabilities direct from the command line and, therefore, also from within shell scripts.
I'm talking about ImageMagick, a set of programs that has grown and expanded through the years and now includes powerful Perl and Ruby interfaces too. But, pshaw! We don't need no stinkin' Perl or Ruby. We'll stick with our hard-core shell commands, thank you very much.
You'll find a downloadable binary and source both at http://www.imagemagick.org, and as always, I recommend you download source and compile it on your system if you can. It's far more reliable than hoping someone else's compiled version is optimized for your own hardware configuration.
A variety of different commands are included with the ImageMagick distribution that I divide into "analysis" and "editing" tools. For this article, let's stick with the analysis tools. Let me start by showing you how much more information it offers on a typical image file than the standard Linux command line.
Analyzing Images for Non-Optimized Resolutions
If you've been using Linux for even a short time, you've probably learned about the file command. It can be helpful with some file types:
$ file wp-content.tar.gz
wp-content.tar.gz: gzip compressed data, from Unix
But, the command is generally useless with images:
$ file pvp.jpg
pvp.jpg: JPEG image data, EXIF standard
Um, what about image size? How about any useful info at all? Jeez.
Enter the ImageMagick identify command:
$ identify pvp.jpg
pvp.jpg JPEG 970x311 DirectClass 114kb 0.010u 0:01
Ahh...so this particular image has the dimensions (the suite refers to dimensions as the "geometry" of the image) of 970x311. That's useful.
Do you want even more information though? The -verbose option spits out a somewhat overwhelming amount of data:
$ identify -verbose pvp.jpg
Image: pvp.jpg
Format: JPEG (Joint Photographic Experts Group JFIF format)
Geometry: 970x311
Class: DirectClass
Colorspace: RGB
Type: TrueColor
Depth: 8 bits
Endianess: Undefined
Channel depth:
Red: 8-bits
Green: 8-bits
Blue: 8-bits
Channel statistics:
Red:
Min: 0
Max: 255
Mean: 180.72
Standard deviation: 74.2122
Green:
Min: 0
Max: 255
Mean: 168.593
Standard deviation: 76.0343
Blue:
Min: 0
Max: 255
Mean: 169.459
Standard deviation: 77.244
Colors: 21864
Rendering-intent: Undefined
Resolution: 72x72
Units: Undefined
Filesize: 114kb
Interlace: None
Background Color: white
Border Color: #DFDFDF
Matte Color: grey74
Dispose: Undefined
Iterations: 0
Compression: JPEG
Orientation: Undefined
JPEG-Quality: 94
JPEG-Colorspace: 2
JPEG-Sampling-factors: 1x1,1x1,1x1
signature: bc8a6a698ca35fd8feab71452423386ff98b1fb7b5ec ...
Profile-xmp: 811 bytes
Profile-exif: 22 bytes
unknown
Profile-app12: 15 bytes
Tainted: False
User Time: 0.020u
Elapsed Time: 0:01
Truth be told, dimensions and resolution are the most useful pieces of information from this crazy-long output.
With a tiny bit of effort, you can extract just those items of information:
$ identify -verbose pvp.jpg | grep -E '(Resolution:|Geometry:)'
Geometry: 970x311
Resolution: 72x72
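If a script needs both values on one line, the same filtering can be done with awk instead of grep. What follows is only a sketch: geom_and_res is a hypothetical helper name, and it assumes the verbose output labels the fields "Geometry:" and "Resolution:" as shown above. Some ImageMagick versions append an offset such as +0+0 to the geometry, so the sketch strips that if present.

```shell
# Reads `identify -verbose` output on stdin and prints
# "GEOMETRY RESOLUTION" on a single line, for example "970x311 72x72".
geom_and_res() {
    awk '
        # awk ignores the leading indentation when splitting fields.
        $1 == "Geometry:"   { geom = $2; sub(/[+].*/, "", geom) }
        $1 == "Resolution:" { res = $2 }
        END { print geom, res }
    '
}

# Example, assuming identify is on the PATH:
#   identify -verbose pvp.jpg | geom_and_res
```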
Now imagine you are working on a Web site and want to ensure that no images on the site are greater than 72dpi, a standard screen resolution. Higher print-ready resolutions are rather pointless, because a 300dpi image will render the same on a screen as its lower-resolution brethren—it'll just load slower.
Here's one way you can identify images in a directory with incorrect resolutions:
#!/bin/sh
identify=/usr/bin/identify
# Check images to ensure that they're all 72x72 resolution.
# With no "in" list, the loop walks the command-line arguments.
for filename
do
  # Keep any Resolution: line that isn't 72x72.
  resolution=$("$identify" -verbose "$filename" | \
    grep -i "Resolution:" | grep -v 72x72)
  if [ -n "$resolution" ] ; then
    echo "Warning: Image $filename has $resolution"
  fi
done
exit 0
When I run this on a directory of images on my own system, a set of JPEG format files on my http://www.AskDaveTaylor.com site, here's what I get:
$ checkres.sh *.jpg
Warning: Image auction-seller-img1.jpg has Resolution: 75x75
Warning: Image auction-seller-img2.jpg has Resolution: 75x75
Warning: Image browsing-the-photo-folder.jpg has Resolution: 96x96
Warning: Image brushed-metal.jpg has Resolution: 300x300
...
That's a surprise! I didn't realize that I had 300x300 and these other weird resolutions. An easy way to speed up my site, therefore, is to lower the resolution on these images to the standard 72dpi. This is something that also can be done with a call to a different ImageMagick utility, but let's tackle that in another article.
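The checkres.sh script flags every resolution other than 72x72, including lower ones. If the goal is only to catch images above 72dpi, a numeric comparison is one option. Here's a minimal sketch, assuming the resolution arrives as an XxY string like the ones in the warnings above; res_too_high is a hypothetical helper name, and the second argument is an optional limit that defaults to 72.

```shell
# Succeeds (exit status 0) when either axis of a resolution string
# such as "300x300" is above the limit (default 72).
res_too_high() {
    res=$1
    limit=${2:-72}
    xres=${res%%x*}   # text before the "x"
    yres=${res##*x}   # text after the "x"
    # Drop any fractional part so the shell can compare integers.
    xres=${xres%%.*}
    yres=${yres%%.*}
    [ "$xres" -gt "$limit" ] || [ "$yres" -gt "$limit" ]
}

# Example:
#   if res_too_high 300x300 ; then echo "too high" ; fi
```

With this test, 75x75, 96x96 and 300x300 would still be reported, but an image saved at something lower than 72x72 would pass quietly.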
Working with Image Size
Since I write a lot of scripts that harvest images or other content from sites and repurpose them for my own (generally private, not public-facing) use, I also find it is darn helpful in a shell script to be able to ascertain the size of an image I've just grabbed.
If you've guessed that identify is the key, you're right. In fact, given an image, this is an easy way to grab its height and width:
height=$(identify "$image" | cut -d' ' -f3 | cut -dx -f2)
width=$(identify "$image" | cut -d' ' -f3 | cut -dx -f1)
There's no need for verbose output, because the geometry of the image is included in the default output.
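Each of those assignments runs identify separately. As a possible alternative, a single geometry string can be split with the shell's own parameter expansion. The sketch below assumes the geometry has the form WIDTHxHEIGHT, optionally followed by a +X+Y offset; split_geometry is a hypothetical helper name, not part of ImageMagick, and it sets the shell variables width and height.

```shell
# Splits a geometry such as "970x311" (or "970x311+0+0") into the
# variables width and height using only POSIX parameter expansion.
split_geometry() {
    geometry=${1%%+*}        # drop any "+X+Y" offset
    width=${geometry%%x*}    # text before the "x"
    height=${geometry##*x}   # text after the "x"
}

# Example, assuming identify is on the PATH and the filename
# contains no spaces:
#   split_geometry "$(identify "$image" | cut -d' ' -f3)"
```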
Now it's easy to produce higher-quality HTML, for example, by including images with their proper dimensions:
echo "<img src=$image height=$height width=$width>"
What's better is that Web browsers are able to scale images automatically, so if you specify a height and width that are different from the default dimensions (oops, sorry, "geometry") of the image, it'll scale automatically.
This means if I want to include the pvp.jpg image on an automatically generated page, but decide 970 pixels is just too wide, I can simply include it as:
<img src=pvp.jpg height=207 width=646>
and the browser—be it Chrome, Safari or even MS IE—will scale it appropriately.
Calculating the smaller size is straightforward with bc, another underappreciated Linux command. The entire sequence might look like this to scale the image to 66% of its original dimensions:
#!/bin/sh
identify=/usr/bin/identify
scale=0.666
image=$1 # add input validation code
# identify prints the geometry as WIDTHxHEIGHT in its third field.
width=$("$identify" "$image" | cut -d' ' -f3 | cut -dx -f1)
height=$("$identify" "$image" | cut -d' ' -f3 | cut -dx -f2)
# bc returns a decimal value; keep only the integer part.
newwidth="$(echo "$width * $scale" | bc | cut -d. -f1)"
newheight="$(echo "$height * $scale" | bc | cut -d. -f1)"
echo "<img src=$image height=$newheight width=$newwidth>"
exit 0
In practical use:
$ scaledown.sh pvp.jpg
<img src=pvp.jpg height=207 width=646>
That's easy enough!
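If bc isn't available, the shell's built-in integer arithmetic can handle whole-number percentages. This is only a sketch under that assumption: scale_dim and scaled_img_tag are hypothetical helper names, and the results are truncated rather than rounded. The tag also quotes its attribute values, which is a bit safer when a filename contains unusual characters.

```shell
# Scale one pixel dimension by a whole-number percentage.
# Shell arithmetic is integer-only, so the result is truncated.
scale_dim() {
    echo $(( $1 * $2 / 100 ))
}

# Build an <img> tag with quoted attributes.
# Arguments: file, original width, original height, percentage.
scaled_img_tag() {
    printf '<img src="%s" height="%s" width="%s">\n' \
        "$1" "$(scale_dim "$3" "$4")" "$(scale_dim "$2" "$4")"
}

# Example:
#   scaled_img_tag pvp.jpg 970 311 66
```

At 66 percent, the 970x311 image comes out as 640x205, slightly smaller than the bc version, because 0.666 is a little more than 66%.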
With some creativity, you can see how even just the identify command that's included with ImageMagick opens up a world of image file scripting possibilities, whether you're working with Web sites directly or simply seek to analyze directories of images for unusual values or settings.
I'll dig into some of the really slick editing and modification capabilities, including an easy way to add a so-called watermark to your graphics, along with ways you can automate fixing 300dpi resolution images or even scaling images, in an upcoming article.
As a final note, although I explain how you can take a large image and have it show up smaller on a Web page by using different values for height and width, it would be remiss of me not to mention that if you're going to use only the smaller size, it's smarter to resize the original image. It makes your page faster to load, less unneeded data is transferred and everything just generally is happier (including the search engines). Now you know.