The Big Box of Art on Linux

I recently visited a garage sale and purchased an old clip art/stock image collection from "Hemera" called The Big Box of Art. While the (Windows) software itself provides an index and a browser, I was only interested in using the images and none of the software. Unfortunately, the image files were in a passel of zip files and included two formats I was not sure how to handle: Windows Metafile (*.wmf) and "HPI" files.

HPI files turned out to be JPEGs with 32 bytes of unnecessary garbage prepended. They were easy to process. Most programs require an input and output file argument, so usually I write a script which does the processing. Here's the script in this case, which runs dd:

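#!/bin/sh
# Strip the 32-byte HPI header, leaving the JPEG data.
# bs=1 skip=32 skips 32 one-byte blocks; simple, but slow on large files.

if [ -z "${1}" ]; then
  # No arguments: filter standard input to standard output.
  dd bs=1 skip=32
elif [ -z "${2}" ]; then
  # One argument: derive the output name by replacing .hpi with .jpg.
  dd if="${1}" of="$(echo "${1}" | sed 's/\.hpi$/.jpg/')" bs=1 skip=32
else
  # Two arguments: explicit input and output files.
  dd if="${1}" of="${2}" bs=1 skip=32
fi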

You could put this into $HOME/bin and call it something logical, e.g. hpi2jpg. Note how the script derives the output filename from the input filename when no second argument is given. If no input filename is given either, dd runs without if= or of= arguments, reading from standard input and writing to standard output.
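
Before converting a whole collection, it may be worth confirming that a file really fits the pattern described above, i.e. that JPEG data starts right after the first 32 bytes. The following is only a sketch: looks_like_hpi is a name I made up for illustration, and the check assumes nothing beyond the fact that every JPEG stream opens with the two bytes ff d8.

```shell
#!/bin/sh
# looks_like_hpi FILE   (hypothetical helper name, not part of any package)
# Succeeds if the two bytes at offset 32 are ff d8, the marker that
# opens every JPEG stream; fails otherwise (including unreadable files).
looks_like_hpi() {
  # od: no address column, hex bytes, skip 32 bytes, read 2 bytes.
  magic=$(od -A n -t x1 -j 32 -N 2 "$1" | tr -d ' \n')
  [ "$magic" = "ffd8" ]
}
```

It could be used as a guard, e.g. looks_like_hpi picture.hpi && hpi2jpg picture.hpi, so that files which do not match are left alone.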

Now all you need to do is run the script. For a few files, in one directory:

for i in *.hpi
do
  hpi2jpg "$i"
done

Or for a whole mess of subdirectories:

find directory -type f -name '*.hpi' -exec hpi2jpg '{}' \;

The for loop is slightly lighter because it doesn't have to run the find command, but it does not reach files in subdirectories unless you list them on the "for" line. Using find with the -exec flag runs the command given after that flag once for each file that matches your criteria; in this case, files whose names end in .hpi anywhere under the directory named directory. The token {} (enclosed in single quotes to prevent the shell from interpreting it) is replaced with the name of the matching file, and the semicolon (escaped with a backslash so the shell passes it to find instead of treating it as a command separator) ends the arguments to -exec.
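
For very large trees, find can also hand many filenames to a single command by ending -exec with + instead of \;. The sketch below is one way to use that, not the method used for the original conversion: convert_tree is a hypothetical name, and tail -c +33 (print everything from byte 33 onward) stands in for the hpi2jpg script so that the example is self-contained.

```shell
#!/bin/sh
# convert_tree DIR   (hypothetical helper name)
# Converts every .hpi file under DIR, starting one shell per batch of
# files rather than one process per file. Filenames containing spaces
# are safe because each name arrives as a separate argument.
convert_tree() {
  find "$1" -type f -name '*.hpi' -exec sh -c '
    for f in "$@"; do
      # Skip the 32-byte header; write the result next to the original.
      tail -c +33 "$f" > "${f%.hpi}.jpg"
    done
  ' sh {} +
}
```

Calling convert_tree directory would then write a .jpg beside each .hpi file it finds.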

The CD also included WMF (Windows Metafile) files, which are simple vector graphics that can be handed directly to the Windows printing subsystem. We have little use for them on Linux, so it is perhaps preferable to convert them to another format. I chose SVG as it is an open standard and most web browsers support it today. I wrote a similar script to handle this case:

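#!/bin/bash
BASENAME="$(basename "${0}")"
INFILE="${1}"
# Derive the output name by replacing a trailing .wmf with .svg.
OUTFILE="$(echo "${INFILE}" | sed 's/\.wmf$/.svg/')"
# Skip files that were already converted, so an interrupted run can be restarted.
if [ -f "${OUTFILE}" ]; then
  echo "${BASENAME}: skip ${INFILE}"
  exit 1
fi

echo "${BASENAME}: convert ${INFILE} to ${OUTFILE}"
wmf2svg -o "${OUTFILE}" "${INFILE}"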

Run this script the same way you'd run the last one (but if you use find, use -name '*.wmf' to select the files). I called it mywmf2svg since it calls the real wmf2svg (e.g. from package libwmf-bin) to do the heavy lifting. The script checks whether the output file already exists and skips it rather than overwriting it, because I ran out of disk space in the middle of a run and had to restart it; this was the easiest way to handle that, if not the fastest.
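
One caveat with a plain existence check is that a file left empty or half-written when the disk filled up would also be skipped on the next run. A possible refinement, sketched here under my own assumptions, is to convert into a temporary name and rename it only when the converter reports success. The function name convert_if_missing and the .part suffix are made up for illustration; the only thing assumed about the converter is that it accepts -o OUTFILE INFILE, as wmf2svg is called above.

```shell
#!/bin/sh
# convert_if_missing SRC DST CMD [ARGS...]   (hypothetical helper name)
# Runs "CMD ARGS... -o DST.part SRC" and renames DST.part to DST only
# if the command succeeds. A non-empty DST is treated as already done.
convert_if_missing() {
  src=$1
  dst=$2
  shift 2
  if [ -s "$dst" ]; then
    echo "skip $src"
    return 0
  fi
  tmp="$dst.part"
  if "$@" -o "$tmp" "$src"; then
    mv "$tmp" "$dst"
  else
    # Remove the partial output so a later run tries again.
    rm -f "$tmp"
    return 1
  fi
}
```

A call might look like convert_if_missing picture.wmf picture.svg wmf2svg.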
