Parent document is top of "JPEG image compression FAQ, part 1/2"
Previous document is "[14] Why all the argument about file formats?"
Next document is "[16] How does JPEG work?"

[15] How do I recognize which file format I have, and what do I do about it?

If you have an alleged JPEG file that your software won't read, it's likely
to be HSI format or some other proprietary JPEG-based format.  You can tell
what you have by inspecting the first few bytes of the file:

1.  A JFIF-standard file will start with the four bytes (hex) FF D8 FF E0,
    followed by two variable bytes (often hex 00 10), followed by 'JFIF'.

2.  If you see FF D8 at the start, but not the 'JFIF' identifier, you may have a
    "raw JPEG" file.  This is probably decodable as-is by JFIF software ---
    it's worth a try, anyway.

3.  HSI files start with 'hsi1'.  You're out of luck unless you have HSI
    software.  Portions of the file may look like plain JPEG data, but they
    usually won't decompress properly with non-HSI programs.

4.  A Macintosh PICT file, if JPEG-compressed, will have several hundred
    bytes of header (often 726 bytes, but not always) followed by JPEG data.
    Look for the 3-byte sequence (hex) FF D8 FF.  The text 'Photo - JPEG'
    will usually appear shortly before this sequence, and 'AppleMark' or
    'JFIF' will usually appear shortly after it.  Strip off everything
    before the FF D8 FF and you will usually be able to decode the file.
    (This will fail if the PICT image is divided into multiple "bands";
    fortunately banded PICTs aren't very common.  A banded PICT contains
    multiple JPEG datastreams whose heights add up to the total image
    height.  These need to be stitched back together into one image.
    Bailey Brown has some simple tools for this purpose on a Web page at
    http://www.stnetcom.com/~bailey/photo-jpeg/photo-jpeg.html.)

5.  If the file came from a Macintosh, it could also be a standard JFIF
    file with a MacBinary header attached.  In this case, the JFIF header
    will appear 128 bytes into the file.  Get rid of the first 128 bytes
    and you're set.

6.  Anything else: it's a proprietary format, or not JPEG at all.  If you
    are lucky, the file may consist of a header and a raw JPEG data stream.
    If you can identify the start of the JPEG data stream (look for FF D8),
    try stripping off everything before that.
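
The checks in the list above can be sketched in a few lines of Python.
This is only an illustration, not a complete identifier: the function
names and the returned labels are made up for this example, and searching
for FF D8 FF (rather than just FF D8) in the fallback case is an assumption
made to cut down on false matches.

```python
def classify(data: bytes) -> str:
    """Guess the file type from its leading bytes (labels are hypothetical)."""
    # Case 1: FF D8 FF E0, two length bytes, then 'JFIF'.
    if data[:4] == b"\xff\xd8\xff\xe0" and data[6:10] == b"JFIF":
        return "jfif"
    # Case 2: starts with FF D8 but has no 'JFIF' string.
    if data[:2] == b"\xff\xd8":
        return "raw-jpeg"
    # Case 3: HSI files start with 'hsi1'.
    if data[:4] == b"hsi1":
        return "hsi"
    # Case 5: JFIF header sitting behind a 128-byte MacBinary header.
    if data[128:132] == b"\xff\xd8\xff\xe0" and data[134:138] == b"JFIF":
        return "macbinary-jfif"
    # Cases 4 and 6: some other header followed by JPEG data.
    if b"\xff\xd8\xff" in data:
        return "embedded-jpeg"
    return "unknown"


def strip_to_jpeg(data: bytes) -> bytes:
    """Drop everything before the first FF D8 FF sequence."""
    start = data.find(b"\xff\xd8\xff")
    if start < 0:
        raise ValueError("no FF D8 FF sequence found")
    return data[start:]
```

Reading the file in binary mode and passing its contents to classify()
tells you which case applies; for the MacBinary and "embedded" cases,
writing the result of strip_to_jpeg() to a new file is worth a try.  It
will not help with banded PICTs or with HSI files.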

HSI files used to be rather common in alt.binaries.pictures.* postings,
although thankfully they have gotten less so.  You can spot an HSI posting
by looking at the first few characters of the uuencoded data.  The
characteristic HSI pattern is
	"begin" line
	M:'-I ...
whereas standard JFIF files begin with
	"begin" line
	M_]C_X ...
If you learn to spot the HSI pattern, you can save yourself the trouble
of downloading unusable files.
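
The two patterns are simply the uuencoded form of the files' first bytes,
so they can be reproduced from the magic numbers given earlier.  The sketch
below uses binascii.b2a_uu from the Python standard library; the helper
name uu_prefix is made up for this example, and a full 45-byte first line
is assumed (which is what produces the leading 'M').

```python
import binascii


def uu_prefix(magic: bytes, line_len: int = 45) -> str:
    """Return the start of a uuencoded line whose data begins with `magic`."""
    # Pad with zero bytes to a full line; the padding does not affect
    # the characters we keep.
    padded = magic + bytes(line_len - len(magic))
    line = binascii.b2a_uu(padded).decode("ascii")
    # One length character, then one character per complete 6-bit group
    # that is fully determined by the magic bytes.
    n_chars = 1 + (len(magic) * 8) // 6
    return line[:n_chars]


print(uu_prefix(b"\xff\xd8\xff\xe0"))  # JFIF: FF D8 FF E0
print(uu_prefix(b"hsi1"))              # HSI:  'hsi1'
```

The first result should match the JFIF pattern shown above, and the second
should begin with the HSI pattern.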

At least one release of HiJaak Pro writes JFIF files that claim to be
revision 2.01.  There is no such spec; the latest JFIF revision is 1.02.
It looks like HiJaak got the high and low bytes backwards.  Unfortunately,
most JFIF readers will give up on encountering these files, because the JFIF
spec defines a major version number change to mean an incompatible format
change.  If there ever *were* a version 2.01, it would be so numbered
because current software could not read it and should not try.  (One wonders
if HiJaak has ever heard of cross-testing with other people's software.)
If you run into one of these misnumbered files, you can fix it with a
binary-file editor, by changing the twelfth byte of the file (the major
version number) from 2 to 1.
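
If no binary-file editor is at hand, the same one-byte patch can be made
with a short script.  This is a minimal sketch: the function name is
invented for the example, and it refuses to touch anything that does not
start with a JFIF header.

```python
def fix_jfif_version(data: bytes) -> bytes:
    """Return a copy of `data` with a JFIF major version of 2 reset to 1."""
    # FF D8 FF E0, two length bytes, then 'JFIF' and a zero byte.
    if data[:4] != b"\xff\xd8\xff\xe0" or data[6:11] != b"JFIF\x00":
        raise ValueError("not a JFIF file")
    if data[11] != 2:
        return data  # major version is not 2; leave the file alone
    patched = bytearray(data)
    patched[11] = 1  # twelfth byte of the file (index 11)
    return bytes(patched)
```

Read the file in binary mode, pass its contents through the function, and
write the result to a new file so the original stays available.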

If the file header seems valid, but your decoder chokes with a complaint
like "Unsupported marker type 0xC2", then you have a progressive JPEG file
and a non-progressive-capable decoder.  See part 2 of this FAQ for
information about more up-to-date programs.
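
You can also check for the 0xC2 marker yourself before blaming the
decoder.  The sketch below walks the marker segments at the start of a
JPEG datastream and reports the first frame (SOFn) marker it meets.  The
function name is made up for this example, and the code assumes a
well-formed header in which every marker before the frame marker carries a
two-byte length.

```python
import struct

# SOFn marker codes: C0..CF, except C4 (DHT), C8 (JPG) and CC (DAC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def first_sof_marker(data: bytes):
    """Return the code of the first SOFn marker, or None if none is found."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing FF D8 at start of data")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:   # fill byte before a marker; skip it
            pos += 1
            continue
        if marker in SOF_MARKERS:
            return marker
        if marker == 0xDA:   # start of scan: compressed data follows
            return None
        # The length is big-endian and counts its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
    return None
```

A result of 0xC2 means the file is progressive; 0xC0 is ordinary baseline
JPEG.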
