Said and Pearlman[31,30,29,28] have several new lossless and lossy image compression algorithms and programs. The uncompression programs are separate in order to facilitate validating the results. Sp_compress is a lossless progressive transmission method using arithmetic coding. Progressive transmission means that lo-resolution data is transmitted first in the file, then it is refined. This is useful since an uncompresser could be written for previewing a file via a lo-speed communication medium before deciding whether to get the whole file. Sp_compress compressed adir512 to 96,413 bytes, or 2.94 bpp, in 5 seconds.
Progcode is another lossless compression program, which produces a slightly larger file. Its decoder, progdecd, altho not written to uncompress a truncated file, calculates what the mean-squared error (MSE) would be if a lower bit rate were used. The MSE shows that progcode is quite conservative with the compressed file size, presumably to prevent possible roundoff errors similar to what can occur with arithmetic coding. For example, progcode compressed adir512 to 98,993 bytes, or 3 bpp, in 6 seconds. However when this compressed file is decoded at a rate of only 1.2 bpp, progcode reported that the result was identical to the original file. This means that progcode could presumably be modified losslessly to compress this file to only 32,322 bytes, which is a very high compression ratio. In our compression experiments, after running progcode on a file, we often ran progdecd at various bit rates to find the smallest rate at which it reported no differences in the uncompressed file. We name this number the progcode extrapolated rate to indicate that we expect that this rate might be possible with a modified progcode, but that we don't yet have a program that does it.
The obvious question is whether the low progcode extrapolations might be due to some error. However the Said and Pearlman program codetree, described later in the lossy compression section, suggests that the progcode extrapolations are correct. When using codetree, we give it a bit rate, and it writes a file of the corresponding size. Then the separate uncompression program decdtree uncompresses that file, compares it to the original, and reports the mean-squared error.
In this example, codetree at 1.2 bpp compressed the file to 39,322 bytes. Decdtree at that bit rate reported a mean-squared error of 1.45, where the maximum elevation was 1,568. We also converted the uncompressed file to PGM format, differenced it with the original, and calculated the histogram of the absolute differences between the original file and the 1.2 bit per point compression, as shown in Table t:codediffs.
Table 1: Histogram of Errors Caused by Compressing Adir512
With Codetree at 1.2 Bpp
Said and Pearlman's sp_compress and progcode are by far the best. Ha-2 on the binary file is second, far behind, with gzip well behind it. Splitting the file into hi and lo bytes and compressing them with GIF and gzip respectively was better than only gzip. Two conclusions are evident.
Progcode has one restriction: the size of the data array must be a power
of two. This poses a problem with DEMs, which have 20% more than a
power of two rows. The easy solution, for the programmer, is to let the user repartition
his data to suit the program, but a better solution might be to
pad the DEM to the next power of two,
, which
increases the total size by a factor of
.
How efficiently does progcode compress a file with large blocks of
zeros? We made two tests of compressing a file, then padding it with zeros
to double its size, and compressing that. The first file was
adir512; the second adir1024. The latter was
statistically a little different in that it was a subsquare of the DEM, and
not a subsample of the whole, as was the former. As Table t:pad
shows, the compressed padded files were each 16% larger than the
corresponding compressed unpadded file. There are some internal parameters
in progcode, which are now set for medical images. If they were
fine-tuned for the statistical properties of elevation data with large
blocks of zeros, then this 16% padding penalty might be reduced.
Alternatively, instead of padding the file up, it might be partitioned,
quadtree-style, into smaller blocks each of an acceptable size. At some
minimum size, such as
perhaps, we might stop splitting and start
zero-padding. Then, since there is little overhead in compressing small
blocks, each block could be feasibly separately compressed.
Table 2: Compression Efficiency on Zero-Padded DEMs
Since the actual size of the compressed file actually needed by progcode for error-free compression seems to be much smaller than the whole file, we considered that for the first test case. However, the results were comparable. The 512 file needed 1.2 bpp for error-free reconstruction, while the padded 1024 file need 0.35 bits per padded point, or 1.4 bits per original point, or about 16% more.
Sp_compress requires only that the number of rows and columns each be a multiple of four so the padding problem does not arise here. We reported above how well it handles padded files since this is one measure of its optimality.
Table 3: Comparison of Various Compression Methods on Adir512