Raster datasets can be very useful in a GIS, and their usefulness generally increases as resolution increases. However, higher resolution quickly leads to very large file sizes, which in turn slow down processing during analysis and data display. Because raster datasets store a separate value for every pixel, the only way to address this problem is to make the storage and retrieval of the data more efficient. To improve storage efficiency, raster data must undergo some form of data compression.
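To make the growth in file size concrete, here is a minimal sketch of the arithmetic for an uncompressed raster. The function names and the example dimensions (a 10 km by 10 km study area, one band, one byte per value) are assumptions chosen for illustration, and real formats add header and metadata overhead on top of this.

```python
def raw_raster_bytes(rows, cols, bands=1, bytes_per_value=1):
    """Uncompressed size: one stored value per pixel per band."""
    return rows * cols * bands * bytes_per_value


def grid_dimension(extent_m, cell_size_m):
    """Number of whole cells needed to span an extent at a given cell size."""
    return int(extent_m // cell_size_m)


if __name__ == "__main__":
    extent_m = 10_000  # hypothetical 10 km x 10 km study area
    for cell_size_m in (30, 10, 1):
        n = grid_dimension(extent_m, cell_size_m)
        size_mb = raw_raster_bytes(n, n) / 1_000_000
        print(f"{cell_size_m:>2} m cells: {n} x {n} pixels, about {size_mb:.1f} MB")
```

Halving the cell size quadruples the number of pixels, which is why file size grows so quickly as resolution increases.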
There are a number of different compression techniques for raster data, but they can be grouped into two major categories: lossy and lossless. Lossy compression generally results in much smaller file sizes, but at the cost of permanently lost data and a loss of detail that cannot be reversed. Lossless compression, on the other hand, achieves less size reduction, but it can be reversed so that all the original data in the layer is usable for analysis and display.
Lossless compression can be used with methods such as LZ77 or JPEG 2000 and maintains all the values of cells in the raster dataset. It compresses raster data at a low compression ratio, typically 2:1 or 3:1. Because the data is less heavily compressed, it generally decompresses more quickly. This is why lossless compression is much better suited to analyzing datasets or deriving other data products. Use lossless compression when the raster datasets will be used for deriving new data or for visual analysis, when the compression required is only between 1:1 and 3:1, when you do not plan to retain the original data, or when your inputs are already lossy compressed.
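The defining property of lossless compression, that every cell value comes back unchanged, can be checked with a short round trip. The sketch below uses Python's standard `zlib` module, whose DEFLATE method combines LZ77 with Huffman coding. The synthetic raster and the helper names are assumptions made up for illustration. Because the sample data is far more regular than real imagery, the ratio it prints will be much higher than the 2:1 or 3:1 typical of real rasters.

```python
import zlib


def make_sample_raster(rows=200, cols=200):
    """Build a synthetic single-band raster as bytes (one byte per pixel).

    Each horizontal band of 20 rows gets its own class value, imitating a
    categorical layer such as land cover. The data is made up for illustration.
    """
    return bytes((r // 20) % 256 for r in range(rows) for _ in range(cols))


def lossless_round_trip(raw):
    """Compress with DEFLATE (LZ77 plus Huffman coding) and decompress again."""
    compressed = zlib.compress(raw)
    restored = zlib.decompress(compressed)
    return compressed, restored


if __name__ == "__main__":
    raw = make_sample_raster()
    compressed, restored = lossless_round_trip(raw)
    print("every cell value recovered:", restored == raw)
    print(f"compression ratio: {len(raw) / len(compressed):.1f}:1")
```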
Lossy compression is used with formats such as JPEG and JPEG 2000 and is most effective in geographic information systems (GIS) projects in which the raster dataset is simply a background image. It is not suitable for raster analysis the way lossless formats are, because lossy compression does not retain the exact values of each pixel. Its main advantage is that it can compress raster datasets at a much higher ratio, such as 20:1. However, since the data is more highly compressed, decompression may take longer than with other methods. Use lossy compression when the rasters are only background images and no analysis will be performed on the raster data, when faster data loading and retrieval is needed, or when less storage space is available.
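A toy example can show why lossy output is unsuitable for analysis. The sketch below is not JPEG or JPEG 2000. It uses simple quantization (rounding each value down to a multiple of a hypothetical `step`) as a stand-in for the information-discarding stage that lossy methods rely on. The pixel values and function names are made up for illustration.

```python
def quantize(values, step):
    """Round each value down to a multiple of step, discarding the remainder."""
    return [(v // step) * step for v in values]


def max_abs_error(original, approximated):
    """Largest per-pixel difference between two equally sized value lists."""
    return max(abs(a - b) for a, b in zip(original, approximated))


if __name__ == "__main__":
    elevations = [101, 103, 104, 110, 118, 119, 121, 127]  # made-up pixel values
    lossy = quantize(elevations, step=8)
    print("original:", elevations)
    print("lossy:   ", lossy)
    print("distinct values:", len(set(elevations)), "->", len(set(lossy)))
    print("max error:", max_abs_error(elevations, lossy))
```

Fewer distinct values and longer runs of identical neighbors are easier to compress, but the discarded remainders cannot be reconstructed from the quantized values, so any analysis would run on altered data.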
Wavelet compression is an example of lossy compression, and is used by such data formats as MrSID and JPEG 2000. This compression technique recursively examines patterns in the data at different scales and removes values that don't change the overall patterns. Wavelet compression results in very high compression ratios (meaning very small file sizes in relation to the original files), but it permanently simplifies the dataset in order to do so.
Block encoding, quadtree data structures, and run-length encoding are all examples of lossless compression, which can be reversed. Run-length encoding finds adjacent cells in a row that share the same value and replaces them with two values, one indicating how many cells share the value and the other storing the value itself. Because no data values are averaged or removed, such files retain all the detail of the original while decreasing the amount of storage space required.
Pyramid files are a related technique that allows fast data display at different resolutions by storing reduced-resolution copies of the raster.
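The run-length encoding described above can be sketched in a few lines. This is a minimal illustration for a single row of cell values, not the on-disk layout of any particular raster format. The function names and the sample row are assumptions.

```python
from itertools import groupby


def rle_encode_row(row):
    """Encode one raster row as (run_length, value) pairs."""
    return [(len(list(group)), value) for value, group in groupby(row)]


def rle_decode_row(pairs):
    """Expand (run_length, value) pairs back into the original row."""
    row = []
    for run_length, value in pairs:
        row.extend([value] * run_length)
    return row


if __name__ == "__main__":
    row = [5, 5, 5, 5, 2, 2, 9, 9, 9, 1]  # made-up cell values
    encoded = rle_encode_row(row)
    print(encoded)  # [(4, 5), (2, 2), (3, 9), (1, 1)]
    assert rle_decode_row(encoded) == row  # lossless: the row is fully recovered
```

Ten cell values are stored here as four pairs. The technique pays off only when rows contain long runs. A row in which no two neighboring cells match would take twice the space after encoding, since every cell becomes a pair.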
- Chang, Introduction to Geographic Information Systems (2008), 4th Edition
- Longley, Paul A. et al., Geographic Information Systems and Science (2005), 2nd Edition
|Authors||Sean Young, Ben Hillam|
|Tags||file size, lossy, lossless|