The Portable Depth Map (PDM) is a simple image format specifically designed for depth images such as those captured by the Intel RealSense cameras or LIDAR sensors. Its design is inspired by the Netpbm family of image formats.
This is the structure of a PDM image:
    PDM32
    # Optional comment.
    <width> <height>
    <width*height 32-bit floats in row-major order containing distances in meters>
Some sample reader and writer implementations are provided here.
A PDM file consists of a sequence of one or more PDM images. There are no delimiters, data or padding of any kind before, between or after PDM images.
PDM files have the .pdm extension.
Each PDM image consists of the following:
The PDM32 magic number followed by a newline character (0x0A).
Optional comments. Comments start with # and extend to the next newline character. They may be ignored by the image reader.
The width formatted as ASCII characters in decimal, a space character (0x20), the height formatted as ASCII characters in decimal and a newline character. The width and height of the image must be in the range
width * height IEEE 754 single precision floating point numbers in row-major order. Each float must be stored in little-endian byte order.
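The structure described above can be sketched as a small reader and writer. This is a minimal illustration rather than the reference implementation: the function names (write_pdm, read_pdm, read_all) are hypothetical, comments are assumed to sit between the magic number and the dimensions, and no check is made on the permitted width and height range.

```python
import io
import struct

MAGIC = b"PDM32"

def write_pdm(stream, width, height, depths, comment=None):
    """Append one PDM image to a binary stream."""
    if len(depths) != width * height:
        raise ValueError("expected width * height depth values")
    stream.write(MAGIC + b"\n")
    if comment is not None:
        for line in comment.splitlines():
            stream.write(b"# " + line.encode("ascii") + b"\n")
    stream.write(f"{width} {height}\n".encode("ascii"))
    # "<" forces little-endian float32 regardless of the host byte order.
    stream.write(struct.pack(f"<{width * height}f", *depths))

def read_pdm(stream):
    """Read one PDM image; return None at a clean end of file."""
    magic = stream.readline()
    if magic == b"":
        return None
    if magic != MAGIC + b"\n":
        raise ValueError("missing PDM32 magic number")
    comments = []
    line = stream.readline()
    while line.startswith(b"#"):
        comments.append(line[1:].strip().decode("ascii", errors="replace"))
        line = stream.readline()
    width, height = (int(token) for token in line.split())
    payload = stream.read(width * height * 4)
    if len(payload) != width * height * 4:
        raise ValueError("truncated pixel data")
    depths = struct.unpack(f"<{width * height}f", payload)
    return width, height, comments, depths

def read_all(stream):
    """Read every image of a PDM file; images follow each other with no padding."""
    images = []
    while (image := read_pdm(stream)) is not None:
        images.append(image)
    return images

# Round trip of two tiny images through an in-memory file.
buf = io.BytesIO()
write_pdm(buf, 2, 1, [1.5, 2.0], comment="hypothetical camera note")
write_pdm(buf, 1, 1, [0.25])
buf.seek(0)
images = read_all(buf)
print(len(images), images[0])
```

Because the pixel block has a fixed length given by the header, the reader can find the start of the next image without any delimiter.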
The floating point values of zero, not-a-number (NaN) and negative infinity must all be considered as invalid or missing data. Positive infinity may be used to indicate a measurement that is too far away. This can be useful in cases where there's no actual measurement but it's known that there are no obstacles along a particular ray, e.g. in synthetic datasets or rays extending towards the sky.
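The rules for special values can be sketched as a small classifier. The function name and the returned labels are hypothetical. Negative finite values, which the text above does not address, are passed through as valid here; that is an assumption of this sketch.

```python
import math

def classify_depth(value):
    """Return 'invalid', 'too_far' or 'valid' for one depth value in meters."""
    # Zero (including negative zero), NaN and negative infinity mean missing data.
    if value == 0.0 or math.isnan(value) or value == -math.inf:
        return "invalid"
    # Positive infinity marks a ray with no obstacle within range.
    if value == math.inf:
        return "too_far"
    return "valid"

for sample in (0.0, float("nan"), -math.inf, math.inf, 1.25):
    print(sample, classify_depth(sample))
```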
Compression may be optionally provided by some external program such as xz. The resulting file should have the appropriate extension appended to its name, e.g. foo.pdm would become foo.pdm.xz for xz compression.
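As a rough illustration of the naming convention, Python's standard lzma module can stand in for the external xz program. The helper names compress_xz and open_pdm are hypothetical, and only the .xz suffix is handled in this sketch.

```python
import lzma
import os
import tempfile

def compress_xz(path):
    """Write an xz-compressed copy of `path` next to it and return the new name."""
    out_path = path + ".xz"
    with open(path, "rb") as src, lzma.open(out_path, "wb") as dst:
        while chunk := src.read(1 << 20):
            dst.write(chunk)
    return out_path

def open_pdm(path):
    """Open a PDM file for binary reading, decompressing if the name ends in .xz."""
    if path.endswith(".xz"):
        return lzma.open(path, "rb")
    return open(path, "rb")

# Demonstration on a one-pixel image written to a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "foo.pdm")
    with open(original, "wb") as f:
        f.write(b"PDM32\n1 1\n" + b"\x00\x00\x80\x3f")
    compressed = compress_xz(original)
    with open_pdm(compressed) as f:
        restored = f.read()
    print(os.path.basename(compressed), len(restored))
```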
It is recommended that any other data required to interpret the images, such as camera parameters, be included as human-readable data in the comment section.
The following table contains the image size comparison for the first depth image in the dataset (1305031453.374112.png). The PNG optimization was
| 115 KiB | 73 KiB | 1.2 MiB | 81 KiB | 57 KiB | 62 KiB |
The following table contains the image size comparison for all 595 depth images in the dataset. One important thing to note is that when converted to PDM all depth images are placed in the same file. This means that all 4 PDM versions of the dataset consist of a single file instead of 595 individual image files. The PNG optimization was again performed using
| 70 MiB | 44 MiB | 697 MiB | 48 MiB | 33 MiB | 37 MiB |
The image format design was based on the following goals:
It is common to distribute depth images as 16-bit grayscale PNG images. One downside with this approach is that the scaling factor used isn't contained in the image data. Users of the image have to search the dataset documentation to find the appropriate scaling factor to convert 16-bit unsigned integers into floating point values in meters.
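The ambiguity can be illustrated with a short sketch. The function name and both scaling factors below are hypothetical; the point is only that the same raw 16-bit value maps to different distances depending on a number that is not stored in the image.

```python
def png16_to_meters(raw_values, units_per_meter):
    """Convert raw 16-bit depth values to meters using an external scaling factor."""
    return [raw / units_per_meter for raw in raw_values]

raw = [0, 1500, 3000]

# The same pixels under two different (hypothetical) dataset conventions.
print(png16_to_meters(raw, 1000))
print(png16_to_meters(raw, 5000))
```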
There are also floating point image formats (e.g. PFM, TIFF, OpenEXR) but they typically assume that values are always in the range [0, 1] inclusive. Due to this, some of the libraries used to read or write these kinds of images will clamp data to this range. It is possible to scale the data to fit in the [0, 1] range but then the scaling factor is no longer clear, as in the case of 16-bit grayscale images.
Adding compression would complicate the image format and require the use of a compression/decompression library. There are general purpose compression programs already installed in most systems that can be used for this purpose. Even though they may compress less effectively than image-specific compression methods, they are typically good enough, as shown in the benchmarks.
Depth images may be produced by sensors with vastly different projection models. Depth cameras typically use a pinhole camera model, LIDARs use a spherical projection model and an orthographic projection might be used for a heightmap. Accounting for all the potential projection models would make the image format more complex.
The initial design of the PDM format contained a dedicated scale parameter so that values in units other than meters could be stored. This would have had the added benefit of retaining higher precision when depth measurements are known to lie in a certain range. This wasn't deemed an important enough benefit considering the precision already afforded by single precision floats in meters. Single precision floats have a precision of 6-7 decimal digits, thus even values of a few hundred meters have millimeter precision, which is more than enough for current sensors and applications. If this precision proves too little for certain applications, a double precision floating point format can be introduced.
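The precision argument can be checked numerically. The sketch below computes the gap between a distance and the next representable single precision value by stepping the float's bit pattern; the function name is hypothetical and only positive finite inputs are assumed.

```python
import struct

def float32_spacing(value):
    """Gap between `value` rounded to float32 and the next larger float32."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    lower = struct.unpack("<f", struct.pack("<I", bits))[0]
    # For positive finite values, the next bit pattern is the next float.
    upper = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return upper - lower

for meters in (1.0, 10.0, 100.0, 500.0):
    spacing_mm = float32_spacing(meters) * 1000
    print(f"{meters:6.1f} m -> spacing {spacing_mm:.6f} mm")
```

At 500 m the spacing is 2^-15 m, roughly 0.03 mm, which is consistent with the claim of millimeter precision at a few hundred meters.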
The textual header allows easy inspection of a file by humans. The binary row-major data allows direct reading and writing of image data since this is the layout in which it's typically stored in memory.
The majority of systems where this format is expected to be used (x86, Linux on ARM) are little-endian. Requiring the data to be in little-endian order allows simplifying the file format by removing the need for a byte order indicator while still allowing the files to be portable on big-endian systems.
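A fixed byte order means readers and writers can state it explicitly instead of relying on the host. A minimal sketch with Python's struct module, where the "<" prefix selects little-endian on any platform (the helper names are hypothetical):

```python
import struct

def encode_depths(depths):
    """Pack depth values as little-endian float32, whatever the host byte order."""
    return struct.pack(f"<{len(depths)}f", *depths)

def decode_depths(payload):
    """Unpack little-endian float32 values, whatever the host byte order."""
    count = len(payload) // 4
    return list(struct.unpack(f"<{count}f", payload[:count * 4]))

payload = encode_depths([1.0, 2.5])
print(payload.hex())  # 0000803f00002040
print(decode_depths(payload))
```

On a big-endian machine the same calls produce and consume the same bytes, so no byte order indicator is needed in the file.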
Having multiple images per file allows storing a dataset in only a few files. For example, if the PPM image format is used for color images, a full dataset could be a PDM file containing all depth images, a PPM file containing all corresponding color images and a text file containing the corresponding poses. Using a single file for all images allows better compression and faster data reading since opening files is a relatively slow operation.
The Portable Depth Map specification is licensed under the CC BY-ND 4.0 license. This allows freely sharing the specification with proper attribution but doesn't allow derivative specifications to prevent multiple mutually-incompatible standards. This license covers only the text of the specification. You are free to use PDM images and write code that manipulates PDM images under any license.