Add INTERLEAVE=BAND option to COG driver #10859

Open
jakimowb opened this issue Sep 22, 2024 · 5 comments

Comments

@jakimowb

Feature description

Request

It is requested to enhance the COG driver with an option to create COG rasters with band interleave, i.e. a creation option INTERLEAVE=BAND, similar to the one that exists in the GTiff driver.

Problem

According to the OGC COG specification, COGs with multiple bands may be stored with pixel interleave (BIP) or band interleave (BSQ).

When more than one component is encoded in a TIFF file, the TIFF provides two possibilities.

  1. The component values for each pixel are stored contiguously.
    This is marked in the file as PlanarConfiguration=Contig (a.k.a. Chunky format, value 1).
    This is a common arrangement for RGB combinations of bands. The data is stored as
    RGBRGBRGB… (this arrangement is also known as Band Interleaved by Pixel, BIP).

  2. The components are stored in separate component planes. This is marked in the file
    as PlanarConfiguration=Separate (a.k.a. Planar format, value 2). This is the common
    arrangement for the bands of a multispectral image (this arrangement is also known
    as Band Sequential, BSQ).

As of GDAL 3.9.2, the COG driver supports BIP only. Unfortunately, COGs generated this way for multiband datasets, e.g. hyperspectral satellite data with more than 100 bands, cannot be visualized smoothly. Loading the band information block by block, e.g. to visualize it in QGIS, simply takes too long.

Workaround

A current workaround to create BSQ COGs for this kind of raster data is to:

  1. use the GTiff driver and create a raster with TILED=YES and INTERLEAVE=BAND
  2. run gdaladdo to create overview images
  3. run cogger to create the COG images
    (these images validate against osgeo_utils.samples.validate_cloud_optimized_geotiff)
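
The three steps above could be scripted roughly as follows. This is only a sketch: the file names, the overview levels and the `average` resampling are placeholders I picked for illustration, and the `cogger -output` invocation is how I understand the cogger command line, so check it against the version you have installed. The function only builds the command lines and does not run anything.

```python
import shlex


def bsq_cog_commands(src, tmp, dst, levels=(2, 4, 8, 16)):
    """Return the workaround's command lines as argument lists (nothing is executed).

    src, tmp, dst and levels are illustrative placeholders, not values from the issue.
    """
    return [
        # Step 1: tiled GeoTIFF with band interleave (PlanarConfiguration=Separate)
        ["gdal_translate", "-of", "GTiff",
         "-co", "TILED=YES", "-co", "INTERLEAVE=BAND", src, tmp],
        # Step 2: internal overviews; the resampling method is an assumption
        ["gdaladdo", "-r", "average", tmp, *map(str, levels)],
        # Step 3: rewrite as COG with cogger (assumed invocation)
        ["cogger", "-output", dst, tmp],
        # Optional check with the validation script mentioned above
        ["python", "-m", "osgeo_utils.samples.validate_cloud_optimized_geotiff", dst],
    ]


if __name__ == "__main__":
    for cmd in bsq_cog_commands("input.tif", "tiled_bsq.tif", "bsq_cog.tif"):
        print(shlex.join(cmd))
```

To actually run the steps, each list can be passed to `subprocess.run(cmd, check=True)`.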

How to reproduce

  1. Start QGIS
  2. Open Datasource Manager -> Raster -> Protocol
  3. Add the URL of this BSQ COG (created with GDAL + cogger): ENMAP01-____L2A-DT0000001867_20220724T104526Z_008_V010302_20230628T165614Z-SPECTRAL_IMAGE.BSQ_COG.tiff. This BSQ COG was created with the workaround described above. However, it would be nice if GDAL could create these BSQ COGs itself.
  4. Zoom to native resolution and apply this *.qml style to optimize the visualization: ENMAP01-____L2A-DT0000001867_20220724T104526Z_008_V010302_20230628T165614Z-SPECTRAL_IMAGE.BSQ_COG.qml
  5. Zoom and pan around to get a feeling for the loading times.
  6. Now repeat steps 3-5 with the BIP COG ENMAP01-____L2A-DT0000001867_20220724T104526Z_008_V010302_20230628T165614Z-SPECTRAL_IMAGE.BIP_COG.tiff. This image was created with the GDAL COG driver (GDAL 3.9.2).


Additional context

No response

@jakimowb jakimowb changed the title Add INTERLEAVE=BAND to COG driver Add INTERLEAVE=BAND option to COG driver Sep 22, 2024
@rouault
Member

rouault commented Sep 22, 2024

One thing that is not entirely clear to me for INTERLEAVE=BAND in a cloud optimized context is how to place tiles in the file.
I can see 2 alternatives:

  • first all tiles of first band, then all tiles of second band, etc. That's what INTERLEAVE=BAND for the GTiff driver would do
  • or tile (0, 0) of first band, tile (0, 0) of second band, ..., tile (0, 0) of last band, tile (tilex=1, tiley=0) of first band, ..., tile (tilex=1, tiley=0) of last band, etc.

Both could make sense, although I suppose first one would be slightly preferred and closer to what BSQ is.
The second alternative would be a kind of BIL layout adapted for tiling.
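
To make the two alternatives concrete, here is a small sketch that lists the order in which tiles would be written to the file under each scheme. The function name and the `(band, tilex, tiley)` tuples are only an illustrative model, not anything the GDAL drivers expose.

```python
def tile_order(layout, nbands, tiles_x, tiles_y):
    """Return (band, tilex, tiley) tuples in the order tiles would appear in the file."""
    if layout == "BSQ":
        # Alternative 1: all tiles of band 0, then all tiles of band 1, ...
        return [(b, tx, ty)
                for b in range(nbands)
                for ty in range(tiles_y)
                for tx in range(tiles_x)]
    if layout == "pseudo-BIL":
        # Alternative 2: for each tile position, all bands one after the other
        return [(b, tx, ty)
                for ty in range(tiles_y)
                for tx in range(tiles_x)
                for b in range(nbands)]
    raise ValueError(f"unknown layout: {layout}")


if __name__ == "__main__":
    for layout in ("BSQ", "pseudo-BIL"):
        print(layout, tile_order(layout, nbands=2, tiles_x=2, tiles_y=1))
```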

@tbonfort
Member

Both options are valid choices, and selecting one rather than the other heavily depends on the predicted/prevalent access scenarios. I would recommend making this choice user selectable via a configuration option.

@rouault
Member

rouault commented Sep 22, 2024

@cholmes @vincentsarago @joanma747 Do you expect some "disruption" of the COG ecosystem if GDAL would start writing multi-band COG files with INTERLEAVE=BAND ? (although that wouldn't be the default)

@jakimowb
Author

It would be interesting to see what this means in terms of average loading times for the following use cases:

  1. classic map visualization on different zoom levels, i.e. a small number of bands required for a single-band or 3-band combination.
  2. extraction of single pixel profiles (same pixel coordinate but all band values).

@rouault
Member

rouault commented Sep 22, 2024

  1. classic map visualization on different zoom levels, i.e. a small number of bands required for a single-band or 3-band combination.

Let's imagine that your request intersects 2 tiles in the horizontal direction (thus consecutive)

  • if you select 5 consecutive bands,
    • if using BSQ layout, then you need 5 HTTP GET requests, each requesting 2 tiles at a time
    • if using "pseudo-BIL" layout, then you need 2 HTTP GET requests, each requesting 5 bands at a time
  • if you select 5 non-consecutive bands,
    • if using BSQ layout, then you need 5 HTTP GET requests, each requesting 2 tiles at a time
    • if using "pseudo-BIL" layout, then you need 10 HTTP GET requests

Let's imagine that your request intersects 2 tiles in the vertical direction (thus non-consecutive)

  • if you select 5 consecutive bands,
    • if using BSQ layout, then you need 10 HTTP GET requests
    • if using "pseudo-BIL" layout, then you need 2 HTTP GET requests, each requesting 5 bands at a time
  • if you select 5 non-consecutive bands,
    • if using BSQ layout, then you need 10 HTTP GET requests
    • if using "pseudo-BIL" layout, then you need 10 HTTP GET requests

  2. extraction of single pixel profiles (same pixel coordinate but all band values).

  • if using BSQ layout, then you need as many HTTP GET requests as there are bands
  • if using "pseudo-BIL" layout, you need one single HTTP GET request

In both cases, you will need to download the whole blocks for compressed data (for uncompressed data, you could in theory issue a multi-range GET request, but very few servers support that, in particular commercial cloud object storage services don't)

(all the above assumes a "smart" reader. I'm not totally sure that the GDAL driver has all those read optimizations in the multi-band case)
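
The request counts above can be reproduced with a toy model. It assumes an idealized reader that merges tiles which are adjacent in the file into a single range request, and it ignores tile byte sizes, overviews and header reads. The 100-band, 8x8-tile grid and the band selections are made-up numbers for illustration.

```python
from itertools import product


def tile_index(layout, band, tx, ty, nbands, tiles_x, tiles_y):
    """Ordinal position of one tile in the file for the given layout."""
    spatial = ty * tiles_x + tx
    if layout == "BSQ":
        return band * tiles_x * tiles_y + spatial
    if layout == "pseudo-BIL":
        return spatial * nbands + band
    raise ValueError(f"unknown layout: {layout}")


def count_requests(layout, bands, tiles, nbands, tiles_x, tiles_y):
    """Number of range requests if runs of adjacent tiles are fetched in one GET."""
    positions = sorted(
        tile_index(layout, b, tx, ty, nbands, tiles_x, tiles_y)
        for b, (tx, ty) in product(bands, tiles)
    )
    if not positions:
        return 0
    # A new request starts whenever the next tile is not directly after the previous one
    return 1 + sum(1 for a, b in zip(positions, positions[1:]) if b != a + 1)


if __name__ == "__main__":
    grid = dict(nbands=100, tiles_x=8, tiles_y=8)  # hypothetical raster
    horizontal = [(2, 3), (3, 3)]                  # two tiles side by side
    vertical = [(2, 3), (2, 4)]                    # two tiles on top of each other
    consecutive = range(10, 15)                    # 5 consecutive bands
    scattered = [0, 20, 40, 60, 80]                # 5 non-consecutive bands
    for layout in ("BSQ", "pseudo-BIL"):
        print(layout,
              count_requests(layout, consecutive, horizontal, **grid),
              count_requests(layout, scattered, horizontal, **grid),
              count_requests(layout, consecutive, vertical, **grid),
              count_requests(layout, scattered, vertical, **grid),
              count_requests(layout, range(100), [(2, 3)], **grid))
```

With these inputs the model gives 5, 5, 10, 10 and 100 requests for BSQ and 2, 10, 2, 10 and 1 for pseudo-BIL, matching the figures listed above.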
