[ome-devel] TIFF pyramid support in Bio-Formats - reference files for review

Damir Sudar dsudar at lbl.gov
Fri Mar 23 01:40:24 GMT 2018


Hi Roger,

This is extremely useful and will indeed allow creating test images and
examples at will to push through any of the upcoming
Bio-Formats/OMEFiles tools and library. At first glance the example
files you posted nicely follow the proposed specifications and would
indeed align nicely with my needs. I'll spend a bit more time looking
through all this, but here are some quick comments:
- wow, that's some impressive bash scripting on the gnarly scripts to 
create the outputs from the source files!!
- looking at the scripts, I immediately tried using and adapting 
makepyramid-svs to handle .svs files and output from the ImageMagick 
ptif writer so I can readily generate my own very simple example files 
(and understand better how your scripts work). The changes for ptif are 
quite simple but I'm running into an error with both svs and ptif files 
with the command: "tiffset -d 0 -s 330 $nsubifds $subifds_diroffs 
"$dest"". It fails with:
TIFFWriteDirectoryTagSubifd: Illegal value for SubIFD tag.
TIFFWriteDirectoryTagSubifd: Illegal value for SubIFD tag.
I'm using libtiff 4.0.9 and its associated tools on Ubuntu 14.04. The 
error does not happen with the scn files provided. Could it be related 
to having an offset value that is too large for a non-BigTIFF file? See: 
https://gitlab.com/libtiff/libtiff/blob/master/libtiff/tif_dirwrite.c 
Any ideas? I'm enclosing my modified script for ptif and I was using the 
openslide generic pyramid TIFF 
(http://openslide.cs.cmu.edu/download/openslide-testdata/Generic-TIFF/) 
as an example.
- the Leica-Fluorescence-1.scn (and thus all its result files) has the
channels stored in a way that does not exactly conform to how one would
normally store the channels in the OME-standard way. Am I correct in
thinking that those channels would normally each be in a separate
top-level IFD, with their sub-resolutions arranged in SubIFDs under each
of those top-level IFDs?
- also, all these examples carry some of those ancillary images such as 
slide labels, overview images, etc. While the scripts handled those 
correctly as far as I can see, a reader would have to be pretty smart to 
figure out which of the contents of such an output file is the actual 
image and which are those ancillary images.
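
One way to test the large-offset guess from the tiffset comment above is
to read the file header and compare each directory offset against
2^32 - 1, the largest value a classic TIFF offset field can hold. The
following is only a minimal sketch: the helper names (tiff_kind,
fits_uint32, check_offsets) are hypothetical and not part of the
makepyramid scripts, and the offsets are assumed to be the plain decimal
numbers that tiffinfo prints in parentheses.

```shell
#!/bin/bash
# Hypothetical helpers for checking the large-offset guess; they are not
# part of the original makepyramid scripts.

# Print "classic", "bigtiff" or "unknown" from the first four header bytes.
# od -An -tx1 prints the bytes in file order, whatever the host endianness.
# $1=file
tiff_kind() {
    local magic
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$magic" in
        49492a00|4d4d002a) echo classic ;;   # "II" or "MM" followed by 42
        49492b00|4d4d002b) echo bigtiff ;;   # "II" or "MM" followed by 43
        *) echo unknown ;;
    esac
}

# Succeed if a decimal offset fits in the 32-bit field of a classic TIFF.
# $1=offset
fits_uint32() {
    [ "$1" -ge 0 ] && [ "$1" -le 4294967295 ]
}

# Report every offset that a classic TIFF could not store.
# Returns non-zero if at least one such offset was found.
# $1=file, remaining arguments=decimal IFD offsets
check_offsets() {
    local file=$1 kind off bad=0
    shift
    kind=$(tiff_kind "$file")
    echo "$file: $kind"
    for off in "$@"; do
        if [ "$kind" = classic ] && ! fits_uint32 "$off"; then
            echo "offset $off does not fit in 32 bits"
            bad=1
        fi
    done
    return $bad
}
```

With the variables from the attached script this could be called as
check_offsets "$dest" $subifds_diroffs. If it reports nothing, all
offsets fit and the tiffset failure presumably has another cause.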

Thanks,
- Damir

On 3/21/2018 10:18, Roger Leigh wrote:
> This should primarily be of interest to Damir Sudar, and anyone else
> following the TIFF and OME-TIFF sub-resolution support proposal
> (http://openmicroscopy.github.io/design/OME005/).
>
> I have created a set of TIFF and OME-TIFF files containing pyramids
> stored as SUBIFDs, which can be downloaded from Box using this link:
> https://uod.box.com/s/shoodcr8j1zem3atzcql3c69cc8xrm5x
>
> These files implement strategies "B" and "C" in the above proposal
> document.  Taking publicly available sample datasets using existing
> pyramid formats (Leica SCN and NDPI), we have tweaked them to store the
> pyramid data in SUBIFDS.  Files with a "subifds.tiff" suffix implement
> strategy "C" (data in isolated SUBIFDs), while files with a
> "subifds-flat.tiff" implement strategy "B" (SUBIFDs linking back to IFDs
> in the main directory list).  Files with an "ome.tiff" extension are
> exactly the same, but contain OME-XML metadata with TiffData IFD indexes
> pointing to the top-level IFDs containing the SUBIFDS.
>
> I will be using these files for validating the Bio-Formats and OME-Files
> TIFF and OME-TIFF readers as I add support for pyramids to them.
> Likewise for the writers as we add support to bfconvert and can use them
> as source material.  If you want to take a look at the sample file
> structure and see if this is aligned with your needs, matches the
> design, and is generally sensible, that would certainly be appreciated.
> If there are any faults either with the design or the reference files,
> we can certainly amend them.  You can follow up here on ome-devel, or
> open a github issue on https://github.com/openmicroscopy/design
>
> I'd recommend the Leica samples over the NDPI samples, since the tiled
> JPEG pixel data is easier to view.  You can view the file structure with
> a tool such as libtiff's "tiffinfo", which will show the Directory
> entries plus SubIFDs.  You can look at the SubIFD entries with the "-o"
> option with a SubIFD offset.  Viewing is possible with the libtiff
> "tiffgt" tool, again with "-o" and the appropriate IFD offset (though
> you might be constrained by memory and only smaller pyramid levels will
> be viewable).
>
> Note that the sample files won't actually work properly with Bio-Formats
> or OMERO until we have added pyramid support to the readers.  Some fail
> to read properly, or work but are missing the resolution levels. This
> will be rectified soon (I'm working on reader support right now).
> Strategy "B" files in particular don't work nicely with the existing
> MinimalTiffReader, so Strategy "C" continues to be preferable.
>
>
> Kind regards,
> Roger
>
> -- 
> Dr Roger Leigh -- Open Microscopy Environment
> Wellcome Trust Centre for Gene Regulation and Expression,
> College of Life Sciences, University of Dundee, Dow Street,
> Dundee DD1 5EH Scotland UK   Tel: (01382) 386364
>
> The University of Dundee is a registered Scottish Charity, No: SC015096

-- 
Damir Sudar - Affiliate Scientist
Lawrence Berkeley Natl Laboratory / MBIB
One Cyclotron Road, MS 977, Berkeley, CA 94720, USA
T: 510/486-5346 - F: 510/486-5586 - E: DSudar at lbl.gov
http://biosciences.lbl.gov/profiles/damir-sudar-2/

Visiting Scientist, Oregon Health & Science University

-------------- next part --------------
#!/bin/bash

set -e
set -x

# Check if file is BigTIFF
# $1 file
bigtiff() {
    if [ "0000000 4949 002b" != "$(dd if="$1" iflag=count_bytes count=4 | od -x | head -n1)" ]; then
        return 1
    fi
    return 0
}

# Get fieldoffset for TIFF
# $1 file
fieldoffset() {
    if bigtiff "$1"; then
        echo 8
    else
        echo 2
    fi
}

# Get fieldsize for TIFF
# $1 file
fieldsize() {
    if bigtiff "$1"; then
        echo 20
    else
        echo 12
    fi
}

# Get IFD offsets
# $1=file
diroffsets() {
    # Use the file passed as $1 rather than the global $dest.
    tiffinfo "$1" | grep "TIFF Directory at offset" | sed -e 's;.*(\(.*\))$;\1;'
}

# Get offset for IFD
# $1=IFD number
# $2=file
diroffset() {
    diroffsets "$2" | head -n$(($1 + 1)) | tail -n1
}

# Get number of tags in directory
# $1=IFD offset
# $2=file
ntags() {
    echo "od -j $1 -N 2 -d \"$2\"" >&2
    od -j "$1" -N 2 -d "$2" | head -n1 | sed -e 's;^[0-9]* *\(.*\);\1;'
}

# Offset of next pointer in IFD
# $1=IFD offset
# $2=file
nextoffset() {
    echo "$(($1 + $(fieldoffset "$2") + ($(ntags $1 "$2") * $(fieldsize "$2"))))"
}

# Write uint64 little endian value to binary file
# $1=value
# $2=destination file
# $3=offset in file
update_uint64_le() {
    if bigtiff "$2"; then
        printf "$(printf %.16x $1 | sed -e 's;\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)\(..\);\8\7\6\5\4\3\2\1;' | sed -e 's;\(..\);\\x\1;g')" | dd of="$2" conv=notrunc,nocreat oflag=seek_bytes seek=$3
    else
        printf "$(printf %.8x $1 | sed -e 's;\(..\)\(..\)\(..\)\(..\);\4\3\2\1;' | sed -e 's;\(..\);\\x\1;g')" | dd of="$2" conv=notrunc,nocreat oflag=seek_bytes seek=$3
    fi
}

makeuuid() {
    echo "urn:uuid:$(uuidgen)"
}

mkdir -p new

# Iterate with a glob instead of parsing ls output, so paths containing
# spaces are handled correctly.
for srcfile in orig/*.ptif; do
    base="$(basename "$srcfile")"
    dest="new/${base%.ptif}-subifds.tiff"

    cp "$srcfile" "$dest"

    nifds=$(tiffinfo "$dest" | grep "TIFF Directory at offset" | wc -l)
    echo "IFD count: $nifds"

    ifds=$(seq 0 $(($nifds - 1)))
    mainifd=$(echo "$ifds" | head -n1)
#    otherifds=$(echo "$ifds" | tail -n2)
    otherifds=''
    subifds=$(echo "$ifds" | tail -n$((nifds-1)))
    nsubifds=$(echo "$subifds" | wc -l)
    echo "IFDs: $ifds"
    echo "Main IFDs: $mainifd"
    echo "Other IFDs: $otherifds"
    echo "SUBIFDs: $subifds"

    # Main header
    tiffset -d 0 -s 270 "OME Pyramid TIFF test (from $base)" "$dest"
    tiffset -d 0 -s 305 "A gnarly shell script (makepyramid-svs)" "$dest"
    tiffset -d 0 -s 315 "Roger Leigh <rleigh at dundee.ac.uk>" "$dest"

    # NewSubFileType
    for ifd in $mainifd $otherifds; do
        tiffset -d $ifd -s 254 0 "$dest"
    done
    for ifd in $subifds; do
        tiffset -d $ifd -s 254 1 "$dest"
    done

    # SubIFDs
    subifds_diroffs=$(echo $(tiffinfo "$dest" | grep "TIFF Directory at offset" | sed -e 's;.*(\(.*\))$;\1;' | tail -n$((nifds-1))))
    echo "SubIFDs for series 0: $subifds_diroffs"
#    tiffset -d 0 -s 330 $nsubifds $subifds_diroffs "$dest"
    subifds_diroffs=$(echo $(tiffinfo "$dest" | grep "TIFF Directory at offset" | sed -e 's;.*(\(.*\))$;\1;' | tail -n$((nifds-1))))
    echo "Updated SubIFDs for series 0: $subifds_diroffs"

    echo "New directories:"
    diroffsets "$dest"

    dest2="${dest%.tiff}-flat.tiff"
    cp "$dest" "$dest2"

    # Relink IFDs to elide SubIFDs
    series0dir="$(diroffset 0 "$dest")"
    echo "s0dir $series0dir"
#    seriesndir="$(diroffset $(echo "$otherifds" | head -n1) "$dest")"
# this still causes a bug: the subresolutions end up in their own toplevel IFD
    seriesndir="$(diroffset $(echo "$subifds" | head -n1) "$dest")"
    echo "sndir $seriesndir"

    noffset="$(nextoffset $series0dir "$dest")"
    echo "UPDATE: $series0dir $noffset -> [$seriesndir]"
    update_uint64_le $seriesndir "$dest" $noffset

    for offset in $subifds_diroffs; do
        c="$(ntags $offset "$dest")"
        echo "NTAGS: $c"
        noffset="$(nextoffset $offset "$dest")"
        echo "UPDATE: $offset $noffset -> 0"
        update_uint64_le 0 "$dest" $noffset
    done

    tiffinfo "$dest"

    # Create OME-XML metadata for the files.
    dest_ometiff="${dest%.tiff}.ome.tiff"
    dest2_ometiff="${dest2%.tiff}.ome.tiff"
    cp "$dest" "$dest_ometiff"
    cp "$dest2" "$dest2_ometiff"

    bfomexml="$(showinf -nopix -noflat -omexml -omexml-only "$srcfile")"

    # Add TiffData elements.
    uuid="$(makeuuid)"
    ome_attr="Creator=\"makepyramid-svs\" UUID=\"${uuid}\""
    tiffdata_fmt="<TiffData FirstC=\"0\" FirstT=\"0\" FirstZ=\"0\" IFD=\"%d\" PlaneCount=\"1\"><UUID FileName=\"$(basename "${dest_ometiff}")\">${uuid}</UUID></TiffData>"
    tiffdata_fmt_flat="<TiffData FirstC=\"0\" FirstT=\"0\" FirstZ=\"0\" IFD=\"%d\" PlaneCount=\"1\"><UUID FileName=\"$(basename "${dest2_ometiff}")\">${uuid}</UUID></TiffData>"

    omexml_fmt="$(echo "$bfomexml" | sed -e "s;\(<OME.*\)\(\">\);\1\" ${ome_attr}>;" -e "s;<MetadataOnly\/>;${tiffdata_fmt};")"
    omexml_fmt_flat="$(echo "$bfomexml" | sed -e "s;\(<OME.*\)\(\">\);\1\" ${ome_attr}>;" -e "s;<MetadataOnly\/>;${tiffdata_fmt_flat};")"

    ifds="0 1 2"
    omexml="$(printf "$omexml_fmt" $ifds)"

    flatifds="$(echo $mainifd $otherifds)"
    # Use the flat format string so the UUID FileName matches $dest2_ometiff.
    omexml_flat="$(printf "$omexml_fmt_flat" $flatifds)"

    tiffset -d 0 -s 270 "$omexml" "$dest_ometiff"
    tiffset -d 0 -s 270 "$omexml_flat" "$dest2_ometiff"
done
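
The relinking step in the script above relies on two pieces of
arithmetic: where the "next IFD" pointer of a directory lives, and how
the new value is laid out as little-endian bytes. The sketch below
reproduces both in isolation so they can be checked by hand. The
function names (next_pointer_pos, le_hex) are hypothetical, and like the
script it assumes little-endian ("II") files.

```shell
#!/bin/bash
# Illustrative helpers; the names are hypothetical and not part of the
# script above.

# Byte position of the "next IFD" pointer of a directory.
# $1=IFD offset, $2=number of tags, $3=classic|bigtiff
next_pointer_pos() {
    local count_size entry_size
    if [ "$3" = bigtiff ]; then
        count_size=8; entry_size=20   # 8-byte tag count, 20-byte entries
    else
        count_size=2; entry_size=12   # 2-byte tag count, 12-byte entries
    fi
    echo $(( $1 + count_size + $2 * entry_size ))
}

# Print a value as little-endian bytes in hex, lowest byte first.
# $1=value, $2=width in bytes (4 for classic TIFF, 8 for BigTIFF)
le_hex() {
    local value=$1 width=$2 out="" i=0
    while [ "$i" -lt "$width" ]; do
        out="$out$(printf '%02x' $(( (value >> (8 * i)) & 255 )))"
        i=$((i + 1))
    done
    echo "$out"
}

next_pointer_pos 8 10 classic    # 8 + 2 + 10*12 = 130
le_hex 258 4                     # 0x00000102 -> 02010000
```

The pointer itself is 4 bytes wide in a classic TIFF and 8 bytes wide in
a BigTIFF, which is why update_uint64_le writes either 4 or 8 bytes
depending on the file type.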

