[ome-users] OME-TIFF: problem with the "micron" character (micrometer unit)

Christoph Gohlke cgohlke at uci.edu
Wed Sep 6 18:16:41 BST 2017


Thank you. `set JAVA_TOOL_OPTIONS="-Dfile.encoding=UTF-8"` before 
running tiffcomment worked for me.

For redirecting Python UTF-8 output, the equivalent is
`set PYTHONIOENCODING=UTF-8`.
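
If setting an environment variable is not convenient, a script can also wrap its own standard output so that it always emits UTF-8. The following is only a minimal sketch and has not been tried against tiffcomment output; `utf8_writer` is a hypothetical helper name, and an in-memory stream stands in for `sys.stdout.buffer`:

```python
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8."""
    # newline="\n" keeps line endings as written instead of translating them.
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


# Demonstration: an in-memory stream stands in for sys.stdout.buffer.
buffer = io.BytesIO()
writer = utf8_writer(buffer)
writer.write("\u00b5m")  # MICRO SIGN followed by "m"
writer.flush()
print(buffer.getvalue())  # b'\xc2\xb5m'
```

In a real script, `utf8_writer(sys.stdout.buffer)` would take the place of the in-memory stream, so redirected output is UTF-8 whatever the console code page is.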

Christoph


On 9/6/2017 6:42 AM, Curtis Rueden wrote:
> Hi Roger,
> 
>  > The problem lies with the behaviour of Java on Windows.
> 
> Nice detective work. :-)
> 
> I am curious: do you know whether it works with:
> 
> * The PowerShell console
> * A Cygwin terminal
> * An MSYS (e.g., Git Bash) terminal
> 
> ?
> 
> Regards,
> Curtis
> 
> --
> Curtis Rueden
> LOCI software architect - https://loci.wisc.edu/software
> ImageJ2 lead, Fiji maintainer - https://imagej.net/User:Rueden
> 
> 
> On Wed, Sep 6, 2017 at 7:57 AM, Roger Leigh <rleigh at dundee.ac.uk> wrote:
> 
>     On 01/09/17 20:30, Christoph Gohlke wrote:
> 
>         One issue is that the tiffcomment utility outputs XML that is not
>         well formed. OME-XML should be UTF-8 encoded, but tiffcomment
>         apparently encodes with latin1, iso-8859-1, or similar
>         (Bio-Formats 5.6.0 on Windows 10).
>         Try re-encoding the XML file (e.g. in Python3 Q&D):
> 
>         # Re-encode comment.xml from ISO-8859-1 to UTF-8, in place.
>         with open('comment.xml', 'rb') as f:
>             xml = f.read()
>         xml = xml.decode('iso-8859-1').encode('utf-8')
>         with open('comment.xml', 'wb') as f:
>             f.write(xml)
> 
>         Another issue could be that the XML in the ome.tiff file is not
>         encoded correctly. Open the ome.tiff file with a hex editor. The
>         micro sign (µ, U+00B5) should be stored as two bytes (C2 B5) in
>         UTF-8, not as a single byte (B5).
> 
> 
>     The problem lies with the behaviour of Java on Windows.
> 
>     tiffcomment uses System.out.println() to print the comment to standard
>     output, and this uses the default encoding.  On Windows, this is likely
>     to be an old 8-bit codepage such as CP1252, which will result in the
>     output being recoded from UTF-8 to whatever codepage is in use.  Please
>     see
>     https://stackoverflow.com/questions/24803733/default-character-encoding-for-java-console-output
>     for further details.
> 
>     You could try to force the use of UTF-8 by making this change to the
>     bf.bat script which is part of bftools:
> 
> 
>     diff --git a/tools/bf.bat b/tools/bf.bat
>     index 0c56b79388..6f3146e956 100644
>     --- a/tools/bf.bat
>     +++ b/tools/bf.bat
>     @@ -22,6 +22,14 @@ if "%BF_MAX_MEM%" == "" (
>       )
>       set BF_FLAGS=%BF_FLAGS% -Xmx%BF_MAX_MEM%
> 
>     +rem Set the file encoding
>     +if "%BF_ENCODING%" == "" (
>     +  rem Set UTF-8 by default
>     +  set BF_ENCODING=UTF-8
>     +)
>     +set "BF_FLAGS=%BF_FLAGS% -Dfile.encoding=%BF_ENCODING%"
>     +
>     +
>       rem Skip the update check if the NO_UPDATE_CHECK flag is set.
>       if not "%NO_UPDATE_CHECK%" == "" (
>         set BF_FLAGS=%BF_FLAGS% -Dbioformats_can_do_upgrade_check=false
> 
>     It's not something we can enable by default, because file.encoding is
>     not a setting that is meant to be set publicly, but it may help in
>     this case.
> 
>     An alternative solution would be to use a Unix platform such as Linux,
>     FreeBSD or MacOS X with a UTF-8 locale, where the output will always be
>     correctly encoded as UTF-8.
> 
>     As a better long term solution, we could reopen System.out to use a
>     UTF-8 encoding, or to use raw bytes and transfer everything verbatim.
> 
> 
>     Kind regards,
>     Roger
> 
>     --
>     Dr Roger Leigh -- Open Microscopy Environment
>     Wellcome Trust Centre for Gene Regulation and Expression,
>     College of Life Sciences, University of Dundee, Dow Street,
>     Dundee DD1 5EH Scotland UK   Tel: (01382) 386364
> 
>     The University of Dundee is a registered Scottish Charity, No: SC015096
>     _______________________________________________
>     ome-users mailing list
>     ome-users at lists.openmicroscopy.org.uk
>     http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-users
> 
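
The byte-level check described in the quoted messages (C2 B5 versus a lone B5) can also be done in Python rather than with a hex editor. The sketch below is illustrative only: `is_valid_utf8` and `ensure_utf8` are hypothetical helper names, and it assumes that input which is not valid UTF-8 was written in ISO-8859-1 (or a compatible code page such as CP1252 for this character). Unlike the quick re-encoding snippet, it leaves data that is already UTF-8 untouched, which avoids double-encoding.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def ensure_utf8(data: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return data as UTF-8, re-encoding from `fallback` only when needed."""
    if is_valid_utf8(data):
        return data  # already UTF-8; re-encoding would double-encode it
    # Assumption: anything that is not valid UTF-8 is in the fallback encoding.
    return data.decode(fallback).encode("utf-8")


micro = "\u00b5"  # MICRO SIGN
print(micro.encode("utf-8").hex())   # c2b5
print(micro.encode("cp1252").hex())  # b5
print(ensure_utf8(b"\xb5m").hex())   # c2b56d
```

A lone B5 byte is not valid UTF-8, which is why the check can tell the two cases apart for this character. Text in an 8-bit encoding can occasionally happen to be valid UTF-8, so the result is a heuristic rather than a guarantee.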

