[ome-devel] Fwd: Bio-Formats/C++/Python

Curtis Rueden ctrueden at wisc.edu
Sat Oct 23 23:37:22 BST 2010


Hi everyone,

> I don't know much about OMERO.blitz, but my understanding is that it is a
> way of doing inter-process integration cross-language and cross-machine.
> Conversely, Jace provides a way to do Java/C++ integration in-process.
> Hence, I am not sure what you mean by a "Jace-based wrapper via BLITZ."
>

I wanted to reply to my own comment above, because I found out what Michael
was talking about:
  http://www.xuvtools.org/doku.php?id=devel:libblitzbioformats

Looks like XuvTools created their own "BlitzBioFormats" library, which is
apparently just a coincidence and nothing to do with OMERO's "blitz server."

The page above is also the source of the statement that "a single 512×512 8
bit image from BioFormats to C++ takes about 60 sec."

I believe it was Mario Emmenlauer who wrote the BlitzBioFormats library.
Mario, did you see my prior comment (quoted below) about using JNI's
GetByteArrayRegion method? You can find a working example at:

http://dev.loci.wisc.edu/trac/software/browser/trunk/components/native/bf-itk/itkBioFormatsImageIO.cxx?rev=7105

Is this what you meant by "extensions for fast memory access"?

Also, I noticed your last report of success/failure says: "2010.07.11:
libBlitzBioFormats and BlitzBioFormatsViewer (r3881) Compile on MacOS
(64bit) but do not initialize Java VM correctly." Are you passing the
"-Djava.awt.headless=true" flag when creating the JVM from C++? We found
that it made usage of Java from C++ on Mac OS X much easier.

-Curtis

On Tue, Oct 5, 2010 at 5:13 PM, Curtis Rueden <ctrueden at wisc.edu> wrote:

> Hi Michael, Jason & others,
>
>> The access to image formats via the Java-based bioformats has a big
>> performance issue when accessing from C++/Python (or any non-Java system).
>> People at Sybit/Switzerland tried their own Jace-based wrapper via BLITZ
>> (libBlitzBioFormats), but had to improve Jace first, since the import of
>> ONE! 512x512 image took ~60s.
>
>
> I don't know much about OMERO.blitz, but my understanding is that it is a
> way of doing inter-process integration cross-language and cross-machine.
> Conversely, Jace provides a way to do Java/C++ integration in-process.
> Hence, I am not sure what you mean by a "Jace-based wrapper via BLITZ."
>
> The solutions we currently provide are:
>
> 1) In-process: use bf-cpp, the Bio-Formats C++ bindings (which use Jace).
> 2) Inter-process: access Bio-Formats over Ice. Currently not maintained.
>
> I wrote a detailed web page discussing these options and more at:
>   http://www.loci.wisc.edu/bio-formats/interfacing-non-java-code
>
> Together with some pages on Codemesh's web site (particularly:
> http://codemesh.com/in_process.html, http://codemesh.com/technology.html),
> it is a good summary of the pros and cons of these various approaches.
>
> I spent a lot of time on bf-cpp and I can say with confidence that
> performance is fairly comparable to running in pure Java as long as you take
> care to avoid JNI calls per pixel.
>
> E.g., if you have a ByteArray (java byte[]) called buf, rather than
> accessing buf[i] in a for loop, use JNI's "GetByteArrayRegion" method:
>
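> // Allocate a native buffer for one image plane (release with delete[]).
> jbyte* jData = new jbyte[bytesPerPlane];
> // Attach the current thread to the JVM and obtain its JNI environment.
> JNIEnv* env = jace::helper::attach();
> // Unwrap the Jace ByteArray to get the underlying JNI array handle.
> jbyteArray jArray = static_cast<jbyteArray>(buf.getJavaJniArray());
> // Copy the entire plane across the JNI boundary in a single call.
> env->GetByteArrayRegion(jArray, 0, bytesPerPlane, jData);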
>
>
> This copies the Java array into a data block in the C++ application's
> memory—the jData pointer can then be cast to whatever type you wish.
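As a rough sketch of that last step (not taken from bf-cpp): once the plane is in native memory, per-pixel work is cheap because it no longer crosses JNI. The helper below decodes a copied plane into 16-bit pixels. The names decodeUint16Plane and PlaneByte are hypothetical; PlaneByte stands in for jbyte so the snippet builds without jni.h, and the littleEndian flag is assumed to come from the reader (e.g., its isLittleEndian() method).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for JNI's jbyte (a signed 8-bit integer), so that
// this sketch compiles without jni.h.
using PlaneByte = std::int8_t;

// Decode a plane of raw bytes into 16-bit pixels, honoring the byte order
// reported for the file. All work happens in native memory.
std::vector<std::uint16_t> decodeUint16Plane(const PlaneByte* data,
                                             std::size_t byteCount,
                                             bool littleEndian) {
  assert(byteCount % 2 == 0);  // 16-bit pixels occupy two bytes each
  std::vector<std::uint16_t> pixels(byteCount / 2);
  for (std::size_t i = 0; i < pixels.size(); ++i) {
    // Convert through uint8_t so negative byte values do not sign-extend.
    const std::uint16_t b0 = static_cast<std::uint8_t>(data[2 * i]);
    const std::uint16_t b1 = static_cast<std::uint8_t>(data[2 * i + 1]);
    pixels[i] = littleEndian ? static_cast<std::uint16_t>(b0 | (b1 << 8))
                             : static_cast<std::uint16_t>((b0 << 8) | b1);
  }
  return pixels;
}
```

With the snippet quoted above, the call would look something like decodeUint16Plane(jData, bytesPerPlane, littleEndian). Assembling each pixel from individual bytes sidesteps the alignment and host byte-order questions that a plain pointer cast leaves open.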
>
> This advice goes for whatever integration solution you use, be it
> in-process or inter-process: treat the communication layer between Java and
> non-Java as a bottleneck, and minimize method calls across that bridge.
>
> You can view some working examples at:
>
> http://dev.loci.wisc.edu/trac/software/browser/trunk/components/native/bf-itk/itkBioFormatsImageIO.cxx
>
> http://dev.loci.wisc.edu/trac/software/browser/trunk/components/native/bf-cpp/source/showinf.cpp
>
> http://dev.loci.wisc.edu/trac/software/browser/trunk/components/native/bf-cpp/source/minimum_writer.cpp
>
> We use bf-cpp daily in WiscScan (LOCI's internal acquisition software) to
> produce OME-TIFF. We also use bf-cpp as part of our Bio-Formats/ITK/FARSIGHT
> integration, which is completely functional. And the V3D team uses bf-cpp
> for their V3D Bio-Formats plugin. So I think the bindings are reasonably
> useful and performant.
>
>> CellProfiler has its own way by launching a Java VM from Python.
>
>
> This is true. As for why Lee didn't use the C++ bindings, my understanding
> is that he felt that bf-cpp was unnecessarily complex for what he was trying
> to do, and he wanted to avoid the dependency on boost-thread. However, as a
> consequence his solution is very limited in scope—e.g., it does not scale
> well to more complex API calls, and it may have problems in a multi-threaded
> application.
>
>> The simple question behind this is: how to make the access to bioformats
>> simpler and faster, which is an issue for the growing Python community.
>
>
> Could you please clarify whether your colleagues were attempting to use
> bf-cpp, or some other solution? I will second Jason's suggestion that we
> work together, and add that clear communication is crucial. If you are
> having difficulty using Bio-Formats from Python or C++, let us know the
> details so that we can help troubleshoot, and improve the technology as
> needed.
>
> In the case of bf-cpp, I must apologize for the lack of documentation—I
> haven't fleshed out a dedicated web page for bf-cpp yet, other than the
> build instructions on the FARSIGHT wiki:
>
> http://www.farsight-toolkit.org/wiki/FARSIGHT_Tutorials/Building_Software/Bio-Formats/Building_C%2B%2B_Bindings
>
>> I see two solutions for that.
>>
>> 1. Using the CellProfiler implementation as a standalone package.
>> Performance is unknown to me. Short term issue.
>
>
> I would caution against this approach. I don't think Lee intended the
> CellProfiler implementation to be used as an external library. And I have
> reservations about supporting two separate in-process solutions for
> Bio-Formats.
>
> That said, Lee did tell me the incantation needed to use the CP Bio-Formats
> module:
>   from cellprofiler.modules.loadimages import load_using_bioformats
>
>> 2. I was wondering with Carolina Wählby from the Broad how much work it
>> really is to collect the most needed formats for HC/HT screening and rewrite
>> bioformats as a pure C++ library using the highly developed
>> libtiff/libpng/libjpeg while providing a Python interface.
>
>
> We have heard this sort of proposal before—e.g., from proponents of ITK and
> the BSD license—and it seems to stem from language and/or licensing
> preferences more than anything else. The reality is: if it's written in C++
> you need Java wrappers to call from Java programs, and vice versa. There is
> no way to escape it as long as this dichotomy between C++ and Java
> exists. And I must strongly caution that even if you did translate
> portions of Bio-Formats to C++, you are unlikely to see a substantial
> performance benefit in either space or time—certainly not enough to justify
> the time needed for the effort.
>
> As for time needed, Bio-Formats has been approximately 10-15 man-years of
> work so far, and it is reasonable to assume a language translation to C++ (even
> for only a subset of formats) would take at least a few man-years—and that
> doesn't address the subsequent issue of maintaining a forked codebase.
>
>> For TIFF derivative formats (and there are many) this would be a simple job
>> and there are C++ libs out there solving the problem already.
>
>
> Beware that some commercial TIFF variants violate the TIFF standard, which
> can cause problems with libtiff. That said, libtiff is great and if someone
> did want to support all these formats from C++, using libtiff whenever
> possible would be the way to go.
>
> Another approach people have used successfully is invoking Java/Bio-Formats
> via system calls, and reading the results via stdout or from a file. This is
> the integration approach we used with the OME perl server, and it also
> worked very well.
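For anyone wanting to try the system-call route from C++, here is a minimal sketch under a few assumptions: a POSIX system (popen/pclose), and some Java command that prints its results to stdout. The function name captureStdout is hypothetical, and nothing here is specific to Bio-Formats.

```cpp
#include <array>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Run a shell command and return everything it wrote to stdout.
// Uses POSIX popen/pclose; Windows would need _popen/_pclose instead.
std::string captureStdout(const std::string& command) {
  FILE* pipe = popen(command.c_str(), "r");
  if (pipe == nullptr) {
    throw std::runtime_error("popen failed for: " + command);
  }
  std::string output;
  std::array<char, 4096> chunk;
  std::size_t n = 0;
  // Read in large chunks until the child process closes its stdout.
  while ((n = std::fread(chunk.data(), 1, chunk.size(), pipe)) > 0) {
    output.append(chunk.data(), n);
  }
  // pclose waits for the child; treat any nonzero status as a failure.
  if (pclose(pipe) != 0) {
    throw std::runtime_error("command failed: " + command);
  }
  return output;
}
```

A caller might use captureStdout("showinf -nopix image.tif") to collect metadata text, assuming the Bio-Formats command-line tools are on the PATH; the file name is only a placeholder. The JVM startup cost is paid once per invocation, so this suits coarse-grained calls such as one per file rather than one per plane.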
>
>> I guess CellProfiler has the same problem. Any opinions?
>
>
> My understanding is that Lee and Adam (the CP developers) have solved the
> Java integration issue from CP at this point, both for Bio-Formats and for
> ImageJ, and for all three major platforms (Windows, Mac OS X, Linux). There
> were issues with AWT calls from native code on Mac OS X, but they were
> resolved with a custom inter-process solution using sockets. So if there are
> any outstanding roadblocks there, I don't know of them.
>
> In conclusion, thanks for your feedback—I'm glad it made it to a public
> list. In the future, it would be great to keep an open line of
> communication, so that we can help to solve your problems and improve the
> quality of the software.
>
> -Curtis
>
> On Fri, Oct 1, 2010 at 11:45 AM, Jason Swedlow <jason at lifesci.dundee.ac.uk
> > wrote:
>
>> Dear All-
>>
>> Michael Held (ETH Zurich) wrote this comment about Python implementation
>> of Bio-Formats.  Any comments, feedback, or other similar experiences out
>> there?
>>
>> In general, our own preference would be to ensure there is a **single**
>> resource for file format translation.  Maintaining more than one just
>> duplicates effort on something that is very difficult and tedious, even at
>> the best of times.  While I do agree that making a Python HCS-only reader is
>> possible, and probably not terribly hard, it's the maintenance and updates
>> of this resource that is really time consuming in the end.  Until the
>> various vendors coalesce around a single standard, that is just true.  So,
>> if possible, let's reuse as much as we have, and work together on a
>> **single** resource, whatever it is.
>>
>> Regarding the timings quoted, I'll let Curtis respond, but something
>> sounds pretty wrong.
>>
>> Comments??
>>
>> Cheers,
>>
>> Jason
>>
>>
>>
>>
>> Begin forwarded message:
>>
>> - The access to image formats via the Java-based bioformats has a big
>> performance issue when accessing from C++/Python (or any non-Java system).
>> People at Sybit/Switzerland tried their own Jace-based wrapper via BLITZ
>> (libBlitzBioFormats), but had to improve Jace first, since the import of
>> ONE! 512x512 image took ~60s. CellProfiler has its own way by launching a
>> Java VM from Python.
>> The simple question behind this is: how to make the access to bioformats
>> simpler and faster, which is an issue for the growing Python community.
>> I see two solutions for that.
>> 1. Using the CellProfiler implementation as a standalone package.
>> Performance is unknown to me. Short term issue.
>> 2. I was wondering with Carolina Wählby from the Broad how much work it
>> really is to collect the most needed formats for HC/HT screening and rewrite
>> bioformats as a pure C++ library using the highly developed
>> libtiff/libpng/libjpeg while providing a Python interface. For TIFF derivative
>> formats (and there are many) this would be a simple job and there are C++ libs
>> out there solving the problem already.
>> I guess CellProfiler has the same problem. Any opinions?
>>
>>
>>
>>
>>  **************************
>> Wellcome Trust Centre for Gene Regulation & Expression
>> College of Life Sciences
>> MSI/WTB/JBC Complex
>> University of Dundee
>> Dow Street
>> Dundee  DD1 5EH
>> United Kingdom
>>
>> phone (01382) 385819
>> Intl phone:  44 1382 385819
>> FAX   (01382) 388072
>>  email: jason at lifesci.dundee.ac.uk
>>
>>  Lab Page: http://gre.lifesci.dundee.ac.uk/staff/jason_swedlow.html
>> Open Microscopy Environment: http://openmicroscopy.org
>> **************************
>>
>> The University of Dundee is a Scottish Registered Charity, No. SC015096.
>>
>>
>>
>>
>> _______________________________________________
>> ome-devel mailing list
>> ome-devel at lists.openmicroscopy.org.uk
>> http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel
>>
>>
>