Hi Michael, Jason & others,<div><br></div><div><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
The access to image formats via the Java-based bioformats has a big performance issue when accessing from C++/Python (or any non-Java system). People at Sybit/Switzerland tried their own Jace-based wrapper via BLITZ (libBlitzBioFormats), but had to improve Jace first, since the import of ONE! 512x512 image took ~60s.</blockquote>
<div><br></div><div><div>I don't know much about OMERO.blitz, but my understanding is that it is a way of doing inter-process integration cross-language and cross-machine. Conversely, Jace provides a way to do Java/C++ integration in-process. Hence, I am not sure what you mean by a "Jace-based wrapper via BLITZ."</div>
</div><div><br></div><div><div>The solutions we currently provide are:</div><div><br></div><div>1) In-process: use bf-cpp, the Bio-Formats C++ bindings (which use Jace).</div><div>2) Inter-process: access Bio-Formats over Ice. Currently not maintained.</div>
<div><br></div><div>I wrote a detailed web page discussing these options and more at:</div><div> <a href="http://www.loci.wisc.edu/bio-formats/interfacing-non-java-code">http://www.loci.wisc.edu/bio-formats/interfacing-non-java-code</a></div>
<div><br></div><div>Together with some pages on Codemesh's web site (particularly: <a href="http://codemesh.com/in_process.html">http://codemesh.com/in_process.html</a>, <a href="http://codemesh.com/technology.html">http://codemesh.com/technology.html</a>), it is a good summary of the pros and cons of these various approaches.</div>
<div><br></div><div>I spent a lot of time on bf-cpp and I can say with confidence that performance is fairly comparable to running in pure Java as long as you take care to avoid JNI calls per pixel.</div><div><br></div><div>
E.g., if you have a ByteArray (Java byte[]) called buf, rather than accessing buf[i] in a for loop (where every access crosses the JNI boundary), use JNI's "GetByteArrayRegion" method to copy the whole plane in one call:</div><div><br></div></div></div><blockquote class="webkit-indent-blockquote" style="margin: 0 0 0 40px; border: none; padding: 0px;">
<div><div><div><div><div>jbyte* jData = new jbyte[bytesPerPlane];</div></div><div>JNIEnv* env = jace::helper::attach();</div></div></div></div><div><div><div><div>jbyteArray jArray = static_cast&lt;jbyteArray&gt;(buf.getJavaJniArray());</div>
</div></div></div><div><div><div><div>env->GetByteArrayRegion(jArray, 0, bytesPerPlane, jData);</div></div></div></div></blockquote><div><div><div><br></div><div>This copies the Java array into a data block in the C++ application's memory—the jData pointer can then be cast to whatever type you wish.</div>
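<div><br></div><div>As a minimal sketch of that last step (this is not part of bf-cpp): a raw pointer cast of jData to a wider type silently assumes that the byte order of the pixel data matches the host's, so it can be safer to assemble multi-byte pixels explicitly. The names Byte and toUint16Pixels are hypothetical; Byte stands in for JNI's jbyte so that the sketch compiles without jni.h, and the littleEndian flag is assumed to come from the reader's metadata.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for JNI's jbyte (a signed 8-bit integer), so that
// this sketch compiles without jni.h.
using Byte = std::int8_t;

// Interpret a plane that was already copied out of Java in ONE bulk call
// as 16-bit unsigned pixels. No bridge calls happen in this loop: it only
// touches memory owned by the C++ side.
std::vector<std::uint16_t> toUint16Pixels(const std::vector<Byte>& plane,
                                          bool littleEndian) {
  // A 16-bit plane must contain an even number of bytes.
  assert(plane.size() % 2 == 0);
  std::vector<std::uint16_t> pixels(plane.size() / 2);
  for (std::size_t i = 0; i < pixels.size(); ++i) {
    // Convert to unsigned first so negative bytes do not sign-extend.
    const auto b0 = static_cast<std::uint8_t>(plane[2 * i]);
    const auto b1 = static_cast<std::uint8_t>(plane[2 * i + 1]);
    pixels[i] = littleEndian
        ? static_cast<std::uint16_t>(b0 | (b1 << 8))
        : static_cast<std::uint16_t>((b0 << 8) | b1);
  }
  return pixels;
}
```

<div>In real code the input would be the jData block filled by GetByteArrayRegion. The point of the design is that the expensive crossing happens once per plane, and all per-pixel work stays on the native side.</div>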
<div><br></div><div>This advice goes for whatever integration solution you use, be it in-process or inter-process: treat the communication layer between Java and non-Java as a bottleneck, and minimize method calls across that bridge.</div>
<div><br></div><div>You can view some working examples at:</div><div> <a href="http://dev.loci.wisc.edu/trac/software/browser/trunk/components/native/bf-itk/itkBioFormatsImageIO.cxx">http://dev.loci.wisc.edu/trac/software/browser/trunk/components/native/bf-itk/itkBioFormatsImageIO.cxx</a></div>
<div> <a href="http://dev.loci.wisc.edu/trac/software/browser/trunk/components/native/bf-cpp/source/showinf.cpp">http://dev.loci.wisc.edu/trac/software/browser/trunk/components/native/bf-cpp/source/showinf.cpp</a></div>
<div>
<a href="http://dev.loci.wisc.edu/trac/software/browser/trunk/components/native/bf-cpp/source/minimum_writer.cpp">http://dev.loci.wisc.edu/trac/software/browser/trunk/components/native/bf-cpp/source/minimum_writer.cpp</a></div>
<div><br></div><div>We use bf-cpp daily in WiscScan (LOCI's internal acquisition software) to produce OME-TIFF. We also use bf-cpp as part of our Bio-Formats/ITK/FARSIGHT integration, which is completely functional. And the V3D team uses bf-cpp for their V3D Bio-Formats plugin. So I think the bindings are reasonably useful and performant.</div>
</div><div><br></div><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
CellProfiler has its own way by launching a Java VM from Python. </blockquote><div><br></div><div><div>This is true. As for why Lee didn't use the C++ bindings, my understanding is that he felt that bf-cpp was unnecessarily complex for what he was trying to do, and he wanted to avoid the dependency on boost-thread. However, as a consequence his solution is very limited in scope—e.g., it does not scale well to more complex API calls, and it may have problems in a multi-threaded application.</div>
</div><div><br></div><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
The simple question behind this is: how to make the access to bioformats simpler and faster, which is an issue for the growing Python community.</blockquote><div><br></div><div><div>Could you please clarify whether your colleagues were attempting to use bf-cpp, or some other solution? I will second Jason's suggestion that we work together, and add that clear communication is crucial. If you are having difficulty using Bio-Formats from Python or C++, let us know the details so that we can help troubleshoot, and improve the technology as needed.</div>
<div><br></div><div>In the case of bf-cpp, I must apologize for the lack of documentation—I haven't fleshed out a dedicated web page for bf-cpp yet, other than the build instructions on the FARSIGHT wiki:</div><div> <a href="http://www.farsight-toolkit.org/wiki/FARSIGHT_Tutorials/Building_Software/Bio-Formats/Building_C%2B%2B_Bindings">http://www.farsight-toolkit.org/wiki/FARSIGHT_Tutorials/Building_Software/Bio-Formats/Building_C%2B%2B_Bindings</a></div>
</div><div><br></div><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
I see two solutions for that.</blockquote><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
1. Using the CellProfiler implementation as a standalone package. Performance is unknown to me. Short term issue.</blockquote><div><br></div><div><div>I would caution against this approach. I don't think Lee intended the CellProfiler implementation to be used as an external library. And I have reservations about supporting two separate in-process solutions for Bio-Formats.</div>
</div><div><br></div><div>That said, Lee did tell me the incantation needed to use the CP Bio-Formats module:</div><div><div> from cellprofiler.modules.loadimages import load_using_bioformats</div><div><br></div></div><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
2. I was wondering with Carolina Wählby from the Broad how much work it really is to collect the most needed formats for HC/HT screening and rewrite bioformats as a pure C++ library using the highly developed libtiff/libpng/libjpeg while providing a Python interface.</blockquote>
<div><br></div><div>We have heard this sort of proposal before (e.g., from proponents of ITK and the BSD license), and it seems to stem from language and/or licensing preferences more than anything else. The reality is: if it's written in C++ you need Java wrappers to call it from Java programs, and vice versa. There is no way to escape it as long as this dichotomy between C++ and Java exists. And I must strongly caution that even if you did translate portions of Bio-Formats to C++, you are unlikely to see a substantial performance benefit in either space or time, and certainly not enough to justify the effort.</div>
<div><br></div><div>As for time needed, Bio-Formats has been approximately 10-15 man-years of work so far, and it is reasonable to assume a translation to C++ (even for only a subset of formats) would take at least a few man-years, and that doesn't address the subsequent issue of maintaining a forked codebase.</div>
<div><br></div><div><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
For TIFF derivative formats (and there are many) this would be a simple job and there are C++ libs out there solving the problem already. </blockquote><div><br></div><div>Beware that some commercial TIFF variants violate the TIFF standard, which can cause problems with libtiff. That said, libtiff is great and if someone did want to support all these formats from C++, using libtiff whenever possible would be the way to go.</div>
<div></div></div><div><br></div><div><div><div>Another approach people have used successfully is invoking Java/Bio-Formats via system calls, and reading the results via stdout or from a file. This is the integration approach we used with the OME perl server, and it also worked very well.</div>
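<div><br></div><div>A minimal sketch of the system-call approach, assuming a POSIX system where popen is available. The helper name captureStdout is hypothetical, and this is not how the OME perl server was implemented; it only illustrates launching a separate process and collecting what it prints.</div>

```cpp
#include <array>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Run a shell command and return everything it wrote to stdout.
// Relies on POSIX popen/pclose; throws if the command cannot be
// started or finishes with a non-zero status.
std::string captureStdout(const std::string& command) {
  FILE* pipe = popen(command.c_str(), "r");
  if (!pipe) {
    throw std::runtime_error("popen failed: " + command);
  }
  std::string output;
  std::array<char, 4096> chunk;
  std::size_t n = 0;
  // Read in large chunks rather than byte by byte.
  while ((n = fread(chunk.data(), 1, chunk.size(), pipe)) > 0) {
    output.append(chunk.data(), n);
  }
  const int status = pclose(pipe);
  if (status != 0) {
    throw std::runtime_error("command failed: " + command);
  }
  return output;
}
```

<div>The command string would be whatever Java invocation prints the metadata or pixels you need; the exact Bio-Formats command line is left out here because it depends on your installation. Each call pays JVM startup cost, so this tends to suit coarse-grained requests (a whole file or plane per call) rather than many small ones.</div>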
</div><div><br></div></div><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
I guess CellProfiler has the same problem. Any opinions?</blockquote><div><br></div><div><div>My understanding is that Lee and Adam (the CP developers) have solved the Java integration issue from CP at this point, both for Bio-Formats and for ImageJ, and for all three major platforms (Windows, Mac OS X, Linux). There were issues with AWT calls from native code on Mac OS X, but they were resolved with a custom inter-process solution using sockets. So if there are any outstanding roadblocks there, I don't know of them.</div>
</div><div><br></div><div><div>In conclusion, thanks for your feedback—I'm glad it made it to a public list. In the future, it would be great to keep an open line of communication, so that we can help to solve your problems and improve the quality of the software.</div>
</div><div><br></div><div>-Curtis</div><div><br><div class="gmail_quote">On Fri, Oct 1, 2010 at 11:45 AM, Jason Swedlow <span dir="ltr"><<a href="mailto:jason@lifesci.dundee.ac.uk">jason@lifesci.dundee.ac.uk</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div style="word-wrap:break-word">Dear All-<div><br></div><div>Michael Held (ETH Zurich) wrote this comment about Python implementation of Bio-Formats. Any comments, feedback, or other similar experiences out there?</div>
<div><br></div><div>In general, our own preference would be to ensure there is a **single** resource for file format translation. Maintaining more than one just duplicates effort on something that is very difficult and tedious, even at the best of times. While I do agree that making a Python HCS-only reader is possible, and probably not terribly hard, it's the maintenance and updates of this resource that is really time consuming in the end. Until the various vendors coalesce around a single standard, that is just true. So, if possible, let's reuse as much as we have, and work together on a **single** resource, whatever it is.</div>
<div><br></div><div>Regarding the timings quoted, I'll let Curtis respond, but something sounds pretty wrong.</div><div><br></div><div>Comments??</div><div><br></div><div>Cheers,</div><div><br></div><div>Jason</div><div>
<br></div><div><br></div><div><br><div><br><div>Begin forwarded message:</div><br><blockquote type="cite"><span style="border-collapse:separate;color:rgb(0, 0, 0);font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">- The access to image formats via the Java-based bioformats has a big performance issue when accessing from C++/Python (or any non-Java system). People at Sybit/Switzerland tried their own Jace-based wrapper via BLITZ (libBlitzBioFormats), but had to improve Jace first, since the import of ONE! 512x512 image took ~60s. CellProfiler has its own way by launching a Java VM from Python.<span> </span><br>
The simple question behind this is: how to make the access to bioformats simpler and faster, which is an issue for the growing Python community.<br>I see two solutions for that.<br>1. Using the CellProfiler implementation as a standalone package. Performance is unknown to me. Short term issue.<br>
2. I was wondering with Carolina Wählby from the Broad how much work it really is to collect the most needed formats for HC/HT screening and rewrite bioformats as a pure C++ library using the highly developed libtiff/libpng/libjpeg while providing a Python interface. For TIFF derivative formats (and there are many) this would be a simple job and there are C++ libs out there solving the problem already.<span> </span><br>
I guess CellProfiler has the same problem. Any opinions?<br></span></blockquote></div><br></div><br><font size="3"><span style="font-size:12px"><span style="font-size:medium"><br></span></span></font><div><span style="border-collapse:separate;color:rgb(0, 0, 0);font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"><div style="word-wrap:break-word">
<span style="border-collapse:separate;color:rgb(0, 0, 0);font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"><div style="font-family:Helvetica">
<span style="font-family:Helvetica">**************************</span></div><div style="font-family:Helvetica"><font face="Arial">Wellcome Trust Centre for Gene Regulation & Expression</font></div><div style="font-family:Helvetica">
<font face="Arial">College of Life Sciences</font></div><div style="font-family:Helvetica"><font face="Arial">MSI/WTB/JBC Complex</font></div><div style="font-family:Helvetica"><span style="font-family:Helvetica">University of Dundee</span></div>
<div style="font-family:Helvetica"><span style="font-family:Helvetica">Dow Street</span></div><div style="font-family:Helvetica"><span style="font-family:Helvetica">Dundee DD1 5EH</span></div><div style="font-family:Helvetica">
<span style="font-family:Helvetica">United Kingdom</span></div><div style="font-family:Helvetica"><br style="font-family:Helvetica"></div><div style="font-family:Helvetica"><span style="font-family:Helvetica">phone (01382) 385819</span></div>
<div style="font-family:Helvetica"><span style="font-family:Helvetica">Intl phone: 44 1382 385819 </span></div><div style="font-family:Helvetica"><span style="font-family:Helvetica">FAX (01382) 388072 </span></div><div style="font-family:Helvetica">
<span style="font-family:Helvetica">email: <a href="mailto:jason@lifesci.dundee.ac.uk" target="_blank">jason@lifesci.dundee.ac.uk</a></span></div><div style="font-family:Helvetica"><br style="font-family:Helvetica"></div>
<div style="font-family:Helvetica"><span style="font-family:Helvetica">Lab Page: <a href="http://gre.lifesci.dundee.ac.uk/staff/jason_swedlow.html" target="_blank">http://gre.lifesci.dundee.ac.uk/staff/jason_swedlow.html</a></span></div>
<div style="font-family:Helvetica"><span style="font-family:Helvetica">Open Microscopy Environment: <a href="http://openmicroscopy.org" target="_blank">http://openmicroscopy.org</a></span></div><div style="font-family:Helvetica">
<span style="font-family:Helvetica">**************************</span></div><div style="font-family:Helvetica"><br></div><div style="font-family:Helvetica"><div>The University of Dundee is a Scottish Registered Charity, No. SC015096.</div>
</div><br></span><br></div></span> </div><br></div><br>_______________________________________________<br>
<br></blockquote></div><br></div></div>