[ome-users] Using Bio-formats through Python

Thu Nov 1 04:59:48 GMT 2012

Hi all,
  I work on a project called MyTardis [1], a Django-based
research data repository used particularly for protein
crystallography, electron microscopy, and synchrotron science at
research institutions around Australia. New deployments typically
spend time writing filters to read file formats, mostly in order to
extract metadata (and occasionally for data visualisation too). It has
occurred to us [2] that it might make more sense to use (and contribute
to) an existing library like Bio-Formats. The data files include
standard image files (.tif), instrument-specific image files (e.g., FEI,
Philips...), numeric spectrum data, and I think some 3D formats. We
may be starting work on some neuroimaging data (NIfTI, etc.) soon.

Bio-Formats is written in Java, whereas MyTardis is Python. Judging from
http://loci.wisc.edu/bio-formats/interfacing-non-java-code, the most
sensible approach is perhaps Ice?

So, I was wondering if anyone has experience with this kind of Python
usage, and perhaps has some sample code to share?

Our major use cases would be:
1) At the time of (asynchronous) import, MyTardis would call
Bio-Formats to request specific information about the file: instrument
parameters, user name, image dimensions, or whatever else is appropriate.
2) (Less important) Calling Bio-Formats to extract pixel-level data,
or even to do wholesale image conversions (e.g., to .jpeg).
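
To make use case 1 concrete, here is a rough sketch of the kind of call I
have in mind. It does not use Ice. Instead it assumes the Bio-Formats
command-line tools are installed, that `showinf` is on the PATH, and that
`showinf -nopix -omexml-only -no-upgrade <file>` prints only OME-XML on
standard output. The function names and the choice of which attributes to
pull out are my own and purely illustrative.

```python
import subprocess
import xml.etree.ElementTree as ET

# Dimension attributes carried by the OME-XML <Pixels> element.
DIMENSION_ATTRS = ("SizeX", "SizeY", "SizeZ", "SizeC", "SizeT")


def read_ome_xml(path, showinf="showinf"):
    """Run showinf on one file and return whatever it prints on stdout.

    Assumption: 'showinf' (Bio-Formats command-line tools) is on PATH and
    the flags below make it print OME-XML without reading pixel data.
    """
    result = subprocess.run(
        [showinf, "-nopix", "-omexml-only", "-no-upgrade", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if showinf fails
    )
    return result.stdout


def parse_image_dimensions(ome_xml):
    """Return one dict of integer dimensions per <Pixels> element."""
    root = ET.fromstring(ome_xml)
    dimensions = []
    for elem in root.iter():
        # Drop the '{namespace}' prefix so any OME schema version matches.
        if elem.tag.rsplit("}", 1)[-1] == "Pixels":
            dimensions.append(
                {name: int(elem.attrib[name])
                 for name in DIMENSION_ATTRS if name in elem.attrib}
            )
    return dimensions
```

A background import task could then call
parse_image_dimensions(read_ome_xml("/path/to/file.tif")) and store the
result alongside the file record. Instrument parameters and user names
would need similar lookups on other OME-XML elements, and the bfconvert
tool from the same bundle might cover use case 2 in a similar way, though
I have not tried either.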

Performance is not a major consideration for most deployments:
imports take place in the background, usually after the researcher has
left the instrument.

Thanks,
Steve

Victorian e-Research Strategic Initiative (Melbourne)

[1] http://mytardis.github.com/ - new web page in development
[2] Prompted by meeting Jason Swedlow at an e-Research conference in
Sydney this week.