[ome-devel] OMERO HPC and Hadoop

Simon Li s.p.li at dundee.ac.uk
Fri Aug 29 16:34:01 BST 2014


Hi everyone

As many of you know, Simone Leo from Gianluigi Zanetti's group at CRS4 has
been working in Dundee for the past few months, exploring ways to
interface OMERO with an HPC cluster.  We thought some people on this list
might be interested in the current progress.

Most of the work so far has been on trying to use Hadoop
(http://hadoop.apache.org) with OMERO using Pydoop
(http://pydoop.sourceforge.net), a Python interface for Hadoop developed
by CRS4 (http://www.crs4.it).  There are two main strands to this work:
sending jobs for processing from OMERO to Hadoop, and transferring data
in/out of the Hadoop cluster.

We're using image feature calculation with WND-CHARM
(https://github.com/wnd-charm/wnd-charm) as a use case.  I'm currently
able to calculate features for multiple images simultaneously using a
local Hadoop test cluster
(https://github.com/manics/pydoop-features/tree/self-contained).  The next
step is to trigger this from an OMERO.script.  Given that Pydoop is a
Python library, this looked like it'd be straightforward, though so far
I'm running into problems with conflicting OMERO and Hadoop CLASSPATHs.  At
present there's a fair amount of configuration required to get the code to
run; this will hopefully be simplified in future.
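
One possible way round the CLASSPATH conflict, which has not been tried
here, would be to launch the Hadoop side in a child process with its own
environment, so that the OMERO process keeps its CLASSPATH untouched.  A
minimal sketch, assuming Python 3.7+; the helper names and the example
path are hypothetical and are not part of OMERO or Pydoop:

```python
import os
import subprocess
import sys


def isolated_env(classpath, base_env=None):
    """Return a copy of the environment with CLASSPATH replaced.

    The parent's environment is never modified; only the copy is.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CLASSPATH"] = classpath
    return env


def run_isolated(cmd, classpath):
    """Run cmd in a child process that sees only the given CLASSPATH.

    Returns the child's standard output as text and raises
    subprocess.CalledProcessError if the child exits with an error.
    """
    result = subprocess.run(
        cmd,
        env=isolated_env(classpath),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in for the real job submission: the child just reports the
    # CLASSPATH it was given ("/hypothetical/hadoop/conf" is made up).
    child = [sys.executable, "-c", "import os; print(os.environ['CLASSPATH'])"]
    print(run_isolated(child, "/hypothetical/hadoop/conf"))
```

In an OMERO.script the child command would be whatever submits the Pydoop
job.  The cost is an extra process and passing parameters and results
across the process boundary.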

Simone has been working on getting OMERO to read/write to HDFS (Hadoop
Distributed File System) directly, by a combination of updating the
path.py (https://github.com/jaraco/path.py) library used by the Python
components of OMERO, and developing an HDFS implementation of the same
library (https://github.com/simleo/path.py/tree/hadoop) that OMERO could
use.  The Java side of things still requires substantial work.  Looking at
the bigger picture, this could form the start of a file system abstraction
layer for OMERO, which could allow other file access protocols to be
implemented.
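
To illustrate what such an abstraction layer might look like, here is a
minimal sketch.  It is not the path.py approach described above (which
provides an HDFS implementation of the path.py library itself); the
FileStore and LocalFileStore classes and the copy_text helper are
hypothetical names invented for this example.  The idea is that calling
code depends only on the interface, so an HDFS-backed class could be
swapped in later:

```python
import abc
import os
import tempfile


class FileStore(abc.ABC):
    """Minimal interface that storage back ends would implement."""

    @abc.abstractmethod
    def open(self, path, mode="r"):
        """Return a file-like object for path."""

    @abc.abstractmethod
    def exists(self, path):
        """Return True if path exists in this store."""

    @abc.abstractmethod
    def listdir(self, path=""):
        """Return the sorted names of the entries under path."""


class LocalFileStore(FileStore):
    """Back end for the local file system, rooted at a directory."""

    def __init__(self, root):
        self.root = root

    def _full(self, path):
        return os.path.join(self.root, path)

    def open(self, path, mode="r"):
        return open(self._full(path), mode)

    def exists(self, path):
        return os.path.exists(self._full(path))

    def listdir(self, path=""):
        return sorted(os.listdir(self._full(path)))


def copy_text(store, src, dst):
    """Copy a text file using only the FileStore interface."""
    with store.open(src) as fin, store.open(dst, "w") as fout:
        fout.write(fin.read())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = LocalFileStore(tmp)
        with store.open("a.txt", "w") as f:
            f.write("pixels")
        copy_text(store, "a.txt", "b.txt")
        print(store.listdir())
```

An HDFS back end would subclass FileStore in the same way and implement
the three methods on top of an HDFS client; copy_text would not need to
change.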

This is still very early work, but as always, if you're interested or have
done something similar, get in touch on this list, via the forums, or on
the wiki (which should be editable by all GitHub users):
https://github.com/simleo/pydoop-features/wiki

Best wishes

Simon


The University of Dundee is a registered Scottish Charity, No: SC015096

