[ome-devel] OMERO.features: Development of a new API for storing image features

Jason Swedlow j.r.swedlow at dundee.ac.uk
Fri Jul 19 22:35:41 BST 2013


Hi All

A quick plea not to drop this thread. Input from Ivan and Chris C. would be most welcome. These applications are very important, and getting this API nailed down, at least as a first draft, would be hugely helpful.

Cheers,

Jason


Jason Swedlow, PhD, FRSE
Centre for Gene Regulation & Expression
Open Microscopy Environment
University of Dundee
http://openmicroscopy.org




Lee Kamentsky <leek at broadinstitute.org> wrote:

Hi all,
I think it's great that Bob Murphy's group has implemented pyslic and pyslid in an open-source framework like OMERO. It looks like a substantial body of work. I'm wondering, however, what needs to be done to make it a general-purpose framework, especially from the perspective of our group's experience with CellProfiler. Also, Simon, thanks for moving this forward.

My reading of the pyslic code is that it supports a nuclear stain and a protein stain and calculates a standard set of per-image and per-object features (although I haven't quite figured out the storage mechanism for the object features). This is adequate for a large class of experiments involving two-color fluorescently labeled samples, and it's likely the methods are robust. Our experience, however, has been that experimental protocols can be more varied (multiple protein stains, brightfield images) and that the biological questions can require additional image preprocessing to highlight the structures of interest, often with tuning parameters specific to the scale of the structure. Because of this, I think the framework needs a modular architecture that supports development of new algorithms by computational researchers and configuration by end users, and it needs to extend beyond a curated code base to allow for innovation. Personally, I'm really pleased that the framework is in Python because it aligns well with our group, but this may be limiting for the ImageJ community; perhaps some portion of CellProfiler's bridge between Python and ImageJ could be adapted to supply the connection.

I think that we do need a platform for innovation and the keys to that are interoperability, standards, and a model of the analysis that is flexible enough to describe our community's experiments and that captures the analysis protocol in a reproducible manner. I'm going to outline my perspective on the model here, drawing on our group's experience with CellProfiler, and try to keep it brief. I see the components of the model being:

* Fields of view - N-dimensional spaces (X, Y, T, Z, spectral) representing an imaging site
* Images - acquired image data on a field of view (with acquisition metadata), or similar data produced by algorithms such as filters or morphological operations
* Segmentations - defining multiple regions of interest on the fields of view or on (hyper)planes of the fields of view
* Relationships between segmented regions - links between segmented regions either within segmentations or across them. Examples might be time-lapse cell tracking, associations between nuclear and cellular segmentations or groupings of organelle segmentations within a cell.
* Measurements - data computed on the images, segmentations and relationships within a field of view. My take on this is that a measurement produces a numeric feature value per image or per segmentation region, but perhaps that's too narrow.
* Protocol - a description of how to perform the analysis. I think the key elements are a link to the OMERO screen and a list of the parameterized algorithms to be performed. The screen provides the image inputs to the algorithms, namely the available image acquisition channels. The algorithms themselves produce images, segmentations, relationships and measurements, which can serve as inputs to other algorithms in the protocol. Algorithms will often be parameterizable by the user, and these parameters should be captured by the protocol. Ideally, the protocol should also capture the versions of the algorithms using a mechanism such as a Git commit hash. Algorithms might also take stacks of images as inputs, and these stacks might span individual fields of view: in CellProfiler, for instance, we have algorithms that produce an aggregated image based on samples from many fields of view, such as an estimate of the differences in signal magnitude across the field of view caused by non-uniform illumination.
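
As a rough illustration of the components listed above, here is a minimal Python sketch using plain dataclasses. Every class and field name is hypothetical and chosen only for this example; none of them are OMERO, pyslic/pyslid or CellProfiler APIs, and the version string is a placeholder rather than a real commit.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FieldOfView:
    """An N-dimensional imaging site, e.g. axes X, Y, Z, T and spectral."""
    fov_id: str
    axes: Tuple[str, ...]
    shape: Tuple[int, ...]

    def __post_init__(self):
        # Each named axis needs exactly one extent.
        if len(self.axes) != len(self.shape):
            raise ValueError("axes and shape must have the same length")


@dataclass(frozen=True)
class Image:
    """Pixel data on a field of view, acquired or produced by an algorithm."""
    name: str
    fov_id: str
    produced_by: Optional[str] = None  # None means acquired, not computed


@dataclass(frozen=True)
class Segmentation:
    """A set of regions of interest on a field of view."""
    name: str
    fov_id: str
    region_ids: Tuple[int, ...]


@dataclass(frozen=True)
class Relationship:
    """A link between two regions, within or across segmentations."""
    kind: str                 # e.g. "parent" or "tracked_to"
    source: Tuple[str, int]   # (segmentation name, region id)
    target: Tuple[str, int]


@dataclass(frozen=True)
class Measurement:
    """One numeric feature value, per image or per segmented region."""
    feature: str
    value: float
    fov_id: str
    segmentation: Optional[str] = None
    region_id: Optional[int] = None  # None means a per-image measurement


@dataclass
class AlgorithmStep:
    """One parameterized algorithm in a protocol, pinned to a version."""
    name: str
    version: str                       # e.g. a Git commit hash
    inputs: List[str]
    outputs: List[str]
    parameters: Dict[str, object] = field(default_factory=dict)


@dataclass
class Protocol:
    """A link to a screen plus the ordered list of algorithm steps."""
    screen_id: int
    steps: List[AlgorithmStep]


# Hypothetical example data.
fov = FieldOfView("site-1", ("X", "Y", "C"), (512, 512, 2))
nuclei = Segmentation("Nuclei", fov.fov_id, (1, 2))
cells = Segmentation("Cells", fov.fov_id, (1, 2))
link = Relationship("parent", ("Cells", 1), ("Nuclei", 1))
area = Measurement("Area", 231.0, fov.fov_id, "Nuclei", 1)
protocol = Protocol(
    screen_id=1,
    steps=[AlgorithmStep("IdentifyNuclei", "placeholder-hash", ["DNA"],
                         ["Nuclei"], {"method": "otsu"})],
)
```

Keeping `region_id` optional is one way to let the same record type hold both per-image and per-region values; whether that is too narrow is exactly the open question raised in the Measurements item.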

As far as the actual mechanics go, I see OMERO or similar using the protocol as a dependency graph: fetching the algorithms using some community-standard mechanism (Maven? pip?), providing inputs as specified by the protocol, and harvesting the outputs for the database and for dependent algorithms. I have some detailed concerns about algorithm input/output introspection and discovery, but ImageJ 2.0's plugin introspection protocol (@Parameter) is a good starting point (thanks ImageJ 2.0).
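
To make the dependency-graph idea concrete, here is a minimal sketch that derives an execution order from each step's declared inputs and outputs. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The function name `execution_order`, the step names and the channel name are assumptions made for this example, not part of any existing API, and fetching or running the algorithms is left out entirely.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(steps, acquired):
    """Return step names in an order that satisfies their data dependencies.

    steps:    dict mapping step name -> (input names, output names)
    acquired: set of names provided directly by the screen (channels)
    """
    # Map every produced name to the step that produces it.
    producer = {}
    for name, (_inputs, outputs) in steps.items():
        for out in outputs:
            if out in producer or out in acquired:
                raise ValueError(f"{out!r} is provided more than once")
            producer[out] = name

    # Each step depends on the steps that produce its non-acquired inputs.
    graph = {}
    for name, (inputs, _outputs) in steps.items():
        deps = set()
        for inp in inputs:
            if inp in acquired:
                continue
            if inp not in producer:
                raise KeyError(f"nothing provides input {inp!r}")
            deps.add(producer[inp])
        graph[name] = deps

    # static_order() raises graphlib.CycleError if the protocol is circular.
    return list(TopologicalSorter(graph).static_order())


# Hypothetical protocol, deliberately listed out of order.
steps = {
    "MeasureArea": (["Nuclei"], ["NuclearArea"]),
    "IdentifyNuclei": (["CorrectedDNA"], ["Nuclei"]),
    "CorrectIllumination": (["DNA"], ["CorrectedDNA"]),
}
order = execution_order(steps, acquired={"DNA"})
print(order)  # ['CorrectIllumination', 'IdentifyNuclei', 'MeasureArea']
```

A server could walk this order, hand each step its inputs, and store the outputs both in the database and for the steps that depend on them.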

OK - somewhat CellProfiler-centric perhaps, but the nice thing about OMERO is that it is built on a relational database and the protocol is the thing itself: not a description of the experiment, but a mineable map of how each number is produced, especially if the protocol pieces are described relationally in the database. I think the above is an ambitious undertaking, but look at the result! Researchers can trade protocols which produce robust and comparable values (not just "nuclear area", but the nuclear area after illumination correction and segmentation using Otsu thresholding and a seeded watershed of HeLa cells stained with DAPI). Developers can publish their method in OMERO, and possibly OMERO itself can generate citations based on a protocol, leading to better credit for our work. And OMERO itself becomes a sustainable platform for analysis with a well-defined interoperable API for image processing.
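
To show what "described relationally" could look like, here is a small sketch using an in-memory SQLite database and a recursive query that recovers how a measurement was produced. The table names, column names, step names and version strings are all invented for this example; this is not the OMERO schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE step (
    id         INTEGER PRIMARY KEY,
    algorithm  TEXT NOT NULL,
    version    TEXT NOT NULL,   -- placeholder strings, not real commits
    parameters TEXT NOT NULL
);
CREATE TABLE step_input (
    step_id     INTEGER NOT NULL REFERENCES step(id),
    upstream_id INTEGER NOT NULL REFERENCES step(id)
);
INSERT INTO step VALUES
    (1, 'IlluminationCorrection', 'placeholder-1', 'channel=DAPI'),
    (2, 'OtsuThreshold',          'placeholder-2', 'channel=DAPI'),
    (3, 'SeededWatershed',        'placeholder-3', 'seeds=threshold'),
    (4, 'MeasureArea',            'placeholder-4', 'objects=nuclei');
INSERT INTO step_input VALUES (2, 1), (3, 2), (4, 3);
""")


def lineage(conn, step_id):
    """List the algorithms behind a step, earliest upstream step first."""
    rows = conn.execute(
        """
        WITH RECURSIVE upstream(id, depth) AS (
            SELECT ?, 0
            UNION
            SELECT si.upstream_id, u.depth + 1
            FROM step_input AS si
            JOIN upstream AS u ON si.step_id = u.id
        )
        SELECT s.algorithm
        FROM upstream AS u
        JOIN step AS s ON s.id = u.id
        GROUP BY s.id
        ORDER BY MAX(u.depth) DESC, s.id
        """,
        (step_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(lineage(conn, 4))
# ['IlluminationCorrection', 'OtsuThreshold', 'SeededWatershed', 'MeasureArea']
```

With the steps stored this way, "nuclear area" is no longer a bare label: the query returns the full chain of algorithms that produced it, and the same tables could be mined across protocols.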

Hope this all gives things a positive lift, thx for reading this far,
--Lee

On Fri, Jul 5, 2013 at 10:03 AM, Simon Li <s.p.li at dundee.ac.uk> wrote:
Hi everyone

It was great to see so many people interested in OMERO.searcher and WND-CHRM at the Paris meeting, both those who were interested in installing it on their own systems and also those of you who were interested in developing other analysis algorithms for use with OMERO.

One of the main points that came up was that OMERO should provide a single API for storing and calculating image features. Robert Murphy's group at CMU have already developed PySLID [http://github.com/icaoberg/pyslid], a Python module for calculating and storing features used with OMERO.searcher, so I'd like to propose we bring this into the openmicroscopy GitHub organisation and rename it to OMERO.features (other suggestions are welcome).
Then there's the much bigger task of modifying the module to cater for everyone's requirements. I can see several potential issues, including how we handle multiple channels, z-slices, timepoints, ROIs, etc., since features can be calculated for these individually or as a whole.
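
One possible way to handle "individually or as a whole" is to attach a scope to every feature value, where an unset dimension means the feature was computed across all of that dimension. The sketch below is only an illustration of that idea: `FeatureScope` and `mean_intensity` are hypothetical names, not part of PySLID or OMERO, and the pixel data is a toy dictionary rather than real image planes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FeatureScope:
    """What a feature value covers; None means 'all of that dimension'."""
    image_id: int
    channel: Optional[int] = None
    z: Optional[int] = None
    t: Optional[int] = None


def mean_intensity(planes: Dict[Tuple[int, int, int], List[float]],
                   scope: FeatureScope) -> float:
    """Mean pixel value over every (channel, z, t) plane inside the scope."""
    selected = [
        px
        for (c, z, t), pixels in planes.items()
        if scope.channel in (None, c)
        and scope.z in (None, z)
        and scope.t in (None, t)
        for px in pixels
    ]
    if not selected:
        raise ValueError("scope does not match any plane")
    return sum(selected) / len(selected)


# Toy image 1: planes keyed by (channel, z, t).
planes = {
    (0, 0, 0): [1.0, 3.0],
    (0, 1, 0): [5.0, 7.0],
    (1, 0, 0): [10.0, 20.0],
}

whole_channel = mean_intensity(planes, FeatureScope(image_id=1, channel=0))
single_plane = mean_intensity(planes, FeatureScope(image_id=1, channel=0, z=1))
print(whole_channel, single_plane)  # 4.0 6.0
```

Because the scope is hashable, it could also serve as part of the key under which a feature value is stored, so per-plane and whole-image values for the same feature can coexist.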

If anyone has any thoughts or comments on what they'd like to see, it'd be great if you could share them with the rest of this list or, if you prefer, on our forums.

Best wishes

Simon



The University of Dundee is a registered Scottish Charity, No: SC015096

_______________________________________________
ome-devel mailing list
ome-devel at lists.openmicroscopy.org.uk
http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel

