[ome-devel] OMERO.features: Development of a new API for storing image features

Lee Kamentsky leek at broadinstitute.org
Tue Aug 27 20:15:41 BST 2013


Hi Ivan


On Tue, Aug 27, 2013 at 2:26 PM, Ivan E. Cao-Berg
<icaoberg at andrew.cmu.edu> wrote:

> hi! everything sounds great. i have a couple of questions and comments.
>
> > 2. ROI Preprocessing options must be an essential part of the feature
> > storage framework, since a single ROI with different preprocessing can
> > result in different feature values.
>
> totally agree. then the question i ask is, how will this affect the
> community as a whole? i haven't used cellprofiler, i have only used
> knime. can we guarantee numerical accuracy across any system given a
> particular version of the software? (not rhetorical, i have no idea).
>
For CellProfiler, absolute reproducibility is a goal, but we don't specify
our dependencies precisely enough to guarantee this across platforms.
Pragmatically, we have taken some care to make calculations reproducible
(seeds for pseudo-random numbers, etc) and the results should be comparable
across platforms. Someday, we'll reach that goal...
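
To illustrate the seeding point, here is a minimal sketch, not CellProfiler's
actual code. The function name `sample_pixels` and the seed value are
hypothetical. It only shows that a privately seeded generator gives the same
pseudo-random choices on every run with the same Python version.

```python
import random


def sample_pixels(pixel_values, n_samples, seed):
    """Pick a pseudo-random subset of pixels in a repeatable way.

    A private random.Random instance is used so that other code touching
    the global generator cannot change the result.
    """
    rng = random.Random(seed)
    return rng.sample(pixel_values, n_samples)


pixels = list(range(100))
first_run = sample_pixels(pixels, 5, seed=2013)
second_run = sample_pixels(pixels, 5, seed=2013)

# Same seed and same input give the same subset.
assert first_run == second_run
```

The seed would need to be stored with the features for someone else to repeat
the calculation.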

> how easy will it be to exchange that information between systems?
>
I think that there is too much leeway in the implementation of algorithms
to have the feature output of one software package exactly match that of
another. Perhaps at some point there will be an exact ontology of features
that can guarantee the same result from different systems, but I am
guessing it's too early for that effort.

> do we want to share a feature table/vector, or do we want to share a
> process that can run on their own system?
>

In the context of CellProfiler, my plan is for CellProfiler to access the
OMERO server from a client process or from multiple client processes run on
a cluster. CellProfiler would upload features to OMERO in this scenario. I
think it would be pretty cool if an OMERO client could request a
CellProfiler analysis on an OMERO server, but that's not currently on our
schedule.

>
> even though these issues may seem unimportant, i think they matter. if
> people are going to be publishing data online along with their research
> articles, reproducibility of certain calculations, say feature values, is
> very important. how can we guarantee people can reproduce those results?
>
>
> other things i would like to point out
> 1) in terms of feature calculation it is essential that we keep track of
> the resolution at which features were calculated
>
CellProfiler also needs some mechanism for annotating a feature vector with
the parameterization that's needed to reproduce the analysis.
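
As a rough sketch of what such an annotation could look like (this is not an
OMERO or CellProfiler API; the class `AnnotatedFeatures`, its fields, and the
parameter names are all hypothetical), the feature values can travel together
with the settings that produced them, including the pixel resolution:

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class AnnotatedFeatures:
    """A feature vector plus the settings needed to reproduce it."""
    image_id: int
    names: list
    values: list
    parameters: dict = field(default_factory=dict)

    def fingerprint(self):
        """Return a stable hash of the parameters.

        Keys are sorted so that the same settings always hash to the same
        string regardless of the order in which they were supplied.
        """
        canonical = json.dumps(self.parameters, sort_keys=True,
                               separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical parameter names, for illustration only.
a = AnnotatedFeatures(1, ["area"], [412.0],
                      {"pixel_size_um": 0.65, "software": "example 1.0"})
b = AnnotatedFeatures(1, ["area"], [412.0],
                      {"software": "example 1.0", "pixel_size_um": 0.65})
c = AnnotatedFeatures(1, ["area"], [103.0],
                      {"pixel_size_um": 1.30, "software": "example 1.0"})

assert a.fingerprint() == b.fingerprint()  # same settings, different order
assert a.fingerprint() != c.fingerprint()  # different resolution
```

Two feature vectors would then be directly comparable only when their
fingerprints match.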


> 2) we should have a clear method that just links features to a database.
> some people will not want to recalculate features if they have already
> done it. some feature sets are computationally expensive
>
A CellProfiler analysis of a field of view typically takes on the order of
a minute to compute. I think recalculation on the fly isn't a good option
for us.
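
A minimal sketch of the "store once, look up later" idea, assuming features are
keyed by an image identifier and a settings label. The class `FeatureStore` and
the stand-in function `slow_features` are hypothetical; a real store would sit
on the OMERO server, not in an in-memory dictionary.

```python
class FeatureStore:
    """In-memory stand-in for a server-side feature table."""

    def __init__(self):
        self._rows = {}
        self.computations = 0  # counts how often features were calculated

    def get_or_compute(self, image_id, settings_key, compute):
        """Return stored features, calculating them only on a miss."""
        key = (image_id, settings_key)
        if key not in self._rows:
            self._rows[key] = compute(image_id)
            self.computations += 1
        return self._rows[key]


def slow_features(image_id):
    # Placeholder for an analysis that takes on the order of a minute.
    return [float(image_id), 0.5]


store = FeatureStore()
first = store.get_or_compute(7, "settings-v1", slow_features)
second = store.get_or_compute(7, "settings-v1", slow_features)

# The second request is served from the store without recalculation.
assert first == second
assert store.computations == 1
```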

--Lee