[ome-devel] Analysis Chain Methodology
Ilya Goldberg
igg at nih.gov
Thu Mar 8 18:19:45 GMT 2007
Hi Brian,
There is a "philosophical" reason for it. Generally an experiment
consists of an experimental sample and one or more controls which are
treated the same as the experimental sample. So philosophically, in
a real experiment, one would never do it with just a single sample or
image.
Practically, one very often wants to do analysis on a set of images
rather than just one (primarily for the reason above), so by building a
system that deals only with sets of images, you get a system that
handles a single image (a list of one) for free.
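To make the "list of one" point concrete, here is a minimal sketch in Python. It is not OME code: `run_chain`, `segment`, and `measure` are hypothetical names, and an "image" is reduced to a file name.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-ins: an image is just a name, a module is any callable on it.
Image = str
Module = Callable[[Image], str]

def run_chain(modules: List[Module], images: Iterable[Image]) -> Dict[Image, List[str]]:
    """Apply every module, in order, to every image in the set."""
    return {image: [module(image) for module in modules] for image in images}

def segment(image: Image) -> str:
    return f"segmented {image}"

def measure(image: Image) -> str:
    return f"measured {image}"

# The same entry point serves a sample-plus-control set and a single image.
batch = run_chain([segment, measure], ["control.tif", "sample.tif"])
single = run_chain([segment, measure], ["sample.tif"])
```

The single-image call needs no special code path; it is simply a set with one member.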
I'm not being smart-alecky, just trying to explain the rationale
behind the design (or at least point out there was one). I don't
know if I would be out of line in saying that for an experimentalist,
doing processing on a single image could be considered an "edge case".
Back to philosophy. I think where we went wrong (possibly) is that
we've overloaded the concept of "Dataset". One meaning of Dataset is
a user-land named collection of images - i.e. for the user's
organizational purposes. Another meaning is a collection of
images that were processed in exactly the same way. Often you
want a single container for both - often enough that we decided to
make it one container. There are obviously exceptions, and we've
used "hacks" or tricks to get around them. One very visible hack is
that when you import images (even just one), you are forced to either
add them to a Dataset or create a new one. This is because importing
images results in executing an "Import Chain", and a chain execution
needs a Dataset as its target. Having the user put
the imported images into some sort of organizational hierarchy rather
than just have them float around seemed like a good thing (it's a
feature, we cried, not a bug!).
Alright, so with that out of the way, what's to be done? Code-wise,
it could significantly complicate things (but see below) if the AE
(Analysis Engine) had to explicitly deal with images and/or datasets
in its internals.
One option is to redesign the AE to address the idea of iterators
in a more general way - it should in principle be able to iterate
over any object (for example, an Image). Currently, it iterates
only over "Datasets". Datasets "contain" images, and most modules
operate at image granularity. In general, it would iterate over any
container for the objects that modules operate on. This would be a good
master's project, possibly even a PhD project. There's some cool
formalism here dealing with computational work-flows, graph theory,
and all manner of other juicy tidbits to make the hapless student's
brain explode.
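A toy version of that generalized iteration might look like the following. This is only a sketch under assumptions: the `Image`, `Dataset`, and `Project` classes and the `iterate` function are hypothetical and far simpler than anything the real engine would need.

```python
from dataclasses import dataclass, field
from typing import Iterator, List

@dataclass
class Image:
    name: str

@dataclass
class Dataset:
    name: str
    images: List[Image] = field(default_factory=list)

@dataclass
class Project:  # hypothetical outer container, used here only for illustration
    name: str
    datasets: List[Dataset] = field(default_factory=list)

def iterate(target, granularity: type) -> Iterator:
    """Yield every object of the requested type found at or below target."""
    if isinstance(target, granularity):
        yield target
    elif isinstance(target, Project):
        for dataset in target.datasets:
            yield from iterate(dataset, granularity)
    elif isinstance(target, Dataset):
        for image in target.images:
            yield from iterate(image, granularity)
    # Anything else contains nothing of the requested type and yields nothing.

project = Project("p", [Dataset("d1", [Image("a"), Image("b")]),
                        Dataset("d2", [Image("c")])])
image_names = [img.name for img in iterate(project, Image)]
lone_image = list(iterate(Image("solo"), Image))
```

An image-granularity module would ask for `Image` objects regardless of whether the chain was pointed at a project, a dataset, or a single image.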
Another option is to put something at the very outer layer of the AE
that will accept an image as the target of a chain execution, and
implicitly make a dataset for it. It could even be a "special"
dataset so that it doesn't appear in the UI. Though, if you looked
at the target of the chain execution, it would still be a Dataset
(containing a single image). An ugly hack to be sure, but possibly a
practical solution if this is truly an "edge case" as I've claimed.
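The wrapper idea can be sketched in a few lines of Python. All names here (`execute_chain`, `execute_chain_on`, the `hidden` flag, `visible_datasets`) are assumptions made for illustration, not existing OME calls.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Image:
    name: str

@dataclass
class Dataset:
    name: str
    images: List[Image] = field(default_factory=list)
    hidden: bool = False  # hypothetical flag marking "special" datasets

def execute_chain(chain_name: str, dataset: Dataset) -> dict:
    """Stand-in for the engine: it only ever sees a Dataset."""
    return {"chain": chain_name, "target": dataset,
            "images": [img.name for img in dataset.images]}

def execute_chain_on(chain_name: str, target: Union[Image, Dataset]) -> dict:
    """Outer layer: wrap a lone image in an implicit, hidden dataset."""
    if isinstance(target, Image):
        target = Dataset(f"__single__{target.name}", [target], hidden=True)
    return execute_chain(chain_name, target)

def visible_datasets(datasets: List[Dataset]) -> List[Dataset]:
    """What a UI would list: everything except the implicit datasets."""
    return [d for d in datasets if not d.hidden]

record = execute_chain_on("Find Spots", Image("x.tif"))
```

Note that `record["target"]` is still a Dataset, which is exactly the wart described above.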
There could be a third option, which is more directly what you're
asking for. The linkage between datasets and chain executions is in
only one place - in the AnalysisChainExecution object. Everything
else about the chain execution (the NodeExecution, for example)
refers to module executions. Just like ModuleExecutions,
ChainExecutions could have a more general "target" instead of a
"dataset". This would take care of recording what was done, at least
in the data model. As for the support code, we could re-use the
ModuleExecution pattern for dealing with what is essentially a
run-time-typed reference to the target. How much code that would
involve, I don't rightly know without looking carefully, but I
suspect it's not a trivial amount. My hunch is that it is doable,
though. The advantage of this is that it doesn't go quite as far as
option 1, and would not be a blatant hack like option 2.
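As a rough illustration of a run-time-typed target, consider the Python sketch below. The field names `target_type` and `target_id`, the in-memory `STORE`, and `images_for` are all hypothetical; the actual AnalysisChainExecution schema and the ModuleExecution support code are not reproduced here.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class ChainExecution:
    chain: str
    target_type: str  # assumed values: "Dataset" or "Image"
    target_id: int

# Hypothetical in-memory store keyed by (type, id), standing in for the database.
STORE = {
    ("Dataset", 1): ["a.tif", "b.tif"],
    ("Image", 7): "c.tif",
}

def images_for(execution: ChainExecution) -> List[str]:
    """Resolve the typed reference at run time and return the images to process."""
    obj = STORE[(execution.target_type, execution.target_id)]
    if execution.target_type == "Dataset":
        return list(obj)
    if execution.target_type == "Image":
        return [obj]
    raise ValueError(f"unsupported target type: {execution.target_type}")

on_dataset = images_for(ChainExecution("Find Spots", "Dataset", 1))
on_image = images_for(ChainExecution("Find Spots", "Image", 7))
```

Here the record itself says whether the chain ran on a dataset or an image, so no dummy dataset has to exist.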
-Ilya
On Mar 5, 2007, at 8:17 PM, Brian Ruttenberg wrote:
> I have a question about the Analysis Chain methodology employed by
> OME.
>
> Unless I am missing something, it seems that Analysis Chains can only be
> run on data sets. This is a major stumbling block for us. We really
> want to run chains on single images, since our data sets are large and
> the inputs to the chain are variable (it may not be appropriate to have
> the same value for every image in the dataset). Right now I am doing
> some ugly tricks to get a chain to run on a single image - which
> basically boils down to creating a dummy dataset with one image in it.
> Obviously, this is less than ideal.
>
> I'm just curious what the reason was for not having per image chain
> support? And, would anyone know how difficult or time intensive it
> would be to modify OME to get it to work on single images?
>
> Thanks!
>
> Brian