[ome-devel] A few questions

Ilya Goldberg igg at nih.gov
Tue Jul 13 03:39:13 BST 2004


On Jul 12, 2004, at 4:38 PM, Zachary Pincus wrote:

> Ilya,
>
> Thanks for the detailed replies. After a couple more reads through the 
> OME documentation, I think I'll have it all mostly figured out.
>
> Now, to try your patience further, I've got a bit of a question about 
> the feature/attribute system. Given your comments, it seems that 
> features are unstructured "bags" to which feature-granularity semantic 
> types can be assigned as attributes. Is this correct?
That's exactly right.  We prefer "containers", but "bags" works too.

> Also, when an image is exported as an OME XML file, are the 
> feature-level ST instances physically contained within the appropriate 
> "feature" entry (as feature-level "CustomAttributes")? I would check 
> this myself, but I can't seem to get image features exported with the 
> rest of the attributes (not sure if this is a bug or a problem on my 
> end).
That's how it's supposed to work.  I know it worked at some point, but 
(sadly) it's not part of our standard test suite, so it may have broken.
One thing to keep in mind is that there's no clear UI right now to 
select what gets exported.  It's quite possible to trace things deeply 
enough and export an entire database into an OME document.  Josiah is 
working on a "shopping cart" approach to selecting what you want to 
export.  Presently, the system only exports what was imported from the 
original file.  There's a command-line program in the distribution 
called ExportSession.pl which gives you slightly more control over what 
gets exported.  The API to the XML export needs only a list of objects 
and it takes it from there.  BTW, this little program answers your 
question of "How do I get all existing attributes of container X?".

> Speaking of these custom attributes -- is there a proper way to add 
> attributes to a set of pixels? Say, for example, that an analysis 
> requires a binarized image: is there any way to annotate certain pixel 
> sets as having a "binary" attribute, and then declare that the module 
> will only run on such pixels? That is, there's a lot of type-checking 
> on the other inputs to analysis modules; is there any way to do this 
> for pixels?
Great, someone was eventually going to bring up inheritance, but I 
thought it would take a while longer.
Sigh.
STs as presently implemented don't do inheritance.  That means that if 
you wanted to create a more restricted form of Pixels, you would have 
to create a new ST (called Mask?).  The presently defined Pixels are 
quite flexible (1, 2, or 4 bytes, signed or unsigned integers or 4-byte 
floats), but there is no 'binary' designation.  So Mask and Pixels can 
actually be distinct.  Whew, this wasn't about inheritance after all.
There is some handler code that allows modules to specify pixel 
bit-depth.  It's a bit of a kluge and may not even be working presently, 
but eventually issues like these will be dealt with.  The handlers are 
going through a bit of flux these days (in the CVS main branch) - 
especially the Matlab handler, which is going to be our group's 
workhorse.
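
Back to the Mask idea for a second: defining a new ST is just a small
chunk of XML, which you can drop into the same OME document you import.
A rough sketch - the element names, DBLocations and DataTypes below are
illustrative placeholders, so check the STD schema docs for the real
syntax:

  <SemanticTypeDefinitions>
    <!-- AppliesTo="I" makes this an image-granularity type -->
    <SemanticType Name="Mask" AppliesTo="I">
      <Description>Binary (0/1) pixels derived from another Pixels set.</Description>
      <!-- which Pixels this mask was computed from -->
      <Element Name="Parent" DBLocation="MASK.PARENT" DataType="reference" RefersTo="Pixels"/>
      <Element Name="SizeX"  DBLocation="MASK.SIZE_X" DataType="integer"/>
      <Element Name="SizeY"  DBLocation="MASK.SIZE_Y" DataType="integer"/>
    </SemanticType>
  </SemanticTypeDefinitions>

A module that declares a formal input of type Mask would then only ever
be handed attributes of that type, which gets you the type-checking you
were after.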

> Finally, on to a separate issue: for parts of my project, I might be 
> using the Remote framework to "check out" images and then perform 
> analysis on them. (E.g. for interactive processing, or for distributed 
> computing.) Is there any way to do this sort of work within the 
> context of the "Module Execution" paradigm? That is, to record a trace 
> of the various parameters, inputs, and outputs? I guess one could 
> create stub modules and executions and fill in the appropriate inputs 
> and outputs; is there a specifically "approved" way to do this? I've 
> read a little about this in the docs, but not enough for a complete 
> picture.
Right.  This has come up several times before.  There's no API for this 
right now, but you can do this by formulating an XML document 
describing what you've done.  Look at the DataHistory schema.  You have 
the right idea.  Basically this document would declare a bunch of no-op 
modules, the chain they were executed in, the formal inputs/outputs 
declared for them, and the actual inputs/outputs as CustomAttributes.  You can even 
continue this chain and use the outputs for subsequent back-end 
modules, so the data would be fully integrated.  It might take a little 
massaging to record the fact that it's not really a no-op module and that 
it was executed on a stand-alone client.  That way your specific 
outputs can be piped downstream, but the back-end wouldn't try to 
execute these modules on any new inputs.
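
Very roughly, the document you'd feed to the importer would contain
something along these lines.  I'm writing the element names from
memory, so treat them as placeholders and take the real ones from the
DataHistory and AnalysisModuleLibrary schemas:

  <AnalysisModule ModuleName="My stand-alone segmenter" Category="Segmentation">
    <!-- a "no-op" declaration: the back-end never executes this module,
         it just records what the stand-alone client consumed/produced -->
    <Declaration>
      <FormalInput  Name="Pixels"    SemanticTypeName="Pixels"/>
      <FormalOutput Name="Locations" SemanticTypeName="Location"/>
    </Declaration>
  </AnalysisModule>

plus a module execution (per the DataHistory schema) that ties the
declared inputs/outputs to the actual attribute instances you write out
under CustomAttributes elsewhere in the document.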

An API for stand-alone analysis apps is something that's definitely on 
the plate, but our resources are tapped out right now.  Our top 
priority is getting the back-end analysis in shape and running through 
some "screens".
Distributed computing is something we've all been thinking about as 
well (and toying with in our spare time), but I think these two things 
are fundamentally different.  Interactive processing implies a 
stand-alone application talking to OME and using it as a data store.  
Basically, an incremental improvement on the prevailing paradigm.  
Distributed computing essentially follows the existing OME model, using 
the back-end not only as a data store, but as an image processor as well.  
With distributed processing, end-user clients would still fire off an 
analysis to be done "elsewhere".  If all goes well, existing modules, 
clients, etc. wouldn't even need to know that the request went out to a 
"grid" instead of a lowly PC sitting in a closet.  That's the promise 
anyway.  Personally, I would much rather see us head towards 
distributed rather than stand-alone.  It's the new new thing.

-Ilya

>
>
> Thanks again,
>
> Zach
>
>
>
> On Jul 8, 2004, at 12:23 PM, Ilya Goldberg wrote:
>
>> Hi Zach
>>
>> On Jul 6, 2004, at 4:19 PM, Zachary Pincus wrote:
>>
>>> Hello,
>>>
>>> I've got a few general questions about OME, to help me get my 
>>> footing. (Sorry to spam the developer list if this isn't really 
>>> appropriate traffic -- let me know.)
>> I think it is.  What does everyone else think?
>>
>>>
>>> Also, since I know so little about the guts of the system, some of 
>>> these might actually be feature requests, etc -- so let me know if 
>>> that's the case and I'll enter them into the bug-tracker.
>>>
>>> (1)  I quite like how each image can be associated with various 
>>> features. Is there any way to associate an image with features that 
>>> are themselves images? (e.g. the spots resulting from spot-finding 
>>> or some such.) Or is the appropriate way to deal with this just by 
>>> following an analysis chain backwards to find the "parent" image?
>> A Feature is a general container of attributes, so yes, one of the 
>> attributes can be a set of pixels.  To keep things sane though, you'd 
>> probably want a Feature's pixels to belong to only that feature and 
>> not be for the entire image.  An ROI (region of interest) is a good 
>> example of this.  It could refer to the image's pixels by coordinate, 
>> or it can be its own set of pixels containing a mask (or some other 
>> transformation/filter).
>>
>>>
>>> (2) At the moment, are there any tools for creating 5-D images from 
>>> multiple lower-D image files? Clearly this can be done with the 
>>> omeis API, but has this functionality been bundled up in a tool 
>>> anywhere?
>> The TIFF importer looks for a specific regular expression pattern 
>> that is commonly used by MetaMorph.  It's easy to change this pattern 
>> so it builds up 5-D images out of collections of 2-D TIFFs, but the 
>> Web UI does not provide an interactive tool to manipulate this 
>> pattern.  VisBio does though, and it knows how to talk to OME.  Look 
>> at:
>> http://www.loci.wisc.edu/visbio/alpha.html
>> Only the 3.0 version of VisBio (in alpha) supports OME import.
>> If you have some ideas about how to specify this using other UIs, 
>> please let us know!
>>
>>>
>>> (3) Is there a UI for editing image metadata? After importing 
>>> images, I'd love to add more metadata about the microscope, CCD, 
>>> illumination, etc (i.e. everything supported by the OME XML 
>>> interchange format) that are not present in the original image 
>>> files/accessible to the import engine. (Corollary question: if so, 
>>> can this be done in a batch-mode to multiple images at once?) Again, 
>>> clearly the API can handle this -- I'm just wondering if this has 
>>> been added/planned for the various UIs.
>> Sorry, nothing in the UIs yet.  The idea is to define the microscope 
>> once, then refer to its components by ID in the various STs that link 
>> acquisition parameters to images.
>>
>>>
>>> (4) How are units dealt with in OME? The schema specify things like 
>>> "PixelSizeX" or "Wavelength" but the units don't seem to be 
>>> specified. In general they are obvious for biological imaging 
>>> (microns, nanometers) -- is that enough?
>> The units are implied.  They are specified in the schema 
>> documentation - the two examples you gave have specified units.  If 
>> you find some that don't, please let us know.  Look here for a 
>> graphical display of the schema:
>> http://docs.openmicroscopy.org.uk/api/xml/OME/ome-image.html
>> Hold the mouse over the PixelSizeX attribute, and it displays a 
>> little note that it's in microns.  If you click on it and go to the 
>> schema reference, the units are specified in the Image description, 
>> but should be repeated in the attribute comments as well, I suppose.
>>
>>> (5) I'm a bit unclear as to how one goes about asking for "all 
>>> relevant information" about a feature from the database or the web 
>>> UI. Let me provide an example -- After running "find spots," 
>>> information about various features is generated and stored in 
>>> several database tables: Signals, Extent, Location, Threshold, and 
>>> Timepoint.
>>>
>>> I know this because the "find spots" module specifically declares 
>>> this. So if I want to find out the "Extent" associated with a given 
>>> feature, I need to look at the appropriate "Module Execution" record 
>>> in the web UI. Already something is strange -- why can't I look at 
>>> the "Feature" record and see this information?
>> I think the patch release will have a way to look this up on features 
>> the same way you can with images (see Bug #136).  Don't quote me on 
>> that though.  We're working on a more comprehensive data browser for 
>> a later release.  This release was essentially an image repository.  
>> All the bits to make a comprehensive image analysis system (and data 
>> browser) are being worked on.
>>
>>> Perhaps this is just a UI issue, but the problem seems deeper: 
>>> nowhere in the "Feature" record is any information about the tables 
>>> in which the attributes of the feature are stored. Nor does there 
>>> appear to be a backwards reference to the module execution that 
>>> generated that feature, where you could find that information.
>> Since a feature is just a container which has no implied attributes 
>> whatsoever, the feature is entirely defined by the attributes it has. 
>>  Each feature attribute table has a MEX (module execution) column, which tells you how 
>> it came about.
>>
>>> (And what if a feature wasn't added by a module execution, but 
>>> manually?)
>> There are two ways to deal with manual entry.  One is by assigning a 
>> MEX, in which case the manual entry acts just like a module output.  
>> The other way is by assigning a NULL MEX.  This makes the entry 
>> mutable - in the sense that the values can be changed (updated).  
>> Full-blown attributes with non-null MEXes are immutable.  Again, it's 
>> meaningless to create a feature by itself.  It only means anything if you 
>> create an attribute for it.  Once you make an attribute, you have a 
>> MEX, and you have data history/dependency and go from there (assuming 
>> it's not NULL).  Mutable attributes with NULL MEXes may be dropped in 
>> a future release.
>>
>>>
>>> So, given only a feature ID, how do you find all the attributes of 
>>> that feature? It appears to my uninformed eye that you would need to 
>>> trawl through the entire database and ask each table for all records 
>>> with feature ID = whatever. Is there any unified mechanism for doing 
>>> this (in SQL or the API)? Or am I missing something fundamental 
>>> here?
>> Nope, you're not missing anything fundamental.  You have to do the 
>> query in at least two parts, and you can't do it purely in SQL.  You 
>> have to get a list of tables that are used by semantic types of 
>> granularity 'F', then you have to query each of those tables for your 
>> feature ID.  The API provides facilities for doing this without 
>> resorting to SQL.
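
To flesh that last bit out: with the Perl API the two-part lookup goes
roughly like the sketch below.  I'm writing this from memory, so the
factory method names (findObjects, findAttributes) and the login call
are approximations - check OME::Factory before trusting them:

  use OME::SessionManager;

  # log in from the command line and get a factory
  # (method names approximate)
  my $session = OME::SessionManager->TTYlogin();
  my $factory = $session->Factory();

  my $feature = $factory->loadObject('OME::Feature', $ARGV[0]);

  # part 1: every semantic type with feature ('F') granularity
  my @featureSTs = $factory->findObjects('OME::SemanticType',
                                         granularity => 'F');

  # part 2: ask each of those types for attributes on our feature
  foreach my $st (@featureSTs) {
      foreach my $attr ($factory->findAttributes($st, feature => $feature)) {
          my $mex = $attr->module_execution();
          printf "%s #%d (MEX %s)\n", $st->name(), $attr->id(),
                 defined $mex ? $mex->id() : 'NULL';
      }
  }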
>>
>>> (6) Finally, I'm wondering in what areas the APIs are just thin 
>>> wrappers around the database, and in what areas there is a lot of 
>>> logic in the code itself. This question is just to get me oriented 
>>> to what the API does, and where...
>> Make sure you read the newbie guide:
>> http://docs.openmicroscopy.org.uk/newbie/
>>
>> The API follows a factory pattern, so most objects (other than the 
>> Session) are instantiated using the OME::Factory class 
>> (src/perl2/OME/Factory.pm).  All OME Objects that have DB 
>> counterparts inherit from DBObject.  Attributes inherit from 
>> SemanticType (which inherits from DBObject), but the vast majority of 
>> these are constructed at run-time, so they don't have package 
>> counterparts.  There is a lot of code in these three objects 
>> (especially DBObject and the classes it uses) that deals with 
>> translating the relational DB to the OO API.  There's also a bunch of 
>> logic in the AnalysisEngine.pm which drives module execution.  Other 
>> than the relational/OO translation and the AnalysisEngine, it's all a 
>> fairly skinny API on top of the DB.  There is a set of classes in 
>> the Tasks directory that have some application logic as well, but 
>> they're basically a bunch of utility methods and also fairly thin.  
>> Lastly, importers for various native file formats live in 
>> OME::ImportEngine.  Things in the OME::Web class hierarchy implement 
>> the HTML UI, and contain a fair amount of application logic.  
>> Overall, we've tried to minimize the amount of application logic 
>> because most of it derives directly from the data model, which is 
>> run-time extensible.
>>
>> Hope that helps
>> -Ilya
>>
>>
>>
>>>
>>> Thanks for any help at all,
>>>
>>> Zach Pincus
>>>
>>>
>>> Department of Biochemistry and Program in Biomedical Informatics
>>> Stanford University School of Medicine
>>>
>>> _______________________________________________
>>> ome-devel mailing list
>>> ome-devel at lists.openmicroscopy.org.uk
>>> http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel
>>>
>>
>


