[ome-devel] Database Issues: FK, UNIQUE, and such

Wed Jun 1 06:01:07 BST 2005

>>> III. One-to-One
>
>> We don't currently support explicit "has-one" relationships using
>> references.  So for now at least, an image can in fact have multiple
>> dimensions.
>
> Back to data integrity, does it make sense for an image to have multiple
> dimensions, and how do I deal with that programmatically?
>
>   if (dimensions.length > 1) throw new OopsException()
>
>>>  * Should the maps have a UNIQUE constraint on both fields.
>>
>> Explicit many-to-many relationships (maps) are not supported in STs.
>> This could in-fact be done transparently by saying that an ST
>> consisting of only references gets a UNIQUE constraint on all its
>> fields.
>
>
> Sounds awfully implementation-ly. What about whether they /should/ have
> them? And again, how do I take care of it programmatically if there's the
> possibility that there will be multiple maps (to get all implementation-ly
> myself)?

The relationship between an experimenter and group is done using an  
ExperimenterGroup. According to our rules of usage, an experimenter may  
belong to many groups, and a group may consist of many experimenters.  
If you put a UNIQUE constraint on both fields for that ST, that would  
break our usage. The programmatic approach would also mess with  
ImageInstrument, which consists of a reference to an instrument, an  
objective, and an image.
Dimensions and ImageExperiment could use some UNIQUE constraints on  
image though.
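
One way to read the distinction: a UNIQUE constraint on each field separately would stop an experimenter from joining a second group, while uniqueness on the (experimenter, group) pair would only reject duplicate rows. Here is a minimal Java sketch of the pair-wise check. The class PairUniqueMap and its methods are hypothetical, an in-memory stand-in rather than anything in the OME libraries.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for a map ST made of two references. */
final class PairUniqueMap {
    // Each entry is one (experimenterId, groupId) row of the map.
    private final Set<Map.Entry<Integer, Integer>> links = new HashSet<>();

    /** Adds a link; returns false if that exact pair is already present. */
    boolean link(int experimenterId, int groupId) {
        return links.add(
            new SimpleImmutableEntry<Integer, Integer>(experimenterId, groupId));
    }

    /** Number of distinct links stored. */
    int size() {
        return links.size();
    }
}
```

With this, link(1, 10) followed by link(1, 11) and link(2, 10) all succeed, so the many-to-many usage survives, and only a second link(1, 10) is refused.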

Really, this stuff has to be evaluated on a case by case basis, because  
our type definitions don't give you enough information to deal with  
these. You could extend the xml type definition (actually not hard and  
has little impact if you use a supplementary schema) to provide the  
constraint and usage information.
Some extension of our data definition is needed if you want to  
implement these things. I personally think they are very nice to have  
around.
So, option 1 is to specify this info in XML.
	If you extend into another namespace or write in new attributes, we  
won't have problems with backwards compatibility. This approach would  
be adding a few small bits of syntax onto our existing STDs.
	Alternately, you could write a stylesheet to translate back and forth  
from our STD schema into some subset of OWL class definitions. You  
could use the broader syntax of OWL to specify those additional  
constraints. That additional info would be lost when converted to OME,  
but that's ok.

Option 2 is to make broader use of established programming patterns.  
OME::Java coaxes written class definitions from an XML or perl package  
class definition. You can then programmatically adjust this package  
definition to meet your needs. The nature of the augments can be  
described in developer level textual documentation in the XML class  
definition, or in the java class (depending on how generally useful it  
was).
	Currently, ST perl package definitions are generated on the fly, but  
there's no reason why they couldn't be generated automatically, saved  
to an OME library on disk, and loaded in as needed. We don't use this  
pattern in those circumstances, but we readily could.
	I use a variant of this pattern in the Web UI code that provides views  
of the whole database. But instead of generating a package definition  
for each class, I define only those classes that I need to extend. All  
requests for services are brokered through the service superclass. The  
superclass looks for a specialized subclass on disk it can load up. If  
it can find one, it will give the subclass first dibs on fulfilling all  
or a portion of the request.
	OME::Web::DBObjRender::__OME_Image is an example of this. It  
implements thumb_url and current_annotation "virtual fields", which is  
a roundabout way of letting me attach business logic to displays. It  
implements them by recognizing keys from a hash of incoming requests. A  
cleaner API would be to define them as new methods of a Web UI  
subclass.
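
A rough Java sketch of that brokering pattern, for illustration only: the class names ObjRenderer and ObjRenderer_Image, the naming convention, and the "/thumb/" URL format are all assumptions of mine, not the OME::Web::DBObjRender API. The superclass looks for a specialized subclass by name, gives it first dibs, and falls back to the generic behavior if the subclass is missing or declines.

```java
import java.util.Map;

/** Generic renderer and broker. All names here are hypothetical. */
class ObjRenderer {
    /** Default behavior: return the stored field value as-is. */
    String render(String field, Map<String, String> record) {
        return record.get(field);
    }

    /** Broker: try a specialized subclass first, then fall back. */
    static String renderFor(String type, String field, Map<String, String> record) {
        ObjRenderer special = loadSpecialized(type);
        if (special != null) {
            String value = special.render(field, record);
            if (value != null) {
                return value; // the subclass handled this field
            }
        }
        return new ObjRenderer().render(field, record);
    }

    /** Looks up "<this class name>_<type>"; returns null if no such class. */
    static ObjRenderer loadSpecialized(String type) {
        String name = ObjRenderer.class.getName() + "_" + type;
        try {
            return Class.forName(name)
                    .asSubclass(ObjRenderer.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            return null; // no usable specialization for this type
        }
    }
}

/** Specialization for images: adds a thumb_url "virtual field". */
class ObjRenderer_Image extends ObjRenderer {
    @Override
    String render(String field, Map<String, String> record) {
        if ("thumb_url".equals(field)) {
            return "/thumb/" + record.get("id"); // assumed URL layout
        }
        return null; // decline; let the superclass handle it
    }
}
```

Only the types that need extra logic get a subclass; every other type goes straight through the generic path.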
	The html templates work in a similar way. Those would actually be an  
excellent place to use the pure pattern of spooling a specialized  
subclass definition to disk and letting folks tweak it as needed.
	This pattern is actually very scalable as complexity of the data model  
increases. If today we don't allow the nature of foreign constraints to  
be specified in our STDs, that doesn't have to slow down cutting edge  
development for a java library as long as the java augments are clearly  
and fully documented. If we later wanted to specify those in STDs, we  
can generate those augments programmatically and update the java  
library. A simple diff could flag classes as being augmented or not,  
and would streamline library updates.

	Actually, I believe that a broad use of a pattern like this is going  
to be needed for UI development to proceed at the pace of the evolution  
of the data model. But for this immediate problem, it's one of a few  
workable solutions. And if we used that pattern for this problem, we  
wouldn't be extending it into every cranny of the code base, just this  
one.
I feel I must add these caveats to broad reaching statements to keep  
the flames from flaring up.

But to get back to your specific questions, I feel that it's fine to  
enforce one-to-one relations and other constraints programmatically. You  
should realize such enforcement is above and beyond what is done and  
will likely be done in the perl libraries. Which is fine. There's no  
reason for a new system to keep the weaknesses of the old.
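
As a sketch of what that programmatic enforcement might look like on the Java side (the helper Cardinality.atMostOne is hypothetical, and nothing like it exists in the perl libraries as far as this thread says):

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper for checking "has-one" results in client code. */
final class Cardinality {
    private Cardinality() {}

    /**
     * Returns the single row if there is one, empty if there are none,
     * and throws if the data store handed back more than one.
     */
    static <T> Optional<T> atMostOne(List<T> rows, String what) {
        if (rows.size() > 1) {
            throw new IllegalStateException(
                "expected at most one " + what + ", found " + rows.size());
        }
        return rows.isEmpty() ? Optional.empty() : Optional.of(rows.get(0));
    }
}
```

A caller would wrap whatever list of Dimensions comes back for an image in atMostOne(rows, "Dimensions") and treat the exception as a data integrity error.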
Since we all agree those constraints would be nice to have in the  
objects we program with, the question comes down to, where do you get  
the necessary information? An XML file or a package definition? Does  
that definition information make it into the STDs stored in the  
database?

Whatever plan we pick needs to have little impact in the short term and  
have a roadmap for scalability and tighter integration.
In the next several months, I can help out on this with XML schema  
extensions and/or design discussions. The majority of my time will be  
occupied with biology experiments and Web UI development.

>>>  * Projects and datasets are many-many. CGs and Cats are 1-Many.
>>> Screens and plates as well? What is/will be the rule-of-thumb for these
>>> (and future) containers?
>>
>> What did you have in mind?
>
> What I had in mind was, roughly, that there seems to be such a parallel
> drawn between PDI and CGCI in many (public) arenas. If we ignore that PDI is
> somehow class based and CGCI is ST based, one might expect that they obey
> the same hierarchy constructs. Certainly not a must, it's just what I've
> come to expect from how we've talked about the various hierarchies.

Do these hierarchy containers allow for the organizing schemes that  
Alastair Kerr described as in use by their labs?
	http://lists.openmicroscopy.org.uk/pipermail/ome-users/2005-May/000043.html
If they don't, then they are too limiting. If they do, those containers  
might make it as a good rule of thumb. There will always be the  
occasional curve ball graph that it won't handle, but rules of thumb  
are for the 80/20 cases.

Josiah