[ome-devel] Database Issues: FK, UNIQUE, and such
Josiah Johnston
siah at mac.com
Wed Jun 1 06:01:07 BST 2005
>>> III. One-to-One
>
>> We don't currently support explicit "has-one" relationships using
>> references. So for now at least, an image can in fact have multiple
>> dimensions.
>
> Back to data integrity, does it make sense for an image to have multiple
> dimensions, and how do I deal with that programmatically?
>
> if (dimensions.length > 1) throw new OopsException()
>
>>> * Should the maps have a UNIQUE constraint on both fields.
>>
>> Explicit many-to-many relationships (maps) are not supported in STs.
>> This could in fact be done transparently by saying that an ST
>> consisting of only references gets a UNIQUE constraint on all its
>> fields.
>
>
> Sounds awfully implementation-ly. What about whether they /should/ have
> them? And again, how do I take care of it programmatically if there's the
> possibility that there will be multiple maps (to get all implementation-ly
> myself)?
The relationship between an experimenter and a group is expressed through
an ExperimenterGroup. According to our rules of usage, an experimenter may
belong to many groups, and a group may consist of many experimenters.
If you put a UNIQUE constraint on both fields for that ST, that would
break our usage. The programmatic approach would also mess with
ImageInstrument, which consists of references to an instrument, an
objective, and an image.
Dimensions and ImageExperiment could use some UNIQUE constraints on
image, though.
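
To make the distinction concrete, here is a minimal sketch using an in-memory SQLite database as a stand-in. The table and column names are hypothetical and are not the actual OME schema. It shows a UNIQUE constraint on each field of a map rejecting a legitimate second group membership, while the same kind of constraint on the image reference of a Dimensions-like table gives the one-per-image behaviour we would want there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical map table with UNIQUE on each field separately.
    CREATE TABLE experimenter_group_map (
        experimenter_id INTEGER NOT NULL UNIQUE,
        group_id        INTEGER NOT NULL UNIQUE
    );
    -- Hypothetical dimensions table: at most one row per image.
    CREATE TABLE dimensions (
        image_id     INTEGER NOT NULL UNIQUE,
        pixel_size_x REAL
    );
""")

def try_insert(sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

MAP_SQL = "INSERT INTO experimenter_group_map VALUES (?, ?)"
DIM_SQL = "INSERT INTO dimensions VALUES (?, ?)"

first_membership = try_insert(MAP_SQL, (1, 10))   # experimenter 1 joins group 10
second_membership = try_insert(MAP_SQL, (1, 20))  # same experimenter, second group
first_dims = try_insert(DIM_SQL, (100, 0.1))      # first Dimensions for image 100
second_dims = try_insert(DIM_SQL, (100, 0.2))     # second Dimensions for image 100
```

Here `second_membership` comes back False, which is the broken usage, and `second_dims` comes back False, which is the integrity we are after.
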
Really, this stuff has to be evaluated on a case-by-case basis, because
our type definitions don't give you enough information to decide which
constraints apply. You could extend the XML type definition (actually not
hard, and with little impact if you use a supplementary schema) to provide
the constraint and usage information.
Some extension of our data definition is needed if you want to
implement these things. I personally think they are very nice to have
around.
So, option 1 is to specify this info in XML.
If you extend into another namespace or write in new attributes, we
won't have problems with backwards compatibility. This approach would
be adding a few small bits of syntax onto our existing STDs.
Alternatively, you could write a stylesheet to translate back and forth
between our STD schema and some subset of OWL class definitions. You
could use the broader syntax of OWL to specify those additional
constraints. That additional info would be lost when converted to OME,
but that's OK.
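
The namespace flavour of option 1 might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the namespace URI, the `Unique` attribute, and the simplified STD-like layout are all hypothetical. The point is only that a reader aware of the extension namespace can pick up the constraint, while a reader that ignores it still sees the same elements as before.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and attribute name; neither exists in OME today.
EXT_NS = "http://example.org/ome/std-constraints"

# Simplified, STD-like fragment (layout is illustrative, not the real schema).
STD_XML = f"""
<SemanticType Name="Dimensions" xmlns:c="{EXT_NS}">
  <Element Name="image" DataType="reference" RefersTo="Image" c:Unique="true"/>
  <Element Name="PixelSizeX" DataType="float"/>
</SemanticType>
""".strip()

def element_names(std_xml):
    """What a reader unaware of the extension sees: just the element names."""
    root = ET.fromstring(std_xml)
    return [el.get("Name") for el in root.iter("Element")]

def unique_elements(std_xml):
    """Names of elements flagged unique through the extension namespace."""
    root = ET.fromstring(std_xml)
    flag = f"{{{EXT_NS}}}Unique"  # ElementTree spells namespaced names as {uri}local
    return [el.get("Name") for el in root.iter("Element") if el.get(flag) == "true"]

all_fields = element_names(STD_XML)
unique_fields = unique_elements(STD_XML)
```
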
Option 2 is to make broader use of established programming patterns.
OME::Java coaxes written class definitions from an XML or Perl package
class definition. You can then programmatically adjust this package
definition to meet your needs. The nature of the augments can be
described in developer-level textual documentation in the XML class
definition, or in the Java class (depending on how generally useful it
is).
Currently, ST Perl package definitions are generated on the fly, but
there's no reason why they couldn't be generated automatically, saved
to an OME library on disk, and loaded in as needed. We don't use this
pattern in those circumstances, but we readily could.
I use a variant of this pattern in the Web UI code that provides views
of the whole database. But instead of generating a package definition
for each class, I define only those classes that I need to extend. All
requests for services are brokered through the service superclass. The
superclass looks for a specialized subclass on disk it can load up. If
it can find one, it will give the subclass first dibs on fulfilling all
or a portion of the request.
OME::Web::DBObjRender::__OME_Image is an example of this. It
implements thumb_url and current_annotation "virtual fields", which is
a roundabout way of letting me attach business logic to displays. It
implements them by recognizing keys from a hash of incoming requests. A
cleaner API would be to define them as new methods of a Web UI
subclass.
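
A minimal sketch of the brokering idea, with virtual fields written as methods of the specialized subclass (the "cleaner API" variant). This is not the Perl implementation: the class names and URL layout are hypothetical, and an in-memory registry stands in for looking up a subclass on disk.

```python
class Renderer:
    """Generic renderer; brokers requests and defers to a specialist if one exists."""

    _specialists = {}  # type name -> specialized subclass (stand-in for disk lookup)

    def __init_subclass__(cls, type_name=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if type_name is not None:
            Renderer._specialists[type_name] = cls

    @classmethod
    def render(cls, type_name, record, fields):
        """Return {field: value}; the specialist gets first dibs on each field."""
        specialist = cls._specialists.get(type_name)
        result = {}
        for field in fields:
            handler = getattr(specialist, field, None) if specialist else None
            if callable(handler):
                result[field] = handler(record)    # specialized "virtual field"
            else:
                result[field] = record.get(field)  # generic fallback
        return result


class ImageRenderer(Renderer, type_name="Image"):
    """Hypothetical counterpart of OME::Web::DBObjRender::__OME_Image."""

    @staticmethod
    def thumb_url(record):
        # Hypothetical URL layout, for illustration only.
        return f"/thumbnails/{record['id']}"


image_view = Renderer.render("Image", {"id": 7, "name": "cell.tif"}, ["name", "thumb_url"])
dataset_view = Renderer.render("Dataset", {"id": 3, "name": "screen A"}, ["name"])
```

Image requests get the specialized thumb_url while the plain name field falls through to the generic path. Dataset, which has no specialist, is served entirely by the superclass.
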
The HTML templates work in a similar way. Those would actually be an
excellent place to use the pure pattern of spooling a specialized
subclass definition to disk and letting folks tweak it as needed.
This pattern is actually very scalable as the complexity of the data
model increases. If today we don't allow the nature of foreign constraints
to be specified in our STDs, that doesn't have to slow down cutting-edge
development for a Java library, as long as the Java augments are clearly
and fully documented. If we later wanted to specify those in STDs, we
can programmatically generate those augments and update the Java
library. A simple diff could flag classes as being augmented or not,
and would streamline library updates.
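
The diff check could be as simple as the following sketch, which compares a freshly generated definition against the copy in the library. The function names and the sample class source are hypothetical.

```python
import difflib

def augmentation_diff(generated_source, library_source):
    """Return unified-diff lines between generated and on-disk class source."""
    return list(difflib.unified_diff(
        generated_source.splitlines(),
        library_source.splitlines(),
        fromfile="generated",
        tofile="library",
        lineterm="",
    ))

def is_augmented(generated_source, library_source):
    """A class counts as augmented if its library copy differs from the generated one."""
    return bool(augmentation_diff(generated_source, library_source))

generated = "class Dimensions:\n    fields = ['image', 'pixel_size_x']\n"
hand_edited = generated + "    unique = ['image']  # hand-written augment\n"

untouched_flag = is_augmented(generated, generated)
augmented_flag = is_augmented(generated, hand_edited)
```

Classes that come back unflagged can be regenerated blindly during a library update. Flagged ones need their augments carried over.
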
Actually, I believe that a broad use of a pattern like this is going
to be needed for UI development to proceed at the pace of the evolution
of the data model. But for this immediate problem, it's one of a few
workable solutions. And if we used that pattern for this problem, we
wouldn't be extending it into every cranny of the code base, just this
one.
I feel I must add these caveats to broad reaching statements to keep
the flames from flaring up.
But to get back to your specific questions, I feel that it's fine to
enforce one-to-one relations and other constraints programmatically. You
should realize such enforcement is above and beyond what is done, and
will likely be done, in the Perl libraries. Which is fine. There's no
reason for a new system to keep the weaknesses of the old.
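
For the one-to-one case, programmatic enforcement can be a small guard around whatever returns the related records. The sketch below is one way to write it; the exception and helper names are hypothetical. Note one assumption: it also rejects the zero case, which goes further than the "more than one" check quoted above, and whether that is right is a design decision this thread doesn't settle.

```python
class CardinalityError(Exception):
    """Raised when a has-one relationship does not have exactly one target."""

def exactly_one(items, what):
    """Return the single element of items, or raise if there are zero or many."""
    items = list(items)
    if len(items) != 1:
        raise CardinalityError(f"expected exactly one {what}, found {len(items)}")
    return items[0]

# Hypothetical records, as they might come back from a query by image.
dimensions_for_image = [{"image": 100, "pixel_size_x": 0.1}]
dims = exactly_one(dimensions_for_image, "Dimensions for image 100")

try:
    exactly_one(dimensions_for_image * 2, "Dimensions for image 100")
    duplicate_rejected = False
except CardinalityError:
    duplicate_rejected = True
```
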
Since we all agree those constraints would be nice to have in the
objects we program with, the question comes down to, where do you get
the necessary information? An XML file or a package definition? Does
that definition information make it into the STDs stored in the
database?
Whatever plan we pick needs to have little impact in the short term and
have a roadmap for scalability and tighter integration.
In the next several months, I can help out on this with XML schema
extensions and/or design discussions. The majority of my time will be
occupied with biology experiments and Web UI development.
>>> * Projects and datasets are many-many. CGs and Cats are 1-Many. Screens and
>>> plates as well? What is/will be the rule-of-thumb for these (and future)
>>> containers?
>>
>> What did you have in mind?
>
> What I had in mind was, roughly, that there seems to be such a parallel
> drawn between PDI and CGCI in many (public) arenas. If we ignore that PDI is
> somehow class based and CGCI is ST based, one might expect that they obey
> the same hierarchy constructs. Certainly not a must, it's just what I've
> come to expect from how we've talked about the various hierarchies.
Do these hierarchy containers allow for the organizing schemes that
Alastair Kerr described as in use by their labs?
http://lists.openmicroscopy.org.uk/pipermail/ome-users/2005-May/000043.html
If they don't, then they are too limiting. If they do, those containers
might serve as a good rule of thumb. There will always be the
occasional curve-ball graph that they won't handle, but rules of thumb
are for the 80/20 cases.
Josiah