[ome-devel] Shoola-back-end consistency issues

Ilya Goldberg igg at nih.gov
Thu Jul 7 00:22:55 BST 2005


On Jul 6, 2005, at 4:23 PM, Josh Moore wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>>> Ilya Goldberg wrote:
>
>>> Great, and when client X modifies the DB using VBA/.NET, which magical
>>> messaging and distributed objects pattern will ensure that client Y
>>> using ICE will update itself to be consistent?
>>> Presumably these modifications would end up being processed by some
>>> back-end aggregator that will then go and issue messages to any
>>> connected clients to update themselves.  This aggregator will have to
>>> perform translation services for any and all interfaces used by
>>> clients.  Will it go and update the Web UI the user's got open also?
>>> What about clients that connect via JDBC/ODBC?
>
>>> I don't think there are any magic bullets here.  If we want a monolithic
>>> system where we control all of the ins and outs explicitly, this all
>>> becomes fairly simple using whatever we pull off a shelf.  If we want a
>>> system that's maximally open with many ways in and out (the "O" in OME),
>>> then this becomes a more difficult (not to say interesting) problem.
>>> When deciding on these interfaces, we should be thinking "Who and what
>>> can we afford to exclude from access into this system?".  It's
>>> instructive to look at a long list of real-world systems that we need to
>>> talk to - not hypothetically sometime in the future, but like
>>> yesterday.  None of these things use ICE or Java or anything even
>>> remotely that intelligent.  Some can do web-services or at least pretend
>>> to.  It's really these systems that we need to concentrate on talking
>>> to.  As far as clients that we develop, we have pretty much free rein
>>> as long as we don't close anything off in the process.
>
> I'd second Chris on this one. We can't be everything to everyone. The
> logical conclusion of the first paragraph is to put everything in the
> database so however whoever/whatever touches it, they'll be seeing the
> real OME. I don't want to go there.

I thought you were all for representing things in the DB as much as 
possible.  I certainly am.

>
> And the implication of the second is that we need to support simple
> web-based clients. That's easy.
>
> I prefer and suggest a real server. If we can't let everyone in the
> world talk to it, tough nuggies. On the other hand, lots of people are
> working on interoperability -- I bet we can steal something. But I want
> a real programming language and tested/standardized/accepted components
> that I can use without having to do it myself. Enterprise people have
> been doing this for years; let's gain something from that fact.

A real programming language?  Standardized/tested/accepted components?  
By whom?
Enterprise people?
:)
I know it's poison to your ears, but interoperability is achieved in the 
Enterprise using "real" programming languages like VBA, and tools like 
Excel, and SpotFire, and such things.

I am not advocating being everything to everyone.  I am simply 
advocating that we don't close off interfaces we already have.  
Anything else is wide open.  Go nuts, man!


>>> They're actually polling, aren't they?  The server doesn't notify the
>>> client that it's time to refresh.  The client just does a refresh at some
>>> point, and either it gets new data or it doesn't (i.e. polling).
>
> No. JMS is (can be) notification. You say "tell me about X" and it tells
> you. No polling. Pubsub.

Right, I was talking about Harry's 1,2,3, but you were talking about 
the Java stuff.  Sorry.
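
To make the two models concrete, here is a minimal in-process sketch.  It 
is not OME, ICE, or JMS code; the Server and PollingClient classes and 
the "dataset" topic name are assumptions made up for illustration.  The 
polling client asks whether anything changed, while the subscriber is 
called back by the server when a change happens.

```python
from typing import Callable, Dict, List


class Server:
    """Toy stand-in for a back end (hypothetical, not an OME class)."""

    def __init__(self) -> None:
        self.version = 0
        self.subscribers: Dict[str, List[Callable[[int], None]]] = {}

    def subscribe(self, topic: str, callback: Callable[[int], None]) -> None:
        # Pub/sub: the client registers interest once ("tell me about X").
        self.subscribers.setdefault(topic, []).append(callback)

    def modify(self, topic: str) -> None:
        # Any change bumps the version and notifies subscribers of the topic.
        self.version += 1
        for callback in self.subscribers.get(topic, []):
            callback(self.version)


class PollingClient:
    """Polling: the client decides when to ask and compares versions."""

    def __init__(self, server: Server) -> None:
        self.server = server
        self.seen = 0

    def refresh(self) -> bool:
        # Returns True only if something changed since the last refresh.
        changed = self.server.version != self.seen
        self.seen = self.server.version
        return changed
```

The catch with the callback form is the one already raised above: over a 
stateless protocol the server needs some way to reach the client, which 
the in-process callback here glosses over.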

>>> When we've asked these questions before, we've pretty consistently
>>> decided in favor of polling.  If you look around the web, that's pretty
>>> much what everyone else decided as well.  Various forms of "push" have
>>> sprouted up before and quickly died.  RSS readers are all universally
>>> polling.  Some of the news web-sites I visit do timed refresh, but
>>> that's just polling again.  Real notification over a stateless protocol
>>> (like HTTP) would require the client to implement a server so that the
>>> "real" server could issue requests on it, saying "Hey, time to
>>> update!".  There are a great many issues with a server initiating
>>> requests to a client machine.
>
> We all know what a fan I am of the W3C, but OME isn't (doesn't have to
> be) the web. That's really a side point. Chris will talk about ICE and
> I'll talk about J2EE, basically we're talking about typical systems very
> different from the web (though a part of it) which make systems like
> ours work well.

And five or six years ago we were all talking about CORBA.  It's all 
good, and I'm all for it.  I just want a real simple way in and out 
without all the overhead.  Not instead of, but in addition to.


>>> Versioning introduces a whole host of new side-effects.  In effect, all
>>> data in OME is already versioned because it's attached to a MEX.  The
>>> self-consistent solution here is to enforce that all dataset changes
>>> (like adding or removing images) result in a MEX.  I don't think there
>>> are any major side-effect issues with doing that.
>
> What does the "in effect" modifier in the above mean?

It's versioning without a serial version number.  If you're satisfied 
with a MEX ID as a version number, then it's versioning.

>>> I don't like ad hoc solutions either unless it's the only way to avoid
>>> major side-effects.  I would prefer doing this with MEXes.  It would be
>>> slightly harder to implement because it would likely introduce some
>>> minor side-effects, but I don't think it would be too bad.  The pay-back
>>> for doing this is that it closes the penultimate loophole for escaping
>>> the wrath of the MEX.  The last loophole is creation of the container
>>> objects themselves, which is also not done with a MEX.  Once those two
>>> are closed, everything can become a semantic type.
>
> What are the side-effects you're talking about?

There are just some edge-cases to work through.  Previously the 
assumption was that a given dataset has a single 'version' of the 
images it contains.  Once somebody had something to say about the 
dataset (i.e. it acquired an 'attribute', and hence a MEX with that 
dataset as a target), the dataset was considered locked, and its image 
content could not be changed.

If images are mapped to datasets using an object tied to a MEX, it 
means that the collection of images in a dataset is 'versioned'.  This 
means that we don't ever have to lock datasets, but now we must 
identify the 'version' when we apply an analysis chain to it (or look 
at it in a viewer).
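
A minimal sketch of what identifying the 'version' could look like, 
assuming each add or remove of an image is recorded against a MEX ID and 
that MEX IDs increase over time.  The MembershipChange record and the 
images_at function are hypothetical names for illustration, not part of 
the existing schema or API.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class MembershipChange:
    mex_id: int    # assumed: ID of the MEX that recorded this change
    image_id: int
    added: bool    # True = image added to the dataset, False = removed


def images_at(changes: List[MembershipChange], mex_id: int) -> Set[int]:
    """Replay changes up to and including mex_id to get that 'version'."""
    images: Set[int] = set()
    for change in sorted(changes, key=lambda c: c.mex_id):
        if change.mex_id > mex_id:
            break  # later changes belong to later versions
        if change.added:
            images.add(change.image_id)
        else:
            images.discard(change.image_id)
    return images
```

An analysis chain or a viewer would then pass the MEX ID it wants to see 
rather than relying on the dataset being locked.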


> Also, at least in terms of portraying this to the outside world, if
> eventually everything is a semantic type (==owl:class students.) and
> every state change gets a MEX even if not coming from the AE, I'd
> suggest renaming MEX because there's no "module" "executing" in many
> cases. This is partially what I'm talking about in stretching the idea.
> (below).

You don't have to think of a module as an algorithm either.  If you 
think of it as a construction that has inputs and outputs, one that can 
be "executed" to produce its outputs, then anything you do can be 
considered a module execution.  Please, lets not call it an 
instantiation or something.

>> Josh:
>> Can we stretch our idea of MEX to make this clean? This returns to my
>> comment about the mex_execution table being an "AuditLog", Ilya. If
>> things not strictly done by the AE could also be included in the
>> AuditLog, we're in business.
>
> Ilya:
>>> There's plenty of that going on.  Annotations aren't done with the AE.
>>> Import is done partially with the AE and partially not.  The only
>>> requirement is that the state of the system after this is done is
>>> compatible with the AE (otherwise the AE won't be able to use this info).
>
> So I assume those things aren't getting mexes, which makes them currently
> not queryable for refresh. That's where either MEX has to be stretched,
> or a super-concept introduced to handle these cases.

Uh, no.  They do get mexes, just not from the AE.  The MEX column in 
each attribute table now has a NOT NULL constraint, remember?
This is done with a bit of API that's targeted at import or annotation, 
etc.  This API registers that a certain module ("Import", "Annotation", 
etc.) executed and produced the recorded result.  Parts of the AE are 
indeed used to accomplish this, but it has to do with instantiating STs 
rather than executing an analysis chain, so it's not done via the AE.  
The AE is used to do only one thing:  Execute an analysis chain against 
a dataset.
In principle, the AE can be used for things like this, but it was 
easier to do them through a specialized API rather than shoehorning 
this into the AE interface.
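
Roughly, the pattern looks like the sketch below.  This is not the actual 
OME API or schema: the table names, column names, and the annotate 
helper are assumptions for illustration, using an in-memory SQLite 
database.  It only shows the idea that the helper first records that an 
"Annotation" module executed and then writes the attribute with that MEX, 
while the NOT NULL constraint rejects attributes that have none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE module_executions (          -- hypothetical table
        mex_id INTEGER PRIMARY KEY,
        module TEXT NOT NULL
    );
    CREATE TABLE image_annotations (          -- hypothetical attribute table
        attribute_id INTEGER PRIMARY KEY,
        image_id     INTEGER NOT NULL,
        content      TEXT NOT NULL,
        mex_id       INTEGER NOT NULL REFERENCES module_executions(mex_id)
    );
""")


def annotate(image_id: int, content: str) -> int:
    """Hypothetical helper: register the MEX, then store the attribute."""
    with conn:  # commits on success, rolls back if either insert fails
        mex_id = conn.execute(
            "INSERT INTO module_executions (module) VALUES (?)",
            ("Annotation",),
        ).lastrowid
        conn.execute(
            "INSERT INTO image_annotations (image_id, content, mex_id) "
            "VALUES (?, ?, ?)",
            (image_id, content, mex_id),
        )
    return mex_id
```

Because every attribute row carries a MEX, a client that wants to refresh 
can ask for MEXes newer than the last one it saw, whether they came from 
the AE or not.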

-Ilya



>
>
> - -Josh.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.5 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
>
> iD8DBQFCzD3YIwpkR5bKmAsRAk+2AJ9Qi90wlNc5YjJ3TVL3+tRMKUcsKwCdGYbP
> F1PXWonH/Y0ktYhFwM4C7qo=
> =VAK6
> -----END PGP SIGNATURE-----
>


