[ome-devel] Shoola-back-end consistency issues
Harry Hochheiser
hsh at nih.gov
Wed Jul 6 17:53:46 BST 2005
Ilya:
On Jul 6, 2005, at 12:07 PM, Ilya Goldberg wrote:
>>
>> The more I think about it, the harder this problem seems. Shoola &
>> OME-JAVA are essentially stateless. Agents request data and get back
>> chunks of it. There is no record of the state of any request on the
>> back-end, and no central manager in Shoola or OME-JAVA. Thus, for an
>> agent to verify that data is consistent, it must do all of the work
>> on its own. There are several solutions, each of which has some
>> difficulty:
>>
>> 1) Do nothing - give up on consistency. Easy, but obviously not ideal.
>
> Unfortunately, this is not even an option. We've fought long and hard
> for data consistency for the last five years. We're not giving up on
> it right at the client. That's pure insanity.
Fair enough. This was sort of a straw man of a suggestion. Also
reflects the current state of the world.
>> 2) Delay retrieval. This is currently done on the data manager. The
>> list of images in a dataset is not pre-loaded: instead, it is
>> retrieved when the icon for the dataset is expanded. This works
>> somewhat, but is incomplete: what if images are added while the icon
>> is expanded?
>>
>> 3) Refresh: The data manager handles this by providing "refresh"
>> buttons. These buttons work somewhat, but they are less than ideal.
>> A naive refresh operation that simply repeats a request will be
>> painfully inefficient when one image is added to a dataset of 10K
>> images, but anything more sophisticated will be tricky (more on this
>> in a bit).
>
> One of these two has to be implemented. It's perfectly fine that
> restarting an agent reloads its saved state even if it's inconsistent.
> But a separate button must be provided that ensures that the agent is
> in a consistent state. Either the agent is completely stateless and
> gets its state only from the DB, or it must have a refresh button to
> resynchronize state. Even a stateless agent has state on the screen,
> so even these need a button to let the user ensure that what is
> displayed is what is in the DB.
>
I think it has to be a combination of the two. When appropriate,
don't load the info until it's needed. Otherwise, do a refresh.
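
To make the combination concrete, here is a minimal sketch of delayed
retrieval plus an explicit refresh. It is written in Python purely for
brevity. LazyDatasetNode and the fetch_images callable are hypothetical
names standing in for whatever call the agent makes to the back-end;
neither is an existing Shoola or OME-JAVA class.

```python
from typing import Callable, List, Optional


class LazyDatasetNode:
    """Hypothetical tree node for a dataset; not an actual Shoola class."""

    def __init__(self, dataset_id: int,
                 fetch_images: Callable[[int], List[int]]) -> None:
        self.dataset_id = dataset_id
        # Stand-in for the real back-end request (an assumption).
        self._fetch_images = fetch_images
        # None means "never loaded", which differs from an empty dataset.
        self._images: Optional[List[int]] = None

    def expand(self) -> List[int]:
        # Delayed retrieval: contact the back-end only on first expansion.
        if self._images is None:
            self._images = self._fetch_images(self.dataset_id)
        return self._images

    def refresh(self) -> List[int]:
        # Naive refresh: repeat the request and replace the cached list.
        self._images = self._fetch_images(self.dataset_id)
        return self._images
```

Note that expand() keeps returning the cached list after the first call,
so changes made while the node is open only show up after refresh().
That is exactly the gap in option 2, and refresh() here is the naive
version that re-fetches everything.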
>> 4) Refresh might be improved by providing some information that
>> clients could use to decide when to reload data. Checking to see if
>> the size of the dataset has changed might indicate _if_ a change has
>> been made, but mechanisms for indicating _what_ has been changed
>> would be needed. The client might send back its list of images in the
>> dataset, and get back lists of images that had been added & deleted,
>> but this would not be terribly efficient either. I think such information
>> would be necessary for this approach to be of use: what if 1 image
>> is added and one deleted? Simply checking changes in the size of a
>> dataset would not catch this case.
>>
>> 5) Josh suggested checking MEXs to look for MEX timestamps that are
>> newer than the most recent one, indicating a change to the dataset.
>> Unfortunately, changes to dataset and project contents do not appear
>> to generate MEXs. Any idea why this is? Shouldn't these actions have
>> MEXes?
>>
>> 6) We might add version/timestamp info to each row in tables like
>> datasets, projects, etc. This is too painful a thought for me to
>> contemplate at the moment.
>
> This is actually not that bad. If certain aspects of the container
> objects are modifiable without a MEX, then it would not be very
> difficult to add a standard suite of timestamps to them (last access,
> last modified). We don't need versioning for this. This can be done
> with DB rules in postgres, so it wouldn't necessarily even require any
> action by code to update these. But this is an ad hoc solution to a
> specific problem: A way to check the container objects for things
> about them that can be modified without a MEX. Adding a new attribute
> to a dataset, for example, would mark it as "modified" just like
> adding a new image to it would. Or, we could add a separate last MEX
> timestamp (though that would be redundant).
Could we change code for modifying datasets, projects, etc. to generate
a MEX for each action? Would this do the trick?
More discussion of these alternatives may be necessary.
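
Whichever source the timestamp comes from (a MEX or a last-modified
column), the client-side check would look about the same. A minimal
sketch in Python, assuming the back-end can return a last-modified time
per container; ContainerStamp and needs_reload are hypothetical names,
not part of any existing API:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ContainerStamp:
    """Hypothetical record: a dataset or project id plus its timestamp."""
    container_id: int
    last_modified: datetime  # assumed to be assigned by the database


def needs_reload(seen: Optional[ContainerStamp],
                 current: ContainerStamp) -> bool:
    # Nothing cached yet, so the client has to load.
    if seen is None:
        return True
    # Reload only if the back-end reports a newer modification time.
    return current.last_modified > seen.last_modified
```

Both timestamps would come from the database, so the comparison does
not depend on the client's clock. This only answers _if_ something
changed, not _what_ changed, so the client still has to re-fetch the
container's contents.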
>>
>> 7) Another painful idea would involve building some data managers on
>> the client and back-end that would manage state. Perhaps by passing a
>> token around for different requests, these managers could provide a
>> generally useful and yet reasonably performant consistency mechanism.
>> Not having thought through the design, I'm not sure how this would
>> work, but I'd bet it could be done reasonably cleanly.
>
> Well, the UserState object could be used for this. It already
> maintains a timestamp of when it was last accessed. Hitting the
> refresh button when the session was not accessed since the last time
> it was refreshed wouldn't accomplish much. Or would it? Some other
> user with access to the dataset could have edited its image content.
> Remember that other users (and even the same users) have access to the
> same DB through other means. Consistency can't be ensured by a single
> client, or by any token bound to a single client.
Where is this object?
The token may be necessary, but not sufficient, for consistency. If
the client and the back-end maintain some record of the state of a
query, the client could send the token to the back-end. The back-end
could then re-run the query, compare it to previous results to say what
had changed, and send an appropriately-packaged response to the client.
The token is simply a means to refer to the query that is being
specified, and results for prior invocations of that query.
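
A rough sketch of what I have in mind, in Python for brevity. Everything
here is an assumption: QueryRegistry is a hypothetical back-end helper,
not an existing OME class, and a query is modelled as a callable that
returns a set of image ids.

```python
import uuid
from typing import Callable, Dict, FrozenSet, Set, Tuple

# A query is modelled as a function returning image ids (an assumption).
Query = Callable[[], Set[int]]


class QueryRegistry:
    """Hypothetical back-end helper; not an existing OME class."""

    def __init__(self) -> None:
        self._queries: Dict[str, Query] = {}
        self._last: Dict[str, FrozenSet[int]] = {}

    def register(self, query: Query) -> Tuple[str, Set[int]]:
        # First invocation: run the query, remember the result, and
        # hand the client a token that names this query.
        token = uuid.uuid4().hex
        result = frozenset(query())
        self._queries[token] = query
        self._last[token] = result
        return token, set(result)

    def refresh(self, token: str) -> Tuple[Set[int], Set[int]]:
        # Re-run the query against the current DB state and report
        # only what changed since the previous invocation.
        current = frozenset(self._queries[token]())
        previous = self._last[token]
        self._last[token] = current
        return set(current - previous), set(previous - current)
```

Because refresh() re-runs the query against the database, it picks up
changes made by other users or other clients, and it catches the case
where one image is added and one deleted even though the size of the
dataset is unchanged. The cost is that the back-end has to keep the
previous result for every outstanding token.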
>
>>
>> Thoughts/ideas/responses? Unless anyone has any better ideas that are
>> not too painful, I think it's safe to say that some combination of
>> 1, 2, & 3 will continue to be the status quo.
>
> Agents must have a way for the user to ensure that they are
> synchronized with the back-end. As painful and as slow as that can
> be, this functionality has to exist.
Got it. Given the importance of this goal, we should talk concretely
about how best to achieve it.
harry