[ome-devel] Question about the new OME::Remote API

Mon Aug 2 17:18:07 BST 2004

Hey there, that was code which I wrote, so I'll take a stab at
answering your questions.  (I'm also going to put most of this into
the code itself as POD/Javadoc documentation, and maybe write up a
little article summarizing this for the docs site.)

I'm going to go ahead and start with some intro, to make sure we're
all on the same page.

So, to begin with, the OME::Facades::GenericFacade class defines a
couple of methods which you can call via the remote framework's
dispatch function.  These methods are used to retrieve database
objects from the OME data server, and return them in such a way that
they can be used by a client application.  These database objects can
be core types (projects, datasets, images, etc.), or attributes of a
semantic type (Pixels, PlaneMean, Plate, etc.).  It's important to
realize that this is *not* trying to provide a remote object platform,
à la CORBA or DCOM or ICE or whatever.  The data hash you get back
does not correspond to any kind of persistent server-side object; it's
just a representation of data coming from the database.

The Shoola app was built around a paradigm of "retrieve exactly as
much information as you need to display something, and no more".  There
are two features of the generic data façade which facilitate this:

1) The fields_wanted parameter lets you retrieve only certain fields
for the objects that are returned.  In the case where you're
retrieving a large list of objects with many fields, but you know that
you only need to use two of them, this can save a lot of network
bandwidth.

2) You can retrieve a tree of objects in one method call.  This tree
is defined by the database-level relationships between the objects'
tables.  So, for instance, you could retrieve a project, and all of
the datasets that are in that project, and all of the images in each
dataset, all with a single remote call.

The two features end up being dependent on each other.  For instance,
since you can retrieve a tree of objects, the fields_wanted parameter
has to be a hash instead of a list, since you have to specify the
fields you want for each level of the tree that you're going to get
back.  Also, it is the fields_wanted parameter which defines the tree,
since the tree is extended a level when you say that you "want" a
field which happens to be a reference in the database.

So, to run with the example from above, let's say we're retrieving a
project from the database.  We might create a fields_wanted parameter
like this:

{
  '.' => ['id','name']
}

I'm using Perl syntax for the hashes; in Java this is a Map (which is
encapsulated into the FieldsSpecification class for cleanliness's
sake).  In your C API, Zach, it would be some data structure which can
be transformed into the hash/dictionary XML-RPC type.
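
To make that last point concrete, here is a minimal sketch in Python
(not Perl or C), using only the standard library's xmlrpc.client
module.  It just shows that a fields_wanted hash marshals into an
XML-RPC struct and comes back unchanged.  No method name is passed,
because the actual remote method name isn't given in this email; the
variable names are purely illustrative.

```python
import xmlrpc.client

# The fields_wanted hash from the example, written as a Python dict.
fields_wanted = {".": ["id", "name"]}

# Marshal the dict as the single parameter of an XML-RPC request body.
# The method name is deliberately omitted (it is not specified here).
payload = xmlrpc.client.dumps((fields_wanted,))

# Unmarshal again to confirm the structure survives the round trip.
params, method_name = xmlrpc.client.loads(payload)
round_tripped = params[0]
```

The dict becomes an XML-RPC <struct> and each field list becomes an
<array>, so any client language with those two types can build the
parameter.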

Anyway, back to the example.  The hash above asks for only the primary
key ID and name of the project you retrieve.  The period used as the
hash's single key represents the object(s) at the root of the tree
being returned, which, in our case, is a project.

To extend this, let's say that we want to also retrieve the list of
datasets in this project.  This is fairly straightforward:

{
  '.' => ['id','name','datasets']
}

This will ensure that the data hash for the project which is returned
has a "datasets" entry, which will contain the project's datasets.
However, since this is a reference, we have to also specify which
fields we want the dataset objects to contain, like so:

{
  '.' => ['id','name','datasets'],
  'datasets' => ['id','name']
}

Each key in the fields_wanted hash is the sequence of fields (joined
by periods) that you need to follow to get to the corresponding level
of the tree.  So, for instance, we can also retrieve the images in the
project's datasets like this:

{
  '.' => ['id','name','datasets'],
  'datasets' => ['id','name','images'],
  'datasets.images' => ['id','name','created']
}

This is the format that is expected for the fields_wanted parameter.
You are allowed to leave an entry out (for instance, not including the
'datasets.images' entry even though 'images' is in the entry for
'datasets').  This is not recommended, however.  If you do leave an
entry out, the default is ['id','name','granularity'] for a semantic
type, and ['id'] for everything else.
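
As a rough client-side illustration of those rules, here is a sketch
in Python.  It is not the server's code: entry_for, dangling_keys and
the is_semantic_type flag are hypothetical names made up for this
example.  The first function applies the defaults described above;
the second flags keys whose last field was never asked for at the
parent level, and so would never be reached in the tree.

```python
# Defaults as described above; the constant names are illustrative.
SEMANTIC_TYPE_DEFAULT = ["id", "name", "granularity"]
OTHER_DEFAULT = ["id"]


def entry_for(fields_wanted, path, is_semantic_type=False):
    """Return the field list for one tree level, or the default if omitted."""
    if path in fields_wanted:
        return fields_wanted[path]
    return SEMANTIC_TYPE_DEFAULT if is_semantic_type else OTHER_DEFAULT


def dangling_keys(fields_wanted):
    """Return keys whose last field is not requested by their parent level.

    This simple check does not treat ':all:' specially.
    """
    dangling = []
    for key in fields_wanted:
        if key == ".":
            continue  # the root has no parent level
        parent, _, field = key.rpartition(".")
        parent = parent or "."  # top-level fields hang off the root
        if field not in fields_wanted.get(parent, []):
            dangling.append(key)
    return dangling


fields_wanted = {
    ".": ["id", "name", "datasets"],
    "datasets": ["id", "name", "images"],
}

# 'datasets.images' was left out, so the default entry applies.
image_fields = entry_for(fields_wanted, "datasets.images")
```

Running something like dangling_keys over a hash before sending it is
a cheap way to catch a misspelled path on the client side.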

Now, to answer Zach's specific questions:

1) The massaging is just to get the hash keys into a format which is
easier for the generic data code to parse.  The massaged version of
the above hash is:

{
  '' => ['id','name','datasets'],
  '.datasets' => ['id','name','images'],
  '.datasets.images' => ['id','name','created']
}

Note that not much changes.  The period representing the root of the
tree becomes an empty string, and all of the other keys have a period
prepended to them.  There are two syntaxes only because the first
seemed like it would be more intuitive to a person, while the second
was easier for a computer to parse.  (Start with an empty string, and
add ".<fieldname>" whenever you follow a reference.)
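
For reference, the key transformation just described can be sketched
in a few lines of Python.  This is only an illustration of the rule,
not the actual massaging code on the server, and the function name
massage is made up for the example.

```python
def massage(fields_wanted):
    """Convert human-friendly keys into the parser-friendly form."""
    massaged = {}
    for key, fields in fields_wanted.items():
        # The root '.' becomes the empty string; every other key
        # simply gets a period prepended.
        new_key = "" if key == "." else "." + key
        massaged[new_key] = fields
    return massaged


friendly = {
    ".": ["id", "name", "datasets"],
    "datasets": ["id", "name", "images"],
    "datasets.images": ["id", "name", "created"],
}
massaged = massage(friendly)

# Building a path while walking the tree: start with an empty string
# and add ".<fieldname>" each time a reference is followed.
path = ""
for field in ["datasets", "images"]:
    path += "." + field
```

After the loop, path is '.datasets.images', which is exactly the key
under which the massaged hash stores the image fields.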

2) The generic data code recognizes an ":all:" field which fills in
the data hash completely:

{
  '.' => [':all:']
}

However, this syntax should not be used to promote laziness.  It
should only be used if your client needs to have all of the fields of
an object, and does not know at compile time what those fields are.
(UIs which display the contents of any attribute, regardless of
semantic type, fall into this category.)  A good rule of thumb is that
if you're going to be retrieving hard-coded fields from your return
value, you should be putting hard-coded values into your fields_wanted
hash.

One last point about the DTO (i.e., the data hash) which you receive:
If a field does not exist in the DTO hash which is returned, this does
*not* signify a null value in the database.  Rather, it signifies that
you did not ask for that field.  In general, you cannot make any
assumptions about fields that you do not ask for.
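
In client code, that distinction might look something like the Python
sketch below.  Two assumptions here: the DTO has been decoded into a
plain dict, and a database null shows up as None (how nulls are
actually encoded on the wire isn't covered in this email).  The
function name describe_field is hypothetical.

```python
def describe_field(dto, field):
    """Distinguish 'not requested' from 'null' from an actual value."""
    if field not in dto:
        # Absent key: the field was not asked for, so nothing is known.
        return "not requested"
    if dto[field] is None:
        # Assumption: None stands in for a database null.
        return "null"
    return "value"


# A project DTO retrieved with fields_wanted = {'.': ['id', 'name']}.
project = {"id": 1, "name": None}

states = {f: describe_field(project, f) for f in ["id", "name", "description"]}
```

The point is to test for the key's presence before looking at its
value, rather than using a lookup that quietly returns a default for
missing keys.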

I think that's about it as far as reading DTOs is concerned.
(Writing them is for another email :) .)

--doug

------------------------------------------
Douglas Creager <dcreager at alum.mit.edu>
Visiting Scholar, MIT Dept. of Biology
77 Mass Ave, Rm 68-371, Cambridge MA 02139
M: +1 617/501-8340
W: +1 617/452-2955