[ome-devel] Bio-Formats build system and repository structure [was: Re: Error building Bio-Formats develop]

Melissa Linkert melissa at glencoesoftware.com
Tue Jun 11 18:52:05 BST 2013


Hi Curtis,

> That said, I think Bio-Formats would greatly benefit from substantial
> modularization of components. We are realizing this with SCIFIO, and I
> think it applies to the OME-XML component as well.
> 
> Below, I will lay out what I think is a better structure for the build
> system, which would result in more advantages and less pain than with the
> current structure.
...
> MetadataStore, MetadataRetrieve, etc., would move to the ome-xml component,
> keeping all generated code together.
> 
> One Git repository for each of:
> 
> - SCIFIO (https://github.com/scifio/scifio)
> - OME-XML (https://github.com/openmicroscopy/ome-xml)
> - Bio-Formats (https://github.com/openmicroscopy/bioformats)
> - Fork: Apache POI (https://github.com/openmicroscopy/ome-poi)
>  -- change package prefix to avoid third party code collisions
> - Fork: MDB Tools Java (https://github.com/openmicroscopy/ome-mdb-tools)
>  -- change package prefix to avoid third party code collisions
> - Fork: JAI Image I/O (https://github.com/scifio/scifio-jai-image-io)
>  -- change package prefix to avoid third party code collisions
> - Stub: LWF (https://github.com/scifio/lwf-stubs)

What you and Mark do for your work on SCIFIO is up to you, but I would
be extremely hesitant to do something like this for Bio-Formats itself.
Spreading one codebase across 7 different repositories is at best
invasive, and would have a substantial impact upon anyone who routinely
works on Bio-Formats.

> In other words, OME-XML gets its own Git repository, which includes all the
> generated code. Each fork and stub also has its own repository in the
> relevant namespace.

Note that we've previously (and relatively recently) put quite a bit of
effort into getting the OME-XML code and specification into bioformats.git.

> Dependencies between repositories would be done by release version
> coupling. For Maven projects (i.e., SCIFIO), simply making releases and
> using release dependencies would be sufficient to facilitate repeatable
> builds. For Ant-based projects (i.e., stuff in openmicroscopy namespace),
> release JARs would continue to be committed to the repository as they are
> now, or they could be resolved remotely via Ivy or similar.

I don't agree that this makes things easier.  Unless I misunderstand,
following this proposal would mean that one simple change in the OME-XML
schema (e.g. changing a single field type) would require:

  - a pull request into whichever repository houses the specification
    (currently bioformats.git)
  - creation of "release" artifacts from whichever repository houses the
    specification
  - a pull request into ome-xml.git (to update the autogenerated code)
  - creation of release JARs from ome-xml.git
  - a pull request into scifio.git (to update SCIFIO readers)
  - creation of release JARs from scifio.git
  - a pull request into bioformats.git (to update Bio-Formats readers)

...instead of what we have now, which is a single pull request into
bioformats.git.
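
To make the comparison concrete, here is a minimal sketch that derives the
steps above from a dependency chain.  The repository names and edges are
read off the list above and are illustrative assumptions only; the helper
steps_for_schema_change is hypothetical and not part of any existing tool.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency edges, read off the steps listed above:
# each repository maps to the repositories whose releases it consumes.
DEPENDS_ON = {
    "specification": set(),
    "ome-xml": {"specification"},
    "scifio": {"ome-xml"},
    "bioformats": {"scifio"},
}

def steps_for_schema_change(depends_on):
    """Return the ordered pull-request/release steps for one schema change."""
    # static_order() yields each repository after everything it depends on.
    order = list(TopologicalSorter(depends_on).static_order())
    steps = []
    for i, repo in enumerate(order):
        steps.append(f"pull request into {repo}")
        # In this linear chain, every repository except the last must
        # publish a release before the next one can depend on it.
        if i < len(order) - 1:
            steps.append(f"release from {repo}")
    return steps

split_steps = steps_for_schema_change(DEPENDS_ON)
single_repo_steps = steps_for_schema_change({"bioformats": set()})
```

With the assumed chain, split_steps has seven entries (four pull requests
and three releases), while single_repo_steps has one.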

> This would make building Bio-Formats much simpler and faster. As Roger
> pointed out, we do not really need to code generate the OME-XML stuff on
> every build, but rather only when the schema changes. Of course, the
> OME-XML component contains other code which would be subject to change
> between schema releases, but that's fine.

I understand the desire for smarter autogeneration, but I think that
would be much better accomplished within our existing build systems,
rather than fragmenting the codebase.
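
One way this could look inside an existing build is to skip the generator
whenever its inputs are unchanged.  The following is a sketch under stated
assumptions: the function names and the stamp file are hypothetical, and
this is not how xsd-fu or the current Ant/Maven builds behave.

```python
import hashlib
from pathlib import Path

def schema_digest(input_paths):
    """Hash the names and contents of all generator inputs in a stable order."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in input_paths):
        digest.update(path.name.encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()

def needs_regeneration(input_paths, stamp_file):
    """True if the inputs changed since the digest in stamp_file was written."""
    stamp = Path(stamp_file)
    if not stamp.exists():
        # No record of a previous run, so generate.
        return True
    return stamp.read_text().strip() != schema_digest(input_paths)

def record_generation(input_paths, stamp_file):
    """Store the current digest after a successful code generation run."""
    Path(stamp_file).write_text(schema_digest(input_paths))
```

A build step would call needs_regeneration() before invoking the generator
and record_generation() afterwards.  Since changes to xsd-fu or its
templates also change the generated code, those files would need to be
passed as inputs alongside the schema files.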

> This more modular structure would also facilitate these components being
> developed on separate release cycles. The forks and stubs rarely change and
> do not need to be released with every OME release. And the OME-XML project
> could be released alongside schema changes (i.e., twice a year) rather
> than with every OME release.

Our policy so far has been to release everything at once; I think it
would make more sense to agree first whether that should be changed, and
then consider solutions.  I personally do favor having everything be
released at once, as it makes keeping track of version numbers (mentally
and when supporting users) much easier.

Again, what you do with respect to http://github.com/scifio/scifio is up
to you.  Doing this for Bio-Formats itself would have a non-trivial impact
on every single OME team member and a large portion of the developer community,
and as such I think it would be better to consider other options for
making autogeneration easier.

Regards,
-Melissa

On Mon, Jun 10, 2013 at 10:50:25AM -0500, Curtis Rueden wrote:
> Hi Roger & everyone,
> 
> Sorry for the delay in reply. After spending the last couple of weeks on
> ImageJ build issues related to native code components (specifically, the
> ImageJ launcher in C), I have some new perspective on the new code
> generation of the Bio-Formats build system.
> 
> First of all, I want to say thanks to Roger for solving the build for both
> Ant and Maven. I know maintaining the dual build systems can be substantial
> extra work. But I think the Maven system has many advantages, so I am happy
> it is being maintained.
> 
> That said, I think Bio-Formats would greatly benefit from substantial
> modularization of components. We are realizing this with SCIFIO, and I
> think it applies to the OME-XML component as well.
> 
> Below, I will lay out what I think is a better structure for the build
> system, which would result in more advantages and less pain than with the
> current structure.
> 
> > One thing which might be an issue is that while xsd-fu generates the
> > ome-xml model code, which could potentially be downloaded, it also
> > generates all the MetadataStore, MetadataRetrieve and all the other
> > Metadata-related classes in scifio, including OMEXMLMetadataImpl.
> > Given that these are paired with the generated model code, generating
> > one and downloading the other may result in breakage on model changes,
> > or changes in xsd-fu or the templates which change the generated code.
> 
> MetadataStore, MetadataRetrieve, etc., would move to the ome-xml component,
> keeping all generated code together.
> 
> One Git repository for each of:
> 
> - SCIFIO (https://github.com/scifio/scifio)
> - OME-XML (https://github.com/openmicroscopy/ome-xml)
> - Bio-Formats (https://github.com/openmicroscopy/bioformats)
> - Fork: Apache POI (https://github.com/openmicroscopy/ome-poi)
>  -- change package prefix to avoid third party code collisions
> - Fork: MDB Tools Java (https://github.com/openmicroscopy/ome-mdb-tools)
>  -- change package prefix to avoid third party code collisions
> - Fork: JAI Image I/O (https://github.com/scifio/scifio-jai-image-io)
>  -- change package prefix to avoid third party code collisions
> - Stub: LWF (https://github.com/scifio/lwf-stubs)
> 
> In other words, OME-XML gets its own Git repository, which includes all the
> generated code. Each fork and stub also has its own repository in the
> relevant namespace.
> 
> Dependencies between repositories would be done by release version
> coupling. For Maven projects (i.e., SCIFIO), simply making releases and
> using release dependencies would be sufficient to facilitate repeatable
> builds. For Ant-based projects (i.e., stuff in openmicroscopy namespace),
> release JARs would continue to be committed to the repository as they are
> now, or they could be resolved remotely via Ivy or similar.
> 
> This would make building Bio-Formats much simpler and faster. As Roger
> pointed out, we do not really need to code generate the OME-XML stuff on
> every build, but rather only when the schema changes. Of course, the
> OME-XML component contains other code which would be subject to change
> between schema releases, but that's fine.
> 
> This more modular structure would also facilitate these components being
> developed on separate release cycles. The forks and stubs rarely change and
> do not need to be released with every OME release. And the OME-XML project
> could be released alongside schema changes (i.e., twice a year) rather
> than with every OME release.
> 
> Comments welcome.
> 
> Regards,
> Curtis
> 
> 
> On Thu, May 2, 2013 at 12:25 PM, Roger Leigh <r.leigh at dundee.ac.uk> wrote:
> 
> > On 02/05/2013 16:52, Curtis Rueden wrote:
> >
> >> > If so, the build is completely identical--the sources which get
> >> > generated on the fly from the templates by xsd-fu are identical bar a
> >> > few comment lines in the top boilerplate.
> >>
> >> OK, good to know.
> >>
> >> One more question/concern: presumably, the Bio-Formats build no longer
> >> functions on Windows, due to the Python + Genshi dependency. With the
> >> Ant build, this might be non-trivial to solve. But solving the issue
> >> with Maven is very straightforward: include the "ome-xml" module in the
> >> reactor only within a profile. Then, when that profile is not enabled,
> >> Maven will resolve the ome-xml dependency from the remote repository
> >> rather than regenerating and rebuilding the code. This would eliminate
> >> the need to install Genshi, and make it easier to build on Windows
> >> again. What do you think?
> >>
> >
> > I'm afraid I'm no authority on Maven, so I'm not sure.  Maybe Melissa or
> > Josh have a better take on this than me.  I assume that this will work
> > correctly on Windows if python is installed?
> >
> > One thing which might be an issue is that while xsd-fu generates the
> > ome-xml model code, which could potentially be downloaded, it also
> > generates all the MetadataStore, MetadataRetrieve and all the other
> > Metadata-related classes in scifio, including OMEXMLMetadataImpl.  Given
> > that these are paired with the generated model code, generating one and
> > downloading the other may result in breakage on model changes,
> > or changes in xsd-fu or the templates which change the generated code.
> >
> > While it's not all enabled yet, I'd like to have the model selectable as
> > an Ant property (it's xsdfu.schemaver), so that it's possible to change
> > to a different model when building.  There are currently some hardcoded
> > "2012-06" versions which need to be switched to use the property value.
> >
> >
> > Regards,
> > Roger
> >
> > --
> > Dr Roger Leigh -- Open Microscopy Environment
> > Wellcome Trust Centre for Gene Regulation and Expression,
> > College of Life Sciences, University of Dundee, Dow Street,
> > Dundee DD1 5EH Scotland UK   Tel: (01382) 386364
> >
> >
> > The University of Dundee is a registered Scottish Charity, No: SC015096
> >
