[ome-devel] Bio-Formats build system and repository structure [was: Re: Error building Bio-Formats develop]

Curtis Rueden ctrueden at wisc.edu
Wed Jul 10 19:42:46 BST 2013


Hi Melissa,

> I have opened a story to investigate general build system and
> versioning changes and potentially adding e.g. mdbtools.git to
> github.com/openmicroscopy:
>
> http://trac.openmicroscopy.org.uk/ome/ticket/11228

Thank you for creating these tickets! I have written a few comments. For
anyone else interested: any further discussion on these issues can be found
by following that link.

Regards,
Curtis


On Thu, Jul 4, 2013 at 12:06 PM, Melissa Linkert <
melissa at glencoesoftware.com> wrote:

> Hi Curtis,
>
> > I do not see it as one codebase, but rather several projects which all
> > currently happen to be lumped into one repository with a single release
> > schedule imposed upon them. Right now, all of these project releases are
> > dictated by the OMERO release schedule [3]. The OME team is ramping up for
> > the OMERO 5 release, and presumably a Bio-Formats 5 release will go along
> > with that. But what is really so radically new in Bio-Formats 5? Nothing, I
> > would argue. Now, e.g., an extensible SCIFIO-based Bio-Formats would
> > justify a 5.0.0 release. But instead, I am sad to see Bio-Formats versions
> > that have little semantic meaning [4] with respect to Bio-Formats itself,
> > all because of the simultaneous top-down release schedule driven by the
> > OMERO project.
> >
> > Instead, my proposal emphasizes the individual projects as useful to the
> > community and the world in their own right. For example, MDB Tools Java is
> > not available anywhere else to my knowledge [5], since we rescued it from
> > an obscure forum post. Wouldn't it be a wonderful service to the community
> > to post it to its own Git repository on GitHub, so that it is easy to find,
> > and the greater community (beyond just those interested in OME) might
> > contribute back?
> >
> > I understand I am proposing a substantial change. It is certainly about
> > more than just improving the code generation process. That's why I changed
> > the subject. And I understand the reluctance to change anything that might
> > disrupt current development processes. But in this case, I think the change
> > would be well worth it to make the OME projects into even better members of
> > the global FOSS community.
>
> I can see the value in doing this for some of the components (such as
> components/forks/*), but doing this for every single component would have
> a negative impact on the workflow of the OME team without a clear benefit.
>
> I have opened a story to investigate general build system and versioning
> changes and potentially adding e.g. mdbtools.git to
> github.com/openmicroscopy:
>
> http://trac.openmicroscopy.org.uk/ome/ticket/11228
>
> All of the tasks under that story are fairly significant undertakings
> though, and as such I have not yet assigned a milestone.
>
> Regards,
> -Melissa
>
> On Fri, Jun 14, 2013 at 04:52:08PM -0500, Curtis Rueden wrote:
> > Hi Melissa & everyone,
> >
> > Thank you for replying to my suggestion. I appreciate the discussion.
> >
> > > Note that we've previously (and relatively recently) put quite a bit
> > > of effort into getting the OME-XML code and specification into
> > > bioformats.git.
> >
> > Yes, you might recall that I am the one who did most of that work. ;-) [1]
> >
> > And I would be willing to do it again, if it meant a cleaner build system.
> > (Though splitting subtrees is much less complicated.)
> >
> > > If we were to follow this proposal, and unless I misunderstand, then
> > > making one simple change in the OME-XML schema (e.g. changing a single
> > > field type) would require:
> >
> > Well, I think specification and ome-xml should live in the same repository.
> >
> > So it would require:
> > - filing a PR into the specification repo (ome-xml.git in my proposal)
> > - cutting a new ome-xml release artifact
> > - filing a PR into bioformats.git to update the version of ome-xml
> >
> > So, two PRs instead of one. (As Mark said, scifio.git doesn't depend on
> > ome-xml.) But the schema and ome-xml code don't change nearly as often as
> > Bio-Formats does, of course.
> >
> > Still, I can appreciate how two PRs might be undesirable (believe me; you
> > know how much I hate the develop/dev_4_4 split :-). To avoid that, you
> > could keep specification & ome-xml & bio-formats in the same git repository
> > as they are now, while still using separate versioning per component. This
> > would provide one target for PRs, while making it much simpler for most
> > people to build Bio-Formats thanks to remote release dependency resolution
> > of Maven/Ivy/etc.
> >
> > > Our policy so far has been to release everything at once; I think it
> > > would make more sense to agree first whether that should be changed,
> > > and then consider solutions.  I personally do favor having everything
> > > be released at once, as it makes keeping track of version numbers
> > > (mentally and when supporting users) much easier.
> >
> > Indeed, my proposal hinges on the idea that everything should actually not
> > be released at once -- that many of these are separate projects, which are
> > A) useful in their own right, outside an OMERO-specific context; and B)
> > developed at a much different pace than one another, resulting in
> > gratuitous/unnecessary/confusing releases when all continue to be versioned
> > together. [2]
> >
> > > I understand the desire for smarter autogeneration, but I think that
> > > would be much better accomplished within our existing build systems,
> > > rather than fragmenting the codebase.
> >
> > I do not see it as one codebase, but rather several projects which all
> > currently happen to be lumped into one repository with a single release
> > schedule imposed upon them. Right now, all of these project releases are
> > dictated by the OMERO release schedule [3]. The OME team is ramping up for
> > the OMERO 5 release, and presumably a Bio-Formats 5 release will go along
> > with that. But what is really so radically new in Bio-Formats 5? Nothing, I
> > would argue. Now, e.g., an extensible SCIFIO-based Bio-Formats would
> > justify a 5.0.0 release. But instead, I am sad to see Bio-Formats versions
> > that have little semantic meaning [4] with respect to Bio-Formats itself,
> > all because of the simultaneous top-down release schedule driven by the
> > OMERO project.
> >
> > Instead, my proposal emphasizes the individual projects as useful to the
> > community and the world in their own right. For example, MDB Tools Java is
> > not available anywhere else to my knowledge [5], since we rescued it from
> > an obscure forum post. Wouldn't it be a wonderful service to the community
> > to post it to its own Git repository on GitHub, so that it is easy to find,
> > and the greater community (beyond just those interested in OME) might
> > contribute back?
> >
> > I understand I am proposing a substantial change. It is certainly about
> > more than just improving the code generation process. That's why I changed
> > the subject. And I understand the reluctance to change anything that might
> > disrupt current development processes. But in this case, I think the change
> > would be well worth it to make the OME projects into even better members of
> > the global FOSS community.
> >
> > Regards,
> > Curtis
> >
> > [1] http://trac.openmicroscopy.org.uk/ome/ticket/10435#comment:8
> >
> > [2] One quick example of how the gratuitous releases cause problems:
> > whenever I update Bio-Formats using the ImageJ updater, it notices that
> > several of the JARs actually have no functional changes, and does not alter
> > the version number of the JAR files. Yes, we could potentially fix this in
> > the updater, but wouldn't it make more sense to simply not make vacuous
> > releases in the first place?
> >
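Footnote [2] mentions JARs that are re-released with no functional changes.
One way a tool could detect that is to compare archive contents instead of
file names or timestamps. The sketch below is only an illustration and not a
description of how the ImageJ updater actually works: content_fingerprint and
make_jar are hypothetical helper names, and any entry names or dates passed to
them are made up for demonstration.

```python
import hashlib
import io
import zipfile


def content_fingerprint(jar_bytes: bytes) -> str:
    """Hash entry names and entry data, ignoring timestamps and entry order."""
    digest = hashlib.sha256()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in sorted(jar.namelist()):
            if name.endswith("/"):
                continue  # directory entries carry no content
            digest.update(name.encode("utf-8"))
            digest.update(b"\0")
            digest.update(hashlib.sha256(jar.read(name)).digest())
    return digest.hexdigest()


def make_jar(entries: dict, date_time: tuple) -> bytes:
    """Build an in-memory archive (hypothetical helper, for demonstration)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for name, data in entries.items():
            jar.writestr(zipfile.ZipInfo(name, date_time=date_time), data)
    return buf.getvalue()
```

Two archives built from the same entries on different dates differ byte for
byte but share a fingerprint. A real JAR often embeds a version string in its
manifest, so a practical check would probably need to treat that entry
specially.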
> > [3] Note that we are not even that consistent, since there are several
> > other projects we develop like native-lib-loader which are *not*
> > synchronized with the OMERO/Bio-Formats version. This works just fine,
> > since we treat those projects as external dependencies. The same thing
> > would work fine for the components/forks, components/stubs and ome-xml.
> >
> > [4] ImageJ2 and SCIFIO are now using SemVer, which conveys useful
> > information in the version number: http://semver.org/
> >
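As a rough illustration of the "useful information in the version number"
that footnote [4] refers to, the sketch below classifies the change between
two MAJOR.MINOR.PATCH strings. It is deliberately simplified (pre-release and
build metadata, which the SemVer specification also defines, are not handled),
and parse and change_kind are hypothetical names, not part of any OME or
ImageJ tool.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


# Simplified pattern: plain MAJOR.MINOR.PATCH only.
_SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse(text: str) -> Version:
    match = _SEMVER.match(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(group) for group in match.groups()))


def change_kind(old: str, new: str) -> str:
    """Return 'major', 'minor' or 'patch' for an upgrade from old to new."""
    a, b = parse(old), parse(new)
    if b <= a:
        raise ValueError("new version must be greater than old version")
    if b.major != a.major:
        return "major"
    if b.minor != a.minor:
        return "minor"
    return "patch"
```

With illustrative version strings, change_kind("4.4.8", "5.0.0") reports a
major change, which under SemVer signals a backwards-incompatible API change
rather than a release made only to keep projects in step.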
> > [5] Well, potentially this: http://jackcess.sourceforge.net/
> >
> >
> > On Tue, Jun 11, 2013 at 12:52 PM, Melissa Linkert <
> > melissa at glencoesoftware.com> wrote:
> >
> > > Hi Curtis,
> > >
> > > > That said, I think Bio-Formats would greatly benefit from substantial
> > > > modularization of components. We are realizing this with SCIFIO, and I
> > > > think it applies to the OME-XML component as well.
> > > >
> > > > Below, I will lay out what I think is a better structure for the build
> > > > system, which would result in more advantages and less pain than with the
> > > > current structure.
> > > ...
> > > > MetadataStore, MetadataRetrieve, etc., would move to the ome-xml
> > > > component, keeping all generated code together.
> > > >
> > > > One Git repository for each of:
> > > >
> > > > - SCIFIO (https://github.com/scifio/scifio)
> > > > - OME-XML (https://github.com/openmicroscopy/ome-xml)
> > > > - Bio-Formats (https://github.com/openmicroscopy/bioformats)
> > > > - Fork: Apache POI (https://github.com/openmicroscopy/ome-poi)
> > > >  -- change package prefix to avoid third party code collisions
> > > > - Fork: MDB Tools Java (https://github.com/openmicroscopy/ome-mdb-tools)
> > > >  -- change package prefix to avoid third party code collisions
> > > > - Fork: JAI Image I/O (https://github.com/scifio/scifio-jai-image-io)
> > > >  -- change package prefix to avoid third party code collisions
> > > > - Stub: LWF (https://github.com/scifio/lwf-stubs)
> > >
> > > What you and Mark do for your work on SCIFIO is up to you, but I would
> > > be extremely hesitant to do something like this for Bio-Formats itself.
> > > Spreading one codebase across 7 different repositories is at best
> > > invasive, and would have a substantial impact upon anyone who routinely
> > > works on Bio-Formats.
> > >
> > > > In other words, OME-XML gets its own Git repository, which includes all
> > > > the generated code. Each fork and stub also has its own repository in
> > > > the relevant namespace.
> > >
> > > Note that we've previously (and relatively recently) put quite a bit of
> > > effort into getting the OME-XML code and specification into bioformats.git.
> > >
> > > > Dependencies between repositories would be done by release version
> > > > coupling. For Maven projects (i.e., SCIFIO), simply making releases and
> > > > using release dependencies would be sufficient to facilitate repeatable
> > > > builds. For Ant-based projects (i.e., stuff in openmicroscopy namespace),
> > > > release JARs would continue to be committed to the repository as they are
> > > > now, or they could be resolved remotely via Ivy or similar.
> > >
> > > I don't agree that doing that makes things easier.  If we were to follow
> > > this proposal, and unless I misunderstand, then making one simple change in
> > > the OME-XML schema (e.g. changing a single field type) would require:
> > >
> > >   - a pull request into whichever repository houses the specification
> > >     (currently bioformats.git)
> > >   - creation of "release" artifacts from whichever repository houses the
> > >     specification
> > >   - a pull request into ome-xml.git (to update the autogenerated code)
> > >   - creation of release JARs from ome-xml.git
> > >   - a pull request into scifio.git (to update SCIFIO readers)
> > >   - creation of release JARs from scifio.git
> > >   - a pull request into bioformats.git (to update Bio-Formats readers)
> > >
> > > ...instead of what we have now, which is a single pull request into
> > > bioformats.git.
> > >
> > > > This would make building Bio-Formats much simpler and faster. As Roger
> > > > pointed out, we do not really need to generate the OME-XML code on
> > > > every build, but rather only when the schema changes. Of course, the
> > > > OME-XML component contains other code which would be subject to change
> > > > between schema releases, but that's fine.
> > >
> > > I understand the desire for smarter autogeneration, but I think that
> > > would be much better accomplished within our existing build systems,
> > > rather than fragmenting the codebase.
> > >
> > > > This more modular structure would also facilitate these components being
> > > > developed on separate release cycles. The forks and stubs rarely change
> > > > and do not need to be released with every OME release. And the OME-XML
> > > > project could be released alongside schema changes (i.e., twice a year)
> > > > rather than with every OME release.
> > >
> > > Our policy so far has been to release everything at once; I think it
> > > would make more sense to agree first whether that should be changed, and
> > > then consider solutions.  I personally do favor having everything be
> > > released at once, as it makes keeping track of version numbers (mentally
> > > and when supporting users) much easier.
> > >
> > > Again, what you do with respect to http://github.com/scifio/scifio is up
> > > to you.  Doing this for Bio-Formats itself would have a non-trivial impact
> > > on every single OME team member and a large portion of the developer
> > > community, and as such I think it would be better to consider other
> > > options for making autogeneration easier.
> > >
> > > Regards,
> > > -Melissa
> > >
> > > On Mon, Jun 10, 2013 at 10:50:25AM -0500, Curtis Rueden wrote:
> > > > Hi Roger & everyone,
> > > >
> > > > Sorry for the delay in reply. After spending the last couple of weeks on
> > > > ImageJ build issues related to native code components (specifically, the
> > > > ImageJ launcher in C), I have some new perspective on the new code
> > > > generation of the Bio-Formats build system.
> > > >
> > > > First of all, I want to say thanks to Roger for solving the build for
> > > > both Ant and Maven. I know maintaining the dual build systems can be
> > > > substantial extra work. But I think the Maven system has many advantages,
> > > > so I am happy it is being maintained.
> > > >
> > > > That said, I think Bio-Formats would greatly benefit from substantial
> > > > modularization of components. We are realizing this with SCIFIO, and I
> > > > think it applies to the OME-XML component as well.
> > > >
> > > > Below, I will lay out what I think is a better structure for the build
> > > > system, which would result in more advantages and less pain than with the
> > > > current structure.
> > > >
> > > > > One thing which might be an issue is that while xsd-fu generates the
> > > > > ome-xml model code, which could potentially be downloaded, it also
> > > > > generates all the MetadataStore, MetadataRetrieve and all the other
> > > > > Metadata-related classes in scifio, including OMEXMLMetadataImpl.
> > > > > Given that these are paired with the generated model code, generating
> > > > > one and downloading the other may result in breakage on model changes,
> > > > > or changes in xsd-fu or the templates which change the generated code.
> > > >
> > > > MetadataStore, MetadataRetrieve, etc., would move to the ome-xml
> > > > component, keeping all generated code together.
> > > >
> > > > One Git repository for each of:
> > > >
> > > > - SCIFIO (https://github.com/scifio/scifio)
> > > > - OME-XML (https://github.com/openmicroscopy/ome-xml)
> > > > - Bio-Formats (https://github.com/openmicroscopy/bioformats)
> > > > - Fork: Apache POI (https://github.com/openmicroscopy/ome-poi)
> > > >  -- change package prefix to avoid third party code collisions
> > > > - Fork: MDB Tools Java (https://github.com/openmicroscopy/ome-mdb-tools)
> > > >  -- change package prefix to avoid third party code collisions
> > > > - Fork: JAI Image I/O (https://github.com/scifio/scifio-jai-image-io)
> > > >  -- change package prefix to avoid third party code collisions
> > > > - Stub: LWF (https://github.com/scifio/lwf-stubs)
> > > >
> > > > In other words, OME-XML gets its own Git repository, which includes all
> > > > the generated code. Each fork and stub also has its own repository in
> > > > the relevant namespace.
> > > >
> > > > Dependencies between repositories would be done by release version
> > > > coupling. For Maven projects (i.e., SCIFIO), simply making releases and
> > > > using release dependencies would be sufficient to facilitate repeatable
> > > > builds. For Ant-based projects (i.e., stuff in openmicroscopy namespace),
> > > > release JARs would continue to be committed to the repository as they are
> > > > now, or they could be resolved remotely via Ivy or similar.
> > > >
> > > > This would make building Bio-Formats much simpler and faster. As Roger
> > > > pointed out, we do not really need to generate the OME-XML code on
> > > > every build, but rather only when the schema changes. Of course, the
> > > > OME-XML component contains other code which would be subject to change
> > > > between schema releases, but that's fine.
> > > >
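The idea of regenerating code only when the schema changes can be sketched
with a checksum stamp. This is a minimal sketch under stated assumptions, not
the existing Ant or Maven logic: the function names and the stamp file are
hypothetical, and the actual xsd-fu invocation is left out entirely.

```python
import hashlib
from pathlib import Path


def schema_digest(schema_files) -> str:
    """Combine the names and contents of all schema files into one digest."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in schema_files):
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_regeneration(schema_files, stamp_file) -> bool:
    """True when no stamp exists or the schema differs from the stamped one."""
    stamp = Path(stamp_file)
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != schema_digest(schema_files)


def record_generation(schema_files, stamp_file) -> None:
    """Call after a successful generation run to remember the schema state."""
    Path(stamp_file).write_text(schema_digest(schema_files))
```

A build would call needs_regeneration before running the generator and
record_generation afterwards. Changes to the generator or its templates would
also have to feed into the digest, since they alter the generated code too.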
> > > > This more modular structure would also facilitate these components being
> > > > developed on separate release cycles. The forks and stubs rarely change
> > > > and do not need to be released with every OME release. And the OME-XML
> > > > project could be released alongside schema changes (i.e., twice a year)
> > > > rather than with every OME release.
> > > >
> > > > Comments welcome.
> > > >
> > > > Regards,
> > > > Curtis
> > > >
> > > >
> > > > On Thu, May 2, 2013 at 12:25 PM, Roger Leigh <r.leigh at dundee.ac.uk> wrote:
> > > >
> > > > > On 02/05/2013 16:52, Curtis Rueden wrote:
> > > > >
> > > > > >   > If so, the build is completely identical--the sources which get
> > > > > >>  > generated on the fly from the templates by xsd-fu are identical bar
> > > > > >>  > a few lines of comments in the top boilerplate.
> > > > >>
> > > > >> OK, good to know.
> > > > >>
> > > > > >> One more question/concern: presumably, the Bio-Formats build no longer
> > > > > >> functions on Windows, due to the Python + Genshi dependency. With the
> > > > > >> Ant build, this might be non-trivial to solve. But solving the issue
> > > > > >> with Maven is very straightforward: include the "ome-xml" module in the
> > > > > >> reactor only within a profile. Then, when that profile is not enabled,
> > > > > >> Maven will resolve the ome-xml dependency from the remote repository
> > > > > >> rather than regenerating and rebuilding the code. This would eliminate
> > > > > >> the need to install Genshi, and make it easier to build on Windows
> > > > > >> again. What do you think?
> > > > >>
> > > > >
> > > > > > I'm afraid I'm no authority on Maven, so I'm not sure.  Maybe Melissa or
> > > > > > Josh have a better take on this than me.  I assume that this will work
> > > > > > correctly on Windows if Python is installed?
> > > > >
> > > > > > One thing which might be an issue is that while xsd-fu generates the
> > > > > > ome-xml model code, which could potentially be downloaded, it also
> > > > > > generates all the MetadataStore, MetadataRetrieve and all the other
> > > > > > Metadata-related classes in scifio, including OMEXMLMetadataImpl.
> > > > > > Given that these are paired with the generated model code, generating
> > > > > > one and downloading the other may result in breakage on model changes,
> > > > > > or changes in xsd-fu or the templates which change the generated code.
> > > > >
> > > > > > While it's not all enabled yet, I'd like to have the model selectable
> > > > > > as an Ant property (it's xsdfu.schemaver), so that it's possible to
> > > > > > change to a different model when building.  There are currently some
> > > > > > hardcoded "2012-06" versions which need to be switched to use the
> > > > > > property value.
> > > > >
> > > > >
> > > > > Regards,
> > > > > Roger
> > > > >
> > > > > --
> > > > > Dr Roger Leigh -- Open Microscopy Environment
> > > > > Wellcome Trust Centre for Gene Regulation and Expression,
> > > > > College of Life Sciences, University of Dundee, Dow Street,
> > > > > Dundee DD1 5EH Scotland UK   Tel: (01382) 386364
> > > > >
> > > > >
> > > > > The University of Dundee is a registered Scottish Charity, No:
> SC015096
> > > > >
> > > > > > _______________________________________________
> > > > > > ome-devel mailing list
> > > > > > ome-devel at lists.openmicroscopy.org.uk
> > > > > > http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel
> > > > >
> > >
>