Tuesday, 10 June 2008

To Be(*) Or Not To Be(*) That Is The Question

(*) Included in an API bundle.

There's been a lively debate on the OSGi mailing list over the past couple of weeks about whether an API should be packaged with its implementation in one bundle or split into two bundles (one for the API and one for the implementation).

I think it's fair to say there is a range of opinions on the subject. What is clear, however, is that there is no one-size-fits-all answer to this question. As a developer or architect you need to consider the use case your API and implementation are going to be put to in order to make the "correct" decision.

The fundamental point that everyone involved in the discussion agreed on is that using a bundle should be as simple as possible. The difference of opinion comes from which measure of simplicity you take to be most important. I think there are three main themes that have been discussed so far:
  • installation simplicity - minimizing the number of bundles needed to run an OSGi application
  • runtime simplicity - minimizing the connectivity between bundles in a running OSGi application
  • architectural simplicity - minimizing duplicated packages between bundles used to build an OSGi application
These three goals are often (but not always) in conflict, and which is the most important depends on your use case. In order to understand where the conflict arises, it is important to understand how OSGi classloading works, especially with regard to installation and resolution of bundles.

I'll try to give a quick overview.

A bundle can contain classes (like any standard Java jar). However, it can also import classes from other bundles via a number of different import mechanisms. Importing classes allows bundles to be modularized such that different bundles provide different sets of functionality but still declare the dependencies they need in order to run. In order to use classes from a bundle, all of its import requirements must be satisfied. In order to satisfy an import requirement, a bundle exporting the required package must be installed in the OSGi runtime.

So if BundleA depends on classes exported from BundleB you have to install both BundleA and BundleB in order to use BundleA. If BundleB depends on classes from other bundles you need to install those bundles too. You can probably see that this can rapidly grow from a single bundle install into a large, complicated set of bundle installs.
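To make the knock-on effect concrete, here is a minimal sketch in plain Java of working out everything you would have to install to use one bundle. It is only a toy model, not the OSGi API: the class name InstallClosure, the method requiredBundles and the bundle names in main are all made up for illustration, and a real framework resolves package imports rather than bundle names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model (assumption): each bundle name maps to the bundles it imports from.
final class InstallClosure {

    // Returns the root bundle plus every bundle it needs, directly or indirectly.
    static Set<String> requiredBundles(String root, Map<String, List<String>> dependsOn) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            String bundle = todo.pop();
            // Set.add returns false for bundles already visited, which also guards against cycles.
            if (seen.add(bundle)) {
                for (String dep : dependsOn.getOrDefault(bundle, List.of())) {
                    todo.push(dep);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("BundleA", List.of("BundleB"));
        deps.put("BundleB", List.of("BundleC", "BundleD"));
        // Asking for BundleA alone pulls in four bundles in this example.
        System.out.println(requiredBundles("BundleA", deps));
    }
}
```

Every extra import in an API bundle is inherited by everything that uses that API, which is why the dependencies of the API matter so much in what follows.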

Another important detail to be considered is that in OSGi it is possible for many bundles to export the same API package and only one will be picked by the runtime to actually provide the classes.

When we discuss splitting API and implementation we are saying that one bundle will export the API classes and another bundle will import that API and provide an implementation for it. When we talk about grouping API and implementation we mean that the API will be exported from the same bundle that provides the implementation. Many implementations may export the same API but this is OK in an OSGi framework.
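As a rough illustration of "many exporters, one provider", the sketch below picks a single exporter for a package from several candidates. The selection rule used here (highest version wins, ties go to the lowest bundle id) is a simplification I have assumed for the example; a real framework also takes into account things like which bundles are already resolved. The names ExporterChoice, Export and choose are hypothetical, and versions are plain integers to keep it short.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Toy model (assumption): not the OSGi resolver, just one plausible selection rule.
final class ExporterChoice {

    // One bundle exporting one package at one version.
    static final class Export {
        final long bundleId;
        final String pkg;
        final int version;

        Export(long bundleId, String pkg, int version) {
            this.bundleId = bundleId;
            this.pkg = pkg;
            this.version = version;
        }
    }

    // Picks one exporter of the package: highest version first, then lowest bundle id.
    static Optional<Export> choose(String pkg, List<Export> exports) {
        Comparator<Export> byVersion = Comparator.comparingInt((Export e) -> e.version);
        Comparator<Export> lowestIdLast = Comparator.comparingLong((Export e) -> e.bundleId).reversed();
        return exports.stream()
                .filter(e -> e.pkg.equals(pkg))
                .max(byVersion.thenComparing(lowestIdLast));
    }

    public static void main(String[] args) {
        List<Export> exports = List.of(
                new Export(7, "com.example.api", 1),
                new Export(3, "com.example.api", 2),
                new Export(5, "com.example.api", 2));
        choose("com.example.api", exports)
                .ifPresent(e -> System.out.println("provider is bundle " + e.bundleId));
    }
}
```

The point is that the other exporters are not an error: they simply end up using the chosen provider's copy of the package instead of their own.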

My own advice would be to start by assuming that API and implementation are packaged in separate bundles. The reasoning behind this is based on the following criteria:
  • In general an implementation is likely to depend on more packages than its API
  • You can always collapse back to one bundle later if you use Import-Package rather than Require-Bundle, because clients then depend on the package and not on which bundle exports it
  • If you use a provisioning mechanism such as Newton or P2 (when it's ready) downloading two bundles vs one is handled automagically
The benefits of splitting API and implementation are the following:
  • If you are considering making your application distributed or want it to run in a constrained environment you can install the API without having to resolve the implementation dependencies (possibly a big deal in a client/server architecture)
  • If you want to restart or reinstall the implementation bundle, this doesn't automatically force a restart of all the client bundles that are using the API from that bundle
  • OSGi Export-Package has a uses directive, which lists the other packages that the exported API refers to (for example types from imported packages that appear in its method signatures). It is possible for combinations of exports and imports to cause bundles to be mutually exclusive - such that it is impossible to close the graph and resolve all bundles in the same OSGi runtime at the same time. Limiting the connectivity in the graph by separating API from implementation reduces the risk of running into this problem.
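The last point is the hardest to picture, so here is a heavily simplified sketch of the kind of check involved. The assumption is that a client and the bundle it gets an API from must agree on the provider of every package that API "uses"; if they are wired to different providers the two cannot be resolved together. UsesCheck, consistent and the bundle names are invented for the example, and the real resolver rules are considerably more involved.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (assumption): a wiring maps a package name to the bundle providing it.
final class UsesCheck {

    // True when importer and exporter agree on the provider of every used package
    // that both of them are wired to.
    static boolean consistent(Map<String, String> importerWiring,
                              Map<String, String> exporterWiring,
                              List<String> usedPackages) {
        for (String pkg : usedPackages) {
            String mine = importerWiring.get(pkg);
            String theirs = exporterWiring.get(pkg);
            if (mine != null && theirs != null && !mine.equals(theirs)) {
                return false; // two different copies of the same package would meet
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> client = new HashMap<>();
        client.put("com.example.util", "UtilBundle2");

        Map<String, String> apiProvider = new HashMap<>();
        apiProvider.put("com.example.util", "UtilBundle1");

        // The API exposes com.example.util types, so both sides must see the same provider.
        System.out.println(consistent(client, apiProvider, List.of("com.example.util")));
    }
}
```

A bundle that mixes API and implementation tends to have more imports, and every one of those is another wire that has to line up with everyone else's.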
If you start by assuming the API and implementation are separate then you can use the following logic to assess whether you can condense them back to one bundle for the purposes of your architecture:
  1. Start by designing your API to depend on as little as possible.
  2. Make your implementation depend on API packages and any other dependencies it needs to function.
  3. If after doing this the implementation depends only on API consider whether the implementation is ever likely to get more complicated.
  4. If it isn't then you can probably collapse back to one bundle.
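The steps above boil down to a very small test, sketched here in Java. This is just my reading of the checklist turned into code; PackagingAdvice, canCollapse and the package names are made up, and the "likely to grow" flag is a judgment call you have to supply yourself.

```java
import java.util.Set;

// Toy encoding (assumption) of the checklist: collapse only if the implementation
// imports nothing beyond the API packages and is not expected to grow.
final class PackagingAdvice {

    static boolean canCollapse(Set<String> apiPackages,
                               Set<String> implementationImports,
                               boolean implementationLikelyToGrow) {
        boolean dependsOnlyOnApi = apiPackages.containsAll(implementationImports);
        return dependsOnlyOnApi && !implementationLikelyToGrow;
    }

    public static void main(String[] args) {
        Set<String> api = Set.of("com.example.greeter");
        Set<String> simpleImpl = Set.of("com.example.greeter");
        Set<String> richImpl = Set.of("com.example.greeter", "org.example.database");

        System.out.println(canCollapse(api, simpleImpl, false));
        System.out.println(canCollapse(api, richImpl, false));
    }
}
```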
Of course this can always be done prior to building any bundles if you are good at modelling in your head or on paper etc.

Hopefully that's helped some people understand the issues.

Laters
