Sigfrid Lundberg's Stuff 2010-04-21


Carl Lagoze, Sandy Payette, Edwin Shin and Chris Wilper, 2005. Fedora: an architecture for complex objects and their relationships. International Journal on Digital Libraries.

This entry should have been about the paper mentioned above. It is only remotely related to it, I'm afraid.

We are about to start implementing Fedora here at the Royal Library; our journey to Stanford was about that. The decision has been discussed at length and I think that it is wise, because when you build a repository of digital objects you need a uniform environment for metadata processing.

Here I'm going to discuss XML processing, not metadata processing. Some may think Fedora isn't a tool for XML processing. However, it is often used that way.

One caveat: the thoughts here are the opinions of one who has not yet used the package. However,

  1. I have read the writings by Carl Lagoze et al. and also
  2. the information on the Fedora Commons web site.
  3. In addition I've had quite a few conversations with people involved with Fedora, including Carl himself at the turn of the century when he was very much involved in the project.

So if you are willing to consider my predilections, read on. From my point of view, using Fedora comes with some pitfalls. They are as dangerous as they are easy to circumvent.

Stale XML technology stack

In my view Fedora is basically JAXP & Web technologies built on top of standards such as WSDL, SOAP & XSD. It is not even a little anno dazumal; to me it is very much the obvious technology stack of the year 2001. Today it is REST, not SOAP, and RELAX NG, not XSD. We also now have XQuery and other technologies that had simply not been invented anno dazumal.

Since Fedora Commons 3.0 the entire Fedora API should be available through REST. XSD still sucks (read this as well), but I can live with it.
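To make the REST point concrete, here is a minimal sketch of how a client might address a datastream over plain HTTP. The path layout (`/objects/{pid}/datastreams/{dsID}/content`) is how I understand the Fedora 3.x REST API from its documentation; the base URL, the PID `demo:1` and the function name are my own illustrative assumptions, so check them against the version you actually install.

```python
from urllib.parse import quote


def datastream_content_url(base_url, pid, dsid):
    """Build the URL for the content of one datastream.

    Assumption: the path layout follows the Fedora 3.x REST API,
    /objects/{pid}/datastreams/{dsID}/content. Verify against your
    installation before relying on it.
    """
    # Keep the colon in the PID readable; escape everything else.
    safe_pid = quote(pid, safe=":")
    safe_dsid = quote(dsid, safe="")
    return "{}/objects/{}/datastreams/{}/content".format(
        base_url.rstrip("/"), safe_pid, safe_dsid
    )


# Hypothetical base URL and identifiers, for illustration only.
url = datastream_content_url("http://localhost:8080/fedora/", "demo:1", "DC")
print(url)
```

The point is only that a datastream becomes an ordinary resource you can fetch with any HTTP client, with no SOAP envelope and no WSDL-generated stubs.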

Poorly standardized container format

All objects in Fedora are ingested by formulating your data in Fedora Object XML (FOXML), which is the format used internally in Fedora. Tim Bray (editor of the spec of XML 1.0) gives the following advice to anyone designing mark-up languages:

Accept that there will be a clash between the model-centric and syntax-centric world-views, but bear in mind that successful XML-based languages support the use of multiple software implementations that cannot be expected to share a data model. (Tim Bray, 2005)

Seen from this perspective FOXML is not a good idea, being a typical object-oriented serialization used in one single implementation only, with one single data model. FOXML is poorly documented and usually used as a container for a home-brewed mixture of XML or RDF. That is, it has all the characteristics of a loser language.

Now, here we have to put this into perspective. If we look upon FOXML as just a serialized object and not as the document format that will preserve our valuable data for posterity, then FOXML is just another internal data format used inside a piece of software. Then I can live with FOXML. Indeed, people invent such XML languages all the time to get their job done, and in comparison with those FOXML isn't that bad. In an object store with multiple document types this is almost the only way to do it, unless you've got an XML database.
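Treating FOXML as a mere wrapper means the first thing you do is unwrap it. The following is a minimal sketch of that, assuming inline XML datastreams stored under `foxml:datastream/foxml:datastreamVersion/foxml:xmlContent` in the FOXML namespace. The sample object, its PID and the MODS payload are made up for illustration, and the sketch assumes that the last `datastreamVersion` in document order is the current one.

```python
import xml.etree.ElementTree as ET

FOXML_NS = {"foxml": "info:fedora/fedora-system:def/foxml#"}

# A made-up, heavily trimmed FOXML object wrapping a MODS record.
SAMPLE = """<foxml:digitalObject
    xmlns:foxml="info:fedora/fedora-system:def/foxml#" PID="demo:1">
  <foxml:datastream ID="MODS" STATE="A" CONTROL_GROUP="X">
    <foxml:datastreamVersion ID="MODS.0" MIMETYPE="text/xml">
      <foxml:xmlContent>
        <mods xmlns="http://www.loc.gov/mods/v3">
          <titleInfo><title>An example title</title></titleInfo>
        </mods>
      </foxml:xmlContent>
    </foxml:datastreamVersion>
  </foxml:datastream>
</foxml:digitalObject>"""


def extract_payload(foxml_text, dsid):
    """Return the root element of the XML wrapped in datastream `dsid`,
    or None if there is no such inline XML datastream."""
    root = ET.fromstring(foxml_text)
    for ds in root.findall("foxml:datastream", FOXML_NS):
        if ds.get("ID") != dsid:
            continue
        versions = ds.findall("foxml:datastreamVersion", FOXML_NS)
        if not versions:
            return None
        # Assumption: the last version in document order is the current one.
        content = versions[-1].find("foxml:xmlContent", FOXML_NS)
        if content is None or len(content) == 0:
            return None
        return content[0]
    return None


payload = extract_payload(SAMPLE, "MODS")
print(payload.tag)
```

What comes out is an ordinary MODS element that any MODS-aware tool can handle, which is the sense in which the wrapper is an internal format and the payload is the document you want to preserve.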

Performance problems

The FOXML is usually indexed using an RDF triple store. However, the indexing process is computationally demanding and many implementors switch off this feature. A popular complement (or even replacement) is to index the data using Solr, which is a high-performance search and retrieval engine based on Apache Lucene.

I can live with the indexing in Lucene or Solr. They are good.
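For illustration, here is a minimal sketch of what feeding Solr amounts to: you flatten the metadata into named fields and post an `<add><doc>` message in Solr's XML update format. The field names (`id`, `title`) and the values are hypothetical; the real ones depend on your Solr schema, and actually sending the message to Solr's update handler is left out.

```python
import xml.etree.ElementTree as ET


def solr_add_message(fields):
    """Serialize (name, value) pairs as a Solr XML <add> message
    containing a single <doc>."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields:
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = value
    return ET.tostring(add, encoding="unicode")


# Hypothetical field names; they must match your own Solr schema.
message = solr_add_message([
    ("id", "demo:1"),
    ("title", "An example title"),
])
print(message)
# <add><doc><field name="id">demo:1</field><field name="title">An example title</field></doc></add>
```

Building the message with an XML library rather than string concatenation means characters such as `&` in titles are escaped for you.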

In conclusion

I haven't even mentioned all the object oriented bindings. However, bindings can never be better than what they bind to. And that is FOXML and it sucks. It is just a poor quality language in comparison with TEI, DIDL, METS, MODS and ATOM. It is good enough for a lot of purposes, though. Such as presenting the content of a METS file. You can do wonders with Fedora. But remember FOXML can never replace METS even if the Fedora people are better at UML. The METS community is better at XML.

This entry is part of my series Readings on digital objects



My name is Sigfrid Lundberg. The stuff I publish here may, or may not, be of interest for anyone else.

On this site there is material on photography, music, literature and other stuff I enjoy in life. However, most of it is related to my profession as an Internet programmer and software developer within the area of digital libraries at the Royal Library, Copenhagen (Denmark) and, before that, Lund university (Sweden).

The content here does not reflect the views of my past or present employers.

Creative Commons License
This entry (Fedora) within Sigfrid Lundberg's Stuff, by Sigfrid Lundberg is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.