A quotation is much more than an extract

Sigfrid Lundberg's Stuff 2009-08-21


'We quote,' writes Isaac D'Israeli, 'to save proving what has been demonstrated, referring to where the proofs may be found.' How much wisdom is embedded in this single line! If we did not quote, we would have to explore or invent everything anew ourselves. We do need methods of referring our readers to the source. The World Wide Web provides excellent facilities for this, such as the hypertext link.

With a quote we can make our lives as writers more comfortable. We can, as D'Israeli continues, 'screen ourselves from the odium of doubtful opinions, which the world would not willingly accept from ourselves'. How wonderful! I can thus market ideas that no one would find credible when coming from me, or refer to ones that I would not myself dare to publicize. The actual stylistic use of referencing belongs to the art of writing, which is distinct from the art of hypertext.

Google Books provides writers with excellent facilities for hypertext linking. For instance, you can cut a piece of text and put it into your own page. Like this:

Quotation, like much better things, has its abuses. One may quote till one compiles.

There are dangers in quoting, as D'Israeli makes clear. When you cut and paste too much, you're no longer authoring. You're compiling.

The annotation anchor

If you've followed my links above, you've seen how one can link to a page in a text, how we can highlight an area in an image and also quote by cutting a snippet out of an image. The single points, areas or ranges in an object which can be used for referencing are, in my world, called annotation anchors.

In text we anchor a reference by using a unique sequence of tokens. In this particular text, the phrase 'unique sequence of tokens' could have served the purpose, if I hadn't repeated it here. The position of a footnote could have been persistently anchored just by that sequence. In hypertext, we typically use mark-up for the purpose. The drawback is that a user can only anchor his or her reference at the predefined positions. However, if the text is completely tokenized, each single word is available for the purpose (see below).
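The idea of anchoring on a unique token sequence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `tokenize` and `find_anchor` are hypothetical, and the tokenizer is deliberately naive (lowercased word characters only).

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def find_anchor(text, phrase):
    """Return the token offset of `phrase` if it occurs exactly once, else None.

    A phrase that occurs zero times or more than once cannot serve as an
    anchor, because it does not identify a single position in the text.
    """
    tokens = tokenize(text)
    needle = tokenize(phrase)
    if not needle:
        return None
    hits = [
        i
        for i in range(len(tokens) - len(needle) + 1)
        if tokens[i:i + len(needle)] == needle
    ]
    return hits[0] if len(hits) == 1 else None
```

A phrase that is repeated, like the one discussed above, yields `None` here, which is exactly why it stops working as an anchor once it has been quoted a second time.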


Any user interface for searching and navigating parts of an object requires new thinking when it comes to persistence policies. The libraries' HTTP-hostile resolution procedures fail utterly. There is also much less research in the area of persistence and annotation anchors.

Very competent search and navigation systems, such as the eXtensible Text Framework (XTF) from California Digital Library, allow users to search, navigate and link to arbitrary parts of a document. XTF is used to deliver the following document:

Now, here we have extremely good facilities for navigation and search, facilities that go far beyond what can be delivered by Google. But, alas, the ARK-based persistent identification layer isn't used for those facilities, and any links, quotations or references pointing into the document might die with this particular implementation.

The persistent identifier (PI) people live with the ideal of delivering documents in the same way as they were delivered at the lending counter 25 years ago.

Linked data

I don't think anyone involved in current web development has escaped the linked data bandwagon. Linked data and the semantic web are believed by many to become the core of Web 3.0. Tim Berners-Lee lists four characteristics of the web of data.

  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
  4. Include links to other URIs, so that they can discover more things.

Anyone who wants to jump on the bandwagon by means of anything other than good old cool URIs does so at his or her own risk, and I won't. For, as D'Israeli puts it, 'the art of quotation requires more delicacy in the practice than those conceive, who can see nothing more in a quotation than an extract.'


If you're interested in experimenting with mark-up based anchors, I'll give you an XSLT script which basically copies an XML document but, while doing so, adds an id attribute to every element in your document. I find this extremely useful. You can extend it to tokenize text and add anchors on individual words.

I use such a script when creating search and navigation systems involving XML text. Beware though that a document type may use the generic xml:id attribute. Also, a DTD or schema may have another name for the id attribute, and it may not be permissible on all elements.
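For readers who would rather experiment outside XSLT, the same idea can be sketched in Python with the standard library's `xml.etree.ElementTree`. This is not the XSLT script referred to above, only a rough equivalent under stated assumptions: the function name `add_ids`, the `prefix` parameter and the `a1`, `a2`, ... numbering scheme are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def add_ids(xml_text, attr="id", prefix="a"):
    """Return the XML with a sequential id on every element that lacks one.

    Elements are numbered in document order. Existing values of `attr` are
    left untouched; this sketch does not check that the generated ids avoid
    colliding with ids already present in the document.
    """
    root = ET.fromstring(xml_text)
    for n, elem in enumerate(root.iter(), start=1):
        if attr not in elem.attrib:
            elem.set(attr, f"{prefix}{n}")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_ids("<doc><p>One</p><p id='x'>Two</p></doc>"))
```

The `attr` parameter is there because, as noted above, a given DTD or schema may call its identifier attribute something other than `id`. Whether the attribute is permitted on every element is still something to check against the document type itself.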



My name is Sigfrid Lundberg. The stuff I publish here may, or may not, be of interest for anyone else.

On this site there is material on photography, music, literature and other stuff I enjoy in life. However, most of it is related to my profession as an Internet programmer and software developer within the area of digital libraries at the Royal Library, Copenhagen (Denmark) and, before that, Lund university (Sweden).

The content here does not reflect the views of my past or present employers

Creative Commons License
This entry (A quotation is much more than an extract) within Sigfrid Lundberg's Stuff, by Sigfrid Lundberg is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.