The (in)adequacies of markup
May 21st, 2010 | hugh cayless
(Last minute session idea)
This discussion comes around every few years, most recently on the Humanist list, starting here and continuing for many posts. The essence of it is an argument over whether embedding markup (à la TEI XML) in texts is a theoretically sound way of digitally publishing texts or whether “standoff” markup that points at parts of a (probably plain text) document would be better.
Anyway, if there are any text hackers out there interested in looking at the state of play in document markup and seeing whether we can come to any useful conclusions, hack something together, or at least make plans to hack something together, let me know.
May 21st, 2010 at 9:09 pm
Hugh–
Are you suggesting a practical discussion about how to overlay and weave commentaries and supercommentaries in an egalitarian fashion, presumably using markup tech? I’m not inclined to wade into any kind of debate over the “purity of the turf”, but I’d love to play with some ideas about how we could bum-rush the privileged position of the encoder of digital text.
May 21st, 2010 at 9:27 pm
Hi Adam,
I really only care about what’s practical. The theoretical arguments are interesting but weighted down by ideology. Maybe it’s worth taking another look at where we are technically though.
May 21st, 2010 at 11:50 pm
I’m actually in favor of the standoff markup system. As long as the text itself is versioned and stable, standoff annotations can simply point into it, whereas embedded markup necessitates copying the source every time you want to layer in a different markup. The initial purpose of markup is to capture a very specific kind of typographical data: the way the original was typeset, any page breaks.
I think the standoff markup approach has the following advantages:
1. It can be generalized to media which cannot be edited as XML. You can add a timecode and thus attach metadata to a video. You can attach a logical code and so cite, analyze, or encode an interactive work.
2. It allows for violations of nested hierarchy. Well-formed XML documents are trees; not all ideas or classifications in documents can be modeled by a tree.
3. It allows sources to be maintained in a central location, so that only the most important part (i.e. the extra data) needs to be kept in a database. A comparable XML database would contain a great deal of redundancy.
4. Offsets into a text file are simple, and can even be used to handle changes between different versions by using references to stanzas or paragraphs.
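To make points 2 and 4 concrete, here is a rough Python sketch of what standoff annotations might look like. Everything in it (the Annotation class, the helper names, the sample texts) is made up for illustration and is not taken from TEI or any existing tool. It assumes annotations are plain character offsets into an unchanged source string, and that paragraphs are separated by blank lines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical standoff annotation: offsets into a text, plus a label."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def annotate(text: str, phrase: str, label: str) -> Annotation:
    """Build an annotation for the first occurrence of phrase in text."""
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), label)


def resolve(text: str, ann: Annotation) -> str:
    """Return the span of text the annotation points at."""
    return text[ann.start:ann.end]


def crosses(a: Annotation, b: Annotation) -> bool:
    """True if the ranges overlap without either containing the other,
    which is the case a single XML tree cannot express directly."""
    return a.start < b.start < a.end < b.end or b.start < a.start < b.end < a.end


def resolve_in_paragraph(text: str, para: int, start: int, end: int) -> str:
    """Resolve offsets relative to a paragraph instead of the whole file
    (assumes paragraphs are separated by one blank line)."""
    return text.split("\n\n")[para][start:end]


source = "The quick brown fox jumps over the lazy dog"
first = annotate(source, "quick brown fox jumps", "phrase-a")
second = annotate(source, "fox jumps over the lazy", "phrase-b")

# Two annotation layers that cross each other; the source is never modified.
crossing = crosses(first, second)

# Paragraph-relative offsets survive an edit to an earlier paragraph.
v1 = "First stanza.\n\nSecond stanza here."
v2 = "First stanza, now revised and longer.\n\nSecond stanza here."
same_target = resolve_in_paragraph(v1, 1, 0, 6) == resolve_in_paragraph(v2, 1, 0, 6)
```

Paragraph-relative offsets only help when edits stay inside other paragraphs. If a paragraph is inserted or removed, the index shifts and the annotation needs some more robust anchor.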
May 22nd, 2010 at 4:06 am
there’s another choice — zen markup.
it’s light markup that is so “light” that
it’s nearly invisible.
it converts out to .html and .pdf, but
its simple structure also means that
viewer-apps and authoring-tools can
be programmed for it quite easily…
-bowerbird