Notes from 2007/05/02 meeting
Are we ready for comments on our info architecture proposals from outside the project?
- Should we make everything on rhaptos.org in the info architecture folder public?
- What do we need to do in order to get useful comments (e.g., how should we phrase the docs)?
- What language constructs are we missing in CNXML?
  - Study DocBook and TEI for examples
- Whom should we enlist?
  - Erik Möller
  - Eric Duvall
  - Scott Warren
  - Kohlhase & Co.
  - Dan Chudnov & {code,xml,oss}4lib?
  - Paul L. of ActiveMath community
  - Ian Barland
  - Doug Jones
Issues to consider fall into two categories:
- those for imminent release (CNXML 0.6)
- long-range issues, such as
  - compound documents
  - indexing
  - alternatives for print and other output targets
  - process for canonizing elements
Questions about elements
span and div should accept any attribute/value pairs. We all agreed.
cnxn and link
- shouldn't contain themselves (we all agreed)
- value of @document could include a repository ID as well as an object ID
Footnotes (note of type 'footnote') should be part of inline content.
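A minimal sketch of what an inline footnote might look like. The id and the text are made up for illustration, and the placement shown is the proposal, not necessarily what the current schema allows:

```xml
<para id="fn-example">
  The survey was repeated a year later<note type="footnote">The second
  survey used the same questions.</note> with similar results.
</para>
```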
Print rendering of cnxn and link also came up. When an empty cnxn is a reference to an element in the current rendering context (the module when online, the whole collection when creating a book PDF), it is resolved and displayed as a reference to the label of the element and its number.
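A sketch of the behaviour just described, assuming current CNXML element names. The ids, the file name, and the exact rendered text ("Figure 1") are illustrative assumptions:

```xml
<section id="circuits">
  <name>Circuits</name>
  <figure id="fig-circuit">
    <name>A simple circuit</name>
    <media type="image/png" src="circuit.png"/>
  </figure>
  <!-- Empty cnxn whose target lies in the current rendering context -->
  <para id="p-circuit">See <cnxn target="fig-circuit"/> for the layout.</para>
  <!-- Might render as: See Figure 1 for the layout. -->
</section>
```

In a collection PDF the number would presumably be computed across the whole book rather than within the module, so the same markup could resolve to a different number there.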
Numbered propositions: These are important for analytic (Anglo-American) philosophy. Chuck contends that numbered propositions are akin to sequentially numbered but discontiguous figures, equations, and examples, rather than to lists. Ross notes that Richard Grandy is interested in publishing content in Connexions, and so numbered propositions would be helpful (necessary, even).
This raises the question of creating a more general class of sequence items that are not necessarily contiguous but can be consistently numbered. Perhaps call them 'sequence-item'. We need a way of denoting that a set of discontiguous sequence items belongs to a given sequence: perhaps an attribute named 'of' or 'group', holding an arbitrary IDREF-like token. The class attribute would still be the primary carrier for semantics and rendering info.
I think we agreed that numbered propositions were a good idea, and that the sequence item idea deserved consideration for the future. See also below the brief discussion of Grandy's book on sentential logic.
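To make the idea concrete, here is a hedged sketch with two independent sequences in one section. The element name 'sequence-item' and the attribute name 'group' are only the candidates floated above, and the ids, group tokens, and rendered numbers in the comments are assumptions:

```xml
<section id="seq-sketch">
  <name>Sketch</name>
  <sequence-item id="prop-a" class="proposition" group="props">First proposition ...</sequence-item>
  <!-- would be numbered (1) within group "props" -->
  <para id="p-between">Intervening discussion ...</para>
  <sequence-item id="ex-a" class="exercise" group="exercises">First exercise ...</sequence-item>
  <!-- would be numbered 1 within group "exercises" -->
  <sequence-item id="prop-b" class="proposition" group="props">Second proposition ...</sequence-item>
  <!-- would be numbered (2): the sequence continues across the gap -->
</section>
```

Items sharing a group token are counted together regardless of what sits between them, while class carries the semantics and rendering hints.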
Semantics of name: At present, name denotes the title of the element of which it is the first child (e.g. section, para, list, figure). Really, it should be title instead of name, so that name could have been used for names of people, places, etc. We decided that we could deprecate name in this role and use title in its place in the future.
In the end, I'm not sure how much we gain by deprecating name. The two things we might stand to gain are (a) the ability to re-use name for other more appropriate purposes, and (b) not seeming silly to XML cognoscenti for calling titles of things name. The constraint is that we don't want to break backwards compatibility or change the CNXML namespace URI. We can't really expunge existing name elements from the repository. I think we have two plausible choices for re-using name without breaking backwards compatibility:
- define name only once in the schema, but in the specification indicate that, as the first child element of certain elements (figure, section, para, list, etc.) it will be treated as the title of that element, while all other names will be taken to have the new semantics (names of persons, places, etc.); or
- define name twice in the schema, once with any class value except the ones we want to reserve (e.g. 'place-name', 'person-name'), which will have the semantics of the present name and occur only as the first element child of select elements; and again with the new semantics and with the new enumerated class values, which is part of inline content.
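Under either choice the instance markup might look much the same. A sketch, using the class values mentioned above as examples only; the ids are made up:

```xml
<section id="voyages">
  <!-- First child of section: treated as the section's title (present semantics) -->
  <name>Early voyages</name>
  <para id="p-voyages">
    <!-- Inline occurrences: new semantics, names of persons and places -->
    <name class="person-name">Vasco da Gama</name> reached
    <name class="place-name">Calicut</name> in 1498.
  </para>
</section>
```

The difference between the two choices lies in where the distinction is enforced: in the prose of the specification for the first, in the schema's content models for the second.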
Semantics of cite:
What is cite meant to enclose: a complete citation, or just the title of the work, or what? At present, a non-empty cite is rendered by toggling italics on or off, depending on how that text would otherwise be rendered. This rendering is appropriate only for titles of whole works, like monographs or journals. An empty cite whose src attribute points to the ID of a BibTeXML reference prefixed by '#' is rendered as a link to the reference with '[
Let's take the empty cite as our basis for thinking about the semantics of the element. (Perhaps this is sometimes a useful approach for thinking about the semantics of any element that may be empty of content.) The empty cite expands to a hyperlink pointing to the corresponding bibliographic reference, with a visual pointer (the system-supplied reference number) as its content. So here it is a proxy for the reference itself. One can easily imagine it being a form of the thing itself as well--perhaps just <lastname, year>, but perhaps also formatted according to one or another style convention (Chicago, Turabian, APA, etc.). So one could have something like this:
which would expand to
So how would you add in information about a particular page or range of pages that is a subset of the citation? Suppose Mendez 2002 is a reference to a whole monograph, but you want to indicate that the quotation in which the empty cite is embedded is taken from one page? I see two reasonable alternatives:
- we provide cite with an attribute to contain this information, e.g.:
<cite src="#Mendez" format="Chicago-author-date" pages="12"/>
This would be rendered as: [Mendez 2002, p. 12]
- we render empty cite without the system-supplied square brackets, which makes it possible for authors to add page ranges manually after the empty cite:
[<cite src="#Mendez" format="Chicago-author-date"/>, p. 12]
This would be rendered as above. Note the author-supplied square brackets in this case.
For several permutations of cite as it is at present, see this module. Fondren Library licenses an online version of the Chicago Manual of Style (see particularly Chapter 16).
Exercise: how would we represent Richard Grandy's and Daniel Osherson's book? Lots of fun to be had here.
- Consider pp. 4-5, where (2) has two sub-propositions with alpha labels. If (as I think we should) we provide support for further kinds of list decoration, then this could be handled by alphabetical list item labels. But we can't do this yet.
- Or consider the indented sentence on p. 5 ("John ain't got no anchovies in his ice cream."). This is neither a numbered proposition nor a quote. It's probably best thought of as an un-numbered proposition. But the device of using indented blocks of text is used in analytic philosophy to draw the reader's attention to text that is neither a quote nor a single proposition--to whole paragraphs to be considered, say. See pp. 6-7, 16 for other examples. Exercises 1(6) and 1(7) make explicit the fact that these indented text portions are meant to be considered (and analyzed) by the reader.
- How would we mark up the indented portion of p. 8? Probably as a definition list, though that verges on tag abuse.
- Note that the numbering that is applied to propositions (e.g. (1) and (2) in chapter 1) is continuous with the numbering of other items, such as the exercises (pp. 11-13). I don't think that would be universal practice in disciplines that make use of numbered propositions, however.
- Note that in Chapter 2, the numbered sequence of things includes several genres: DEFINITION, FACT, EXERCISE, POSTULATE, [unlabeled numbered stuff not a proposition], MATHEMATICAL INDUCTION. For this book then the numbered propositions ought to come under the sequence-item element, perhaps with special class values to differentiate them:
<sequence-item id="foo" class="definition" group="main">DEFINITION: ... </sequence-item>
...
<sequence-item id="bar" class="exercise" group="main">EXERCISE: ... </sequence-item>
...
<sequence-item id="baz" class="proposition" group="main">If you negate the negation of a claim &c.</sequence-item>
- Lots of tables, with and without borders, sometimes only with borders separating column heads from columns (e.g. 4(15) and 4(16)). Support for more detailed table rendering is needed.
