Names on Nodes: MathML Definitions (Draft)

T. Michael Keesey¹

¹ P.O. Box 292304 Los Angeles, CA, USA 90027; keesey@gmail.com

First Published Online 2009 July 29

Last Updated 2009 September 13

This Is A Draft

This document is not yet complete. It is being posted online to solicit comments before being finalized. Please contact Mike Keesey if you wish to make comments.


Abstract

Phylogenetic nomenclature is a type of biological nomenclature which ties taxonomic names to taxa using algorithms which rely on phylogeny (i.e., patterns of ancestry and descent). Unlike earlier forms of biological nomenclature (e.g., Linnaean, rank-based), the application of a name to a taxon is unambiguous under an appropriate phylogenetic hypothesis. Phylogeny may be modeled as directed, acyclic graphs, with organisms (or sufficiently small taxa) as vertices and parent-child relations as directed edges connecting them. Because phylogeny can be modeled mathematically, phylogenetic definitions may be expressed as mathematical formulae. Mathematical formulae, in turn, may be expressed using an extensible computer language called MathML. Expressing phylogenetic definitions in MathML requires the definition of some additional entities. Here I review the relevant mathematical and biological concepts, terms, and notations, and provide an overview of MathML. I correlate these concepts to each other and, finally, define the entities needed to express phylogenetic definitions in MathML.

Table of Contents

Abstract
Table of Contents
Introduction
A General Review of Mathematical Concepts, Terms, and Notation
    Collections
    Sets
    Lists
    Relations
    Graphs
    Functions
A General Review of Biological Concepts and Terms
    Organisms
    Taxonomy
    Specimens
    Character States
    Rank-Based Nomenclature
    Phylogeny
    Lineages
    Cladogens
    Clades
    Non-Clades
    Phylogeny-Based Nomenclature
A General Review of MathML and Its Foundational Technologies
    Strings
    URIs
    Namespaces
    XML
    MathML
On the Correlation of Biological and Mathematical Terms
    Taxa and Sets
    Ancestry and Precedence
    Phylogeny and Graphs
Definitions of Mathematical Entities
    Universal Taxon
    Maximal
    Minimal
    Predecessor Union
    Successor Union
    Predecessor Intersection
    Successor Intersection
    Synapomorphic Predecessors
    Node-Based Cladogen
    Branch-Based Cladogen
    Apomorphy-Based Cladogen
    Clade
    Node-Based Clade
    Branch-Based Clade
    Apomorphy-Based Clade
    Crown Clade
    Total Clade
Appendix I: Implemented MathML-Content Elements
Appendix II: Example Definitions

Introduction

A General Review of Mathematical Concepts, Terms, and Notation

Some terminology and notation vary across different contexts. Where possible, I have followed MathML's terminology and default notation. Some exceptions have been made for certain logical symbols which are more easily read as words than as symbols, e.g., "and" instead of ∧, "for all" instead of ∀, etc.

The symbol := means "is defined as".

Collections

A collection is an entity which consists of zero or more distinct objects. Objects in a collection are members of the collection. A collection with one member is a singleton.

Sets

A set is an unordered collection. When an object, x, is a member of a set, S, this is denoted x ∈ S. Sets may themselves be members of other sets. Sets may be denoted by listing their members within braces, e.g., {1, 2, 3}, or by stating a condition for membership, e.g., {x | x > 0}.

The empty set is the set which includes no members, denoted as ∅.

The symbol ℝ indicates the set of all real numbers. The nonnegative real numbers are indicated as ℝ₀⁺.

The cardinality of a set is the number of members in that set, denoted with enclosing vertical bars: |S|.

Examples. |∅| = 0. |{1, 2, 3}| = 3.

If all members of a set, A, are members of a set, B, then A is a subset of B, denoted A ⊆ B. B is a superset of A. Note that all sets are subsets and supersets of themselves. If A ⊆ B and A ≠ B, then A is a proper subset of B, denoted A ⊂ B, and B is a proper superset of A. Note that ∅ is a subset of all sets.

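The operations of union, intersection, and difference may be applied to sets:

  The union, A ∪ B, is the set of all objects which are members of A or of B (or of both).
  The intersection, A ∩ B, is the set of all objects which are members of both A and B.
  The difference, A ∖ B, is the set of all members of A which are not members of B.

Examples. If A = {1, 2} and B = {2, 3}, then A ∪ B = {1, 2, 3}, A ∩ B = {2}, and A ∖ B = {1}.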

A partition of a set, S, is a set of subsets of S, so that no sets in the partition overlap and all members of S are members of some set in the partition. A partition, P₁, is a refinement of another partition, P₂, if every member of P₁ is a subset of some member of P₂. P₁ is finer than P₂, and P₂ is coarser than P₁. This is written P₁ ≤ P₂.

Example. If S = {1, 2, 3}, then the partitions of S are {∅, S}, {∅, {1}, {2, 3}}, {∅, {1, 2}, {3}}, {∅, {1, 3}, {2}}, and {∅, {1}, {2}, {3}}. The partition {∅, {1}, {2}, {3}} is a refinement of {∅, {1}, {2, 3}}, which is a refinement of {∅, S}.
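As a rough illustration of the two definitions above, the following Python sketch checks whether a collection of sets is a partition of a set and whether one partition refines another. The function names (is_partition, is_refinement) are hypothetical and are not part of Names on Nodes. Following the example above, the empty set is admitted as a member of a partition.

```python
from itertools import combinations

def is_partition(blocks, s):
    """True if the blocks are pairwise disjoint and together cover s."""
    blocks = [frozenset(b) for b in blocks]
    disjoint = all(a.isdisjoint(b) for a, b in combinations(blocks, 2))
    covered = frozenset().union(*blocks) == frozenset(s)
    return disjoint and covered

def is_refinement(p1, p2):
    """True if every member of p1 is a subset of some member of p2."""
    return all(any(set(a) <= set(b) for b in p2) for a in p1)

S = {1, 2, 3}
coarse = [set(), S]
middle = [set(), {1}, {2, 3}]
fine = [set(), {1}, {2}, {3}]
```

Here fine refines middle, and middle refines coarse, but not the other way around.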

The set consisting of all (relevant) objects is called the universal set. The difference of the universal set and a set, S, is the complement of S, denoted S̄.

A power set of a set, S, is the set of all subsets of S, denoted 2^S.

Example. If S = {1, 2}, then 2^S = {∅, {1}, {2}, {1, 2}}.
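A power set can be enumerated mechanically. The sketch below is one possible way to do so in Python; power_set is a hypothetical helper name, and frozenset is used only because Python sets cannot contain ordinary (mutable) sets.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    return {frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)}

subsets = power_set({1, 2})
```

The cardinality of the result is 2 raised to the cardinality of the original set, which is the motivation for the notation.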

Lists

A list is an ordered collection of elements, denoted as a series of elements within brackets, e.g., [x, y, z]. Unlike sets, lists may have the same member multiple times, e.g., [x, x, y]. A list with two members is an ordered pair. A list with three members is an ordered triple. A list with n members is an n-tuple. The nth member of a list, p, is denoted pₙ.

Example. If p = [x, y, z], then p₁ = x, p₂ = y, and p₃ = z. The list, p, is an ordered triple (3-tuple).

A Cartesian product of two sets, A and B, is the set of all ordered pairs wherein the first element is a member of A and the second element is a member of B. This product is denoted as A × B. The product A × A may also be denoted as A². Cartesian products may be generalized to cover any whole number, n, of sets, in which case the members of the product are n-tuples.

Example. {1, 2} × {3, 4} = {[1, 3], [1, 4], [2, 3], [2, 4]}.
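For readers who prefer code, the same product can be computed with Python's standard itertools.product. In this sketch Python tuples stand in for the lists (ordered pairs and n-tuples) described above; the variable names are arbitrary.

```python
from itertools import product

# {1, 2} × {3, 4}: every pair with first element from A, second from B.
pairs = set(product({1, 2}, {3, 4}))

# The generalization to n sets, here {1, 2} × {1, 2} × {1, 2}.
triples = set(product({1, 2}, repeat=3))
```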

Relations

A relation is a set of ordered pairs. If [x, y] is a pair in the relation R, then this is denoted as x R y. The first element in such a pair is the predecessor, and the second element is the successor. If x R y and x ≠ y, then x is a proper predecessor of y and y is a proper successor of x. If x R z and there is no other element, y, such that x R y and y R z, then x is an immediate predecessor of z and z is an immediate successor of x.

The expression R[x] denotes the set {y | x R y}. For example, >[0] indicates the set of all negative numbers.

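A partial order is a relation, R, with the properties of reflexivity, antisymmetry, and transitivity:

  Reflexivity: x R x for all x.
  Antisymmetry: if x R y and y R x, then x = y.
  Transitivity: if x R y and y R z, then x R z.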

The transitive closure of a relation, R, is the smallest (i.e., least inclusive) transitive relation that includes R.

If R is a partial order and x R y or y R x, then x and y are comparable. If all elements in a set can be compared to each other, then the set is a chain. If no two different elements in a set are comparable, then it is an antichain.
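The notions of image, transitive closure, chain, and antichain can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a relation is modeled as a set of tuples, all function names are hypothetical, and the example relation is irreflexive, so its closure is a strict order (comparability of distinct elements is unaffected by this).

```python
def image(relation, x):
    """R[x]: the set {y | x R y}."""
    return {b for a, b in relation if a == x}

def transitive_closure(relation):
    """Smallest transitive relation that includes the given relation."""
    closure = set(relation)
    while True:
        extra = {(a, d) for a, b in closure for c, d in closure if b == c}
        if extra <= closure:
            return closure
        closure |= extra

def is_chain(elements, order):
    """True if every two different elements are comparable."""
    return all((x, y) in order or (y, x) in order
               for x in elements for y in elements if x != y)

def is_antichain(elements, order):
    """True if no two different elements are comparable."""
    return not any((x, y) in order or (y, x) in order
                   for x in elements for y in elements if x != y)

parent_of = {("a", "b"), ("b", "d"), ("a", "c")}
ancestor_of = transitive_closure(parent_of)
```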

Graphs

A graph is an entity containing a set of objects, called vertices, and connections of the vertices, called edges. A graph may be defined as a type of ordered pair, in which the first element is the vertex set and the second element is the edge set: [V, E]. In an undirected graph, each edge is a set of two vertices, indicating that those vertices are connected, or incident. In a directed graph, or digraph, each edge is an ordered pair of vertices, indicating that the first element, or head, connects to the second element, or tail. Edges in directed graphs may also be called arcs.

A walk in a graph is a list of vertices in which each vertex is incident to the next vertex in the list. A path is a walk in a directed graph wherein some arc in the graph points from each vertex to the next vertex in the list. A cycle is a path which begins and ends with the same vertex. A directed graph is said to be acyclic if there are no cycles in it.
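Whether a directed graph is acyclic can be tested with a depth-first search. The sketch below is one common approach, not necessarily the one used by Names on Nodes; is_acyclic is a hypothetical name, and each arc is taken to be an ordered pair pointing from its first element to its second.

```python
def is_acyclic(vertices, arcs):
    """True if no path in the digraph begins and ends at the same vertex."""
    targets = {v: set() for v in vertices}
    for first, second in arcs:
        targets[first].add(second)

    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {v: UNSEEN for v in vertices}

    def visit(v):
        state[v] = ACTIVE
        for w in targets[v]:
            if state[w] == ACTIVE:      # reached a vertex on the current path
                return False
            if state[w] == UNSEEN and not visit(w):
                return False
        state[v] = DONE
        return True

    return all(state[v] != UNSEEN or visit(v) for v in vertices)

tree = {("a", "b"), ("b", "c")}
loop = tree | {("c", "a")}
```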

Functions

A function maps an element, called an argument, to a value. Formally, a function may be defined as an ordered triple of sets: f := [X, Y, F]. The final set, F, is a set of ordered pairs, wherein the first element is the argument and the second element is the value. There may be only one ordered pair per argument. If an ordered pair, [x, y], is a member of F, this may be denoted as f(x) = y. If the argument is a list, instead of f([a₁, a₂, … aₙ]), it is customary to simply write f(a₁, a₂, … aₙ). Sometimes this may be written a₁ f a₂ f … f aₙ (infix notation).

The set including all of a function's arguments is the domain. All values of the function are within the codomain. The set of all values is the image, which is a subset of the codomain. If a function, f, has domain X and codomain Y, this is denoted as f : X → Y.

The composite of two functions, f and g, is a function which uses the value of g as an argument for f. Composition is written f ∘ g, so that (f ∘ g)(x) = f(g(x)). Note that the codomain of g must be a subset of the domain of f. If g : X → Y and f : Y → Z, then (f ∘ g) : X → Z.

A metric on a set, X, is a function with X² as its domain and the set of nonnegative real numbers as its codomain. It defines a metric distance between any two members of X. The ε-ball of x is the set of all elements less than a certain distance, ε, from x.

A General Review of Biological Concepts and Terms

Organisms

An organism is an individual living entity. Although in many cases it is clear what constitutes an individual, there are notable difficult cases. However, although the mathematical system used by Names on Nodes assumes that there are discrete, individual entities, the implemented algorithms need never deal with them directly. Thus there is no present need to clarify the term.

Taxonomy

A nonempty set of organisms is a taxon (plural: taxa). A taxon whose members are all within another taxon is a subtaxon of that other taxon. A taxon which includes all members of another taxon is a supertaxon of that taxon. The most inclusive taxon is the universal taxon, which includes all organisms. A taxonomy is a system for dividing a taxon into subtaxa.

A taxonomic name is a word or phrase which signifies a taxon. A nomenclatural code is a code of rules which governs taxonomic names.

Specimens

In addition to taxonomic names, taxa may also be referenced using specimens. A specimen is an object which has been catalogued as part of a specimen collection. A specimen collection is often indicated by an abbreviation of its name, specified within the context, e.g., "Yale Peabody Museum: Vertebrate Paleontology Collection" may be abbreviated as "YPM-VP". A specimen within a collection may be indicated by the collection's name or abbreviation followed by an identifier that is unique within the collection, e.g., YPM-VP 1450. A specimen may have multiple identifiers if it has been transferred from one collection to another. For example, AMNH 973 and CM 9380 are the same specimen. A specimen may represent no organisms (e.g., a mineralogical specimen), one organism (e.g., a fossil skeleton), or multiple organisms (e.g., a microbe slide).

Character States

Taxa may also be defined intensionally using a description of a necessary characteristic, that is, a character state. Organisms exhibiting the state are part of the taxon. Valid states must be discrete and absolute, that is, organisms cannot partially exhibit them.

Examples. "Cellular nucleus present" is a valid state, assuming that "cellular nucleus" has been defined in such a way that it is either present or not, never partially present. "Large leaf size" is not a valid state, since it is relative, not absolute.

Taxa may be defined using a collection of character states. If all states are required for membership, the taxon is monothetic. If at least one state is required, the taxon is polythetic.

Taxonomic names are not usually defined according to character states, but the taxa that the names signify may be diagnosed by character states. For example, the taxon referred to as "Eukaryota" is diagnosed by the presence of cellular nuclei, but "Eukaryota" is not necessarily defined by that character state.

Rank-Based Nomenclature

Taxonomic names may be loosely defined using rank-based definitions. A taxon is rank-defined by specifying one or more type specimens or a type subtaxon (generally, a type), which must be included in the taxon by definition, and a rank, which indicates relative inclusivity of the taxon. Commonly used ranks are, from least to most inclusive, species, genus, family, order, class, phylum (zoology) or division (other disciplines), and kingdom. Many others exist as well.

Note that rank-based definitions do not dictate any criteria for membership, apart from the requirement that the type (specimens or subtaxon) must represent included organisms.

If a taxonomic name, "X", is defined as having a taxon with name "Y" as its type, and "Y" is defined as having specimen Z as its type, then Z may be called the finest type of "X".

A rank-based code is a nomenclatural code which governs rank-based definitions. Currently there are four in effect:

Name of Rank-Based Code                                   | Abbreviation | Organisms Covered
International Code of Botanical Nomenclature              | ICBN         | plants, fungi, some other eukaryotes
International Code of Zoological Nomenclature             | ICZN         | animals, some other eukaryotes
International Code of Nomenclature of Prokaryotes         | ICNP         | prokaryotes
International Code of Nomenclature for Cultivated Plants  | ICNCP        | cultivated plants

Examples. Under ICZN rules, the name "Tyrannosaurus rex" refers to a taxon of the species rank typified by CM 9380 (formerly AMNH 973). Therefore, Tyrannosaurus rex must include the organism represented by that specimen. The name "Tyrannosaurus" refers to a taxon of genus rank typified by Tyrannosaurus rex, so it must be a supertaxon of that species (and, by extension, it must include the organism represented by CM 9380). The name "Tyrannosauridae" refers to a taxon of family rank typified by Tyrannosaurus, so it must be a supertaxon of Tyrannosaurus.

Phylogeny

Every organism has one or more ancestors and/or one or more descendants. An immediate ancestor is a parent, and an immediate descendant is a child. The pattern of ancestry and descent among organisms is phylogeny.

Although a fuller correlation will be made further on, I note here that the biological term "ancestor" correlates to the mathematical term "proper predecessor", and the biological term "descendant" correlates to the mathematical term "proper successor". Therefore, we may say that a predecessor of an organism is any ancestor of that organism, or that organism itself. Conversely, a successor of an organism is any descendant of that organism, or that organism itself. I also note that the terms "maximal" and "minimal" may be applied to organisms with regard to taxa. The minimal members of a taxon are those which are not descended from any other member. The maximal members of a taxon are those which are not ancestral to any other member.

The common predecessors of a taxon are all organisms which are predecessors of all members of that taxon. The common successors of a taxon are all organisms which are successors of all members of that taxon.

The exclusive predecessors of a taxon, A, with regard to another taxon, Z, are all common predecessors of A except any which are predecessors of any member of Z. A may be termed the internal taxon and Z may be termed the external taxon.

The apomorphic predecessors of a taxon, A, with regard to another (generally character-based) taxon, M, are all common predecessors of A which are also members of M. A may be termed the representative taxon and M may be termed the apomorphic taxon.

Lineages

A lineage is a sequence of organisms wherein each organism is followed by one of its children.

A synapomorphic predecessor of a taxon, A, with regard to taxon M, is an apomorphic predecessor for which there is at least one lineage for every member of A satisfying the following conditions:

  1. The first member of the lineage (i.e., the ancestor of all other members) is the apomorphic predecessor.
  2. The last member of the lineage (i.e., the descendant of all other members) is the member of A.
  3. All members of the lineage are members of M.
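The three conditions above can be sketched in Python as follows. This is a minimal illustration, not the Names on Nodes implementation: the phylogeny is assumed to be given as a dictionary mapping each organism to the set of its parents, the organism labels are arbitrary, and the function names are hypothetical. The representative taxon, a, is assumed to be nonempty, as all taxa are.

```python
# Hypothetical phylogeny: a begat b and c; b begat d and e; c begat f.
PARENTS = {"a": set(), "b": {"a"}, "c": {"a"},
           "d": {"b"}, "e": {"b"}, "f": {"c"}}

def lineage_origins(parents, m, y):
    """All x heading some lineage that ends at y and lies entirely within m."""
    if y not in m:
        return set()
    found, stack = {y}, [y]
    while stack:
        for parent in parents[stack.pop()]:
            if parent in m and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def synapomorphic_predecessors(parents, m, a):
    """Organisms with a lineage inside m to every member of a."""
    return set.intersection(*(lineage_origins(parents, m, y) for y in a))

inherited = synapomorphic_predecessors(PARENTS, {"b", "d", "e"}, {"d", "e"})
convergent = synapomorphic_predecessors(PARENTS, {"a", "d", "e"}, {"d", "e"})
```

In the second call, a is an apomorphic predecessor of {d, e}, but it is not synapomorphic, because every lineage from a to d or e passes through b, which is not a member of the apomorphic taxon.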

Cladogens

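A taxon which fulfills the following requirements is here termed a cladogen (new term; previously "cladogenetic set" in Keesey [2007]):

  1. No member of the taxon is an ancestor of any other member.
  2. All members of the taxon share at least one common successor.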

All singleton taxa are cladogens, but cladogens may have larger numbers of members as well.

A node-based cladogen consists of the maximal common predecessors of a taxon.

A branch-based cladogen consists of the minimal exclusive predecessors of an internal taxon with regard to an external taxon.

An apomorphy-based cladogen consists of the minimal synapomorphic predecessors of a representative taxon with regard to an apomorphic taxon.

Clades

A clade is a taxon including all members of a cladogen and all descendants of all of those members. Clades are monophyletic; in fact, "clade" is a synonym of "monophyletic taxon".

A node-based clade consists of a node-based cladogen and all descendants of all of its members. A branch-based clade consists of a branch-based cladogen and all descendants of all of its members. An apomorphy-based clade consists of an apomorphy-based cladogen and all descendants of all of its members.

Non-Clades

If a taxon's minimal members form a cladogen, but the taxon does not include all descendants of that cladogen, then it is paraphyletic. (Note that cladogens themselves are paraphyletic, with the exception of singleton cladogens wherein the single member has no descendants. For such singleton cladogens, the clade and cladogen are identical.)

If the maximal common predecessors (i.e., the members of the node-based cladogen) of a taxon are not included within that taxon, then that taxon is polyphyletic.

Phylogeny-Based Nomenclature

A taxon may be strictly defined according to phylogeny, using a phylogeny-based definition. Most commonly, the taxa defined are clades, but other types of taxa may also be phylogenetically defined.

A phylogeny-based code is one which governs phylogeny-based definitions. Currently there are no such codes in effect, but there is a draft of one called the International Code of Phylogenetic Nomenclature (or the PhyloCode, for short). This code is intended to go into effect in the next few years, exist alongside the rank-based codes, and govern clade names for all types of organisms.

A General Review of MathML and Its Foundational Technologies

Strings

A string is a sequence of characters. Strings which are meant to be interpreted by a computer are referred to as code. Literal strings are referred to as text. A string which identifies an object is a name.

URIs

A Uniform Resource Identifier, or URI, is a string identifying a resource on the Internet.

One of the most common types of URI is the Uniform Resource Locator, or URL, which specifies an address and a mechanism for retrieval. For example, a URL identifying this document is http://namesonnodes.org/ns/math/2009/ (http is the retrieval mechanism, i.e., Hypertext Transfer Protocol, and namesonnodes.org/ns/math/2009 is the address).

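Another type of URI is the Uniform Resource Name, or URN, which functions as a location-independent name. Many types of identification can be expressed as URNs. For example, an International Standard Book Number (ISBN) may be expressed as a URN such as urn:isbn:0853010064.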

A URN resolver translates URNs (i.e., names) into URLs (i.e., locations).

For more on URIs, see these official specifications:

  RFC 3986, Uniform Resource Identifier (URI): Generic Syntax
  RFC 2141, URN Syntax

Namespaces

Generally, a namespace is a set of names, called local names, each of which has a single meaning in the context of the namespace. Namespaces are commonly identified using URIs, which then function as namespace identifiers. For example, this document is associated with the Names on Nodes mathematical namespace, which may be identified using the URI http://namesonnodes.org/ns/math/2009. In some contexts, a shorter identifier may be equated with a URI.

Note that taxonomic publications (including nomenclatural codes) and specimen collections may be considered types of namespaces, wherein taxonomic names and specimen identifiers, respectively, function as local names.

A qualified name is an expression joining a namespace identifier with a local name. Different computer languages have different methods of joining these. One convention is to use one or two colons (":"). For example, the qualified name urn:isbn:00800694::Pinus may be used informally to refer to the entity identified as "Pinus" by the International Code of Botanical Nomenclature (ISBN 0080-0694).
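As an illustration of the double-colon convention, the following Python sketch splits a qualified name into its namespace identifier and local name. The function name is hypothetical, and the treatment of "+" as a stand-in for a space is an assumption inferred from the identifiers used in the MathML examples of this document, not a documented rule.

```python
def split_qualified_name(qualified_name):
    """Split 'namespace::local' at the last double colon."""
    namespace, separator, local = qualified_name.rpartition("::")
    if not separator:
        raise ValueError("not a qualified name: " + qualified_name)
    # Assumption: '+' encodes a space within the local name.
    return namespace, local.replace("+", " ")

genus = split_qualified_name("urn:isbn:00800694::Pinus")
species = split_qualified_name("urn:isbn:0853010064::Homo+sapiens")
```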

XML

Extensible Markup Language, or XML for short, is a specification for creating markup languages. Text in XML may be surrounded with tags: an opening tag, of the form <tagName>, and a closing tag, of the form </tagName>, where tagName is the name of the tag. For example, in the XML expression <sentence>Hello, world!</sentence>, the text "Hello, world!" has been marked up by sentence tags. XML tags may also include nested tags, for example: <sentence><word>Hello</word>, <word>world</word>!</sentence>. The entire structure consisting of an opening tag, content, and a closing tag is an element. Any elements within an element's content are child elements. An element with no content, i.e., an empty element, may be written as a self-closing tag: <tagName/>.

Both opening and self-closing XML tags may be augmented with attributes, each of which pairs a name to a value: <tagName attrName="attrValue"/>. An XML tag may have any number of attributes, as long as they all have different names.

Tag and attribute names may be qualified names. Consider the following XML code:

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:m="http://www.w3.org/1998/Math/MathML">
    <head>
        <title>XML Namespaces Example</title>
    </head>
    <body>
        <div>This is XHTML.</div>
        <div>The following is MathML:</div>
        <m:math>
            <m:apply>
                <m:sin/>
                <m:ci>x</m:ci>
            </m:apply>
        </m:math>
    </body>
</html>

In this example, the default namespace is identified by http://www.w3.org/1999/xhtml (which identifies the XHTML namespace), and a namespace identifier, m, is synonymized with http://www.w3.org/1998/Math/MathML (which identifies the MathML namespace). Therefore, if a tag or attribute's name is unqualified, then it is interpreted as an XHTML name. If a tag or attribute's name is qualified by the prefix "m:", then it is interpreted as a MathML name.
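The way a parser resolves these prefixes can be observed with Python's standard xml.etree.ElementTree module, which reports each qualified name in the form {namespace URI}localName. The sketch below uses an abbreviated version of the document above; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"

DOC = f"""<html xmlns="{XHTML}" xmlns:m="{MATHML}">
  <body>
    <div>The following is MathML:</div>
    <m:math><m:apply><m:sin/><m:ci>x</m:ci></m:apply></m:math>
  </body>
</html>"""

root = ET.fromstring(DOC)
# Every element name, in document order, with its namespace spelled out.
names = [element.tag for element in root.iter()]
mathml_names = [n for n in names if n.startswith("{" + MATHML + "}")]
```

Unprefixed names such as html and div resolve to the XHTML namespace, while names carrying the m prefix resolve to the MathML namespace.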

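For more on XML and XML namespaces, see these official specifications:

  Extensible Markup Language (XML) 1.0 (W3C Recommendation)
  Namespaces in XML 1.0 (W3C Recommendation)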

MathML

Mathematical Markup Language, or MathML, is an XML language for expressing mathematical concepts. Elements in MathML are divided into two major groups: MathML-Presentation, which contains information on how to render expressions visually, and MathML-Content, which models mathematical entities. Names on Nodes uses a relevant subset of MathML-Content.

An important element in MathML is the apply element. This indicates that the first child element is to be interpreted as an operation (i.e., a function, relation, etc.), and the subsequent child elements are to be used as arguments.

Example. The MathML element <apply xmlns="http://www.w3.org/1998/Math/MathML"><sin/><cn>0</cn></apply> indicates that the sine function (sin) is to be applied to the constant number, 0 (zero).

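Another important element is the csymbol element, which allows the creation of custom-defined mathematical entities. This is commonly achieved through use of the csymbol element's definitionURL attribute. In Names on Nodes, the value of the definitionURL attribute may be, for example, the definition URL of an entity defined in this document (e.g., http://namesonnodes.org/ns/math/2009#def-NodeBasedClade) or a qualified name signifying a taxon (e.g., urn:isbn:00800694::Lycopodium+clavatum).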

Example. The following MathML element indicates a node-based clade consisting of all successors of the maximal common predecessors of the types of botanical species Lycopodium clavatum, Huperzia selago, Isoëtes lacustris, and Selaginella apoda. (This is Cantino et al.'s [2007] definition of the clade name "Lycopodiophyta".)

<apply xmlns="http://www.w3.org/1998/Math/MathML">
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-NodeBasedClade"/>
    <apply>
        <union/>
        <csymbol definitionURL="urn:isbn:00800694::Lycopodium+clavatum"/>
        <csymbol definitionURL="urn:isbn:00800694::Huperzia+selago"/>
        <csymbol definitionURL="urn:isbn:00800694::Isoetes+lacustris"/>
        <csymbol definitionURL="urn:isbn:00800694::Selaginella+apoda"/>
    </apply>
</apply>
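To show how such an expression might be read by software, the following Python sketch converts a MathML-Content element into nested tuples of the form (operator, argument, ...). It is a simplified illustration and not the Names on Nodes parser: to_expression is a hypothetical name, only apply, csymbol, and empty operator elements are handled, and the input is shortened to two species.

```python
import xml.etree.ElementTree as ET

MATHML = "{http://www.w3.org/1998/Math/MathML}"

def to_expression(element):
    """Convert a MathML-Content element into nested tuples and strings."""
    if element.tag == MATHML + "apply":
        # First child is the operation; the rest are its arguments.
        return tuple(to_expression(child) for child in element)
    if element.tag == MATHML + "csymbol":
        return element.get("definitionURL")
    return element.tag[len(MATHML):]  # built-in operator such as union

SOURCE = """<apply xmlns="http://www.w3.org/1998/Math/MathML">
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-NodeBasedClade"/>
    <apply>
        <union/>
        <csymbol definitionURL="urn:isbn:00800694::Lycopodium+clavatum"/>
        <csymbol definitionURL="urn:isbn:00800694::Huperzia+selago"/>
    </apply>
</apply>"""

expression = to_expression(ET.fromstring(SOURCE))
```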

On the Correlation of Biological and Mathematical Terms

Taxa and Sets

As mentioned, taxa are a type of set. Thus, the operations defined for sets may be employed with taxa. Let U be the universal taxon, the set that includes all organisms. To be a taxon, a set must be a nonempty subset of U.

A union of taxa, T₁ ∪ T₂ ∪ … ∪ Tₙ, constitutes a polythetic set. An intersection of taxa, T₁ ∩ T₂ ∩ … ∩ Tₙ, constitutes a monothetic set.
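A small Python sketch may make the distinction concrete. The two character-based taxa and the organism labels (o1 to o4) are invented for illustration only.

```python
# Hypothetical character-based taxa: the organisms exhibiting each state.
t1 = {"o1", "o2", "o3"}
t2 = {"o2", "o3", "o4"}

# Polythetic: membership requires at least one of the states.
polythetic = set.union(t1, t2)

# Monothetic: membership requires all of the states.
monothetic = set.intersection(t1, t2)
```

Note that an intersection may be empty, in which case it does not constitute a taxon, since taxa are nonempty by definition.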

A rank-based taxonomy of a taxon, T, may be considered a series of partitions on T, wherein each partition corresponds to a rank. Partitions of lower ranks are refinements of partitions of higher ranks, e.g., a species-level partition is a refinement of a genus-level partition.

Ancestry and Precedence

Parenthood may be defined as an antisymmetric, nontransitive relation. Let the relation ⊲ := {[x, y] | x is a parent of y}. The expression x ⊲ y means that x is a parent, or immediate predecessor, of y. The inverse relation, ⊳, is childhood. The symmetric relation, ⋈, may be used thusly: x ⋈ y if and only if x ⊲ y or x ⊳ y. The expression x ⊴ y means that x is a parent of or equal to y, that is, x either immediately precedes or equals y. The expression ⊴[x] represents the set of x and all of its children (immediate successors).

Ancestry may be defined as the transitive closure of parenthood. Let the relation ≺ := {[x, y] | x is an ancestor of y}. The expression x ≺ y means that x is an ancestor, or proper predecessor, of y. The inverse relation, ≻, is descent. The expression x ≼ y means that x is an ancestor of or equal to y, that is, x is a predecessor of y. The expression ≼[x] represents the set of x and all of its successors. The expression ≽[x] represents the set of x and all of its predecessors.
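The sets ≽[x] and ≼[x] can be computed directly from the parenthood relation without materializing the full transitive closure. The following Python sketch assumes that parenthood is stored as a dictionary from each organism to the set of its parents; the phylogeny and the function names are hypothetical.

```python
# Hypothetical phylogeny: a begat b and c; b begat d and e; c begat f.
PARENTS = {"a": set(), "b": {"a"}, "c": {"a"},
           "d": {"b"}, "e": {"b"}, "f": {"c"}}

def predecessors(parents, x):
    """≽[x]: x together with all of its ancestors."""
    found, stack = {x}, [x]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def successors(parents, x):
    """≼[x]: x together with all of its descendants."""
    return {y for y in parents if x in predecessors(parents, y)}
```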

Phylogeny and Graphs

Phylogeny may be represented as a directed, acyclic graph (which correlates to a partially ordered set), G := [U, ⊲]. In this digraph, organisms are the vertices. The arcs (directed edges) point from parents to their children, so that the head of each arc is a parent and the tail of each arc is a child.

A path in G represents a lineage from ancestor to descendant. An x-y path in G is a sequence of vertices (organisms), p, of length n such that x = p₁ and y = pₙ and p₁ ⊲ p₂ ⊲ … ⊲ pₙ.

A cladogen is an antichain in G wherein all members share at least one common successor. As noted earlier, all singleton subsets of U are cladogens.
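The definition of a cladogen as an antichain with a common successor can be checked as in the Python sketch below. The phylogeny (in which d has two parents, b and c) and the function names are hypothetical, and parenthood is again assumed to be stored as a dictionary from each organism to the set of its parents.

```python
# Hypothetical phylogeny: a begat b and c; b and c together begat d;
# d begat e; c also begat f.
PARENTS = {"a": set(), "b": {"a"}, "c": {"a"},
           "d": {"b", "c"}, "e": {"d"}, "f": {"c"}}

def predecessors(parents, x):
    """x together with all of its ancestors."""
    found, stack = {x}, [x]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def is_cladogen(parents, taxon):
    """True if taxon is a nonempty antichain with a common successor."""
    taxon = set(taxon)
    antichain = all(x == y or x not in predecessors(parents, y)
                    for x in taxon for y in taxon)
    common_successors = {z for z in parents
                         if taxon <= predecessors(parents, z)}
    return bool(taxon) and antichain and bool(common_successors)
```

Under these assumptions {b, c} is a cladogen (its members share the successors d and e), whereas {a, b} fails the antichain condition and {b, f} has no common successor.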

Relatedness may be represented as an undirected graph, G := [U, {{x, y} | x ⊲ y or y ⊲ x}]. If two organisms (vertices) in this graph are connected, then they are in some way related. (Note that all known organisms are considered to be related.)

Definitions of Mathematical Entities

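The following information is given for each mathematical/biological entity defined in this document: its definition URL, symbol, class, formal definition, and implementation, along with an example and a discussion where available.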

Universal Taxon

Definition URL http://namesonnodes.org/ns/math/2009#def-UniversalTaxon
Symbol U
Class Set
Definition

U is the set of all organisms, be they extinct, extant, or yet to be.

Implementation org.namesonnodes.math.entities::Taxon.fromFinestNodes and org.namesonnodes.domain.nodes::NodeGraph.allFinestNodes
Discussion

(See the discussion of universal taxon, above.)

Operationally, Names on Nodes treats the smallest discernible taxa in a given context as individuals. The union of these least inclusive taxa is treated as the operational equivalent of U.

Example
<apply xmlns="http://www.w3.org/1998/Math/MathML">
    <card/>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-UniversalTaxon"/>
</apply>
This evaluates to the total number of organisms. Obviously, this number is actually unknown. Operationally, this expression evaluates to the number of least inclusive taxa being treated in the context, since each of these is treated as an individual.

Maximal

Definition URL http://namesonnodes.org/ns/math/2009#def-Maximal
Symbol max
Class Function
Definition

max : 2^U → 2^U
max(A) := {x ∈ A | for all y ∈ A, x ⊀ y}

Implementation org.namesonnodes.math.operations::Maximal
Example
<apply xmlns="http://www.w3.org/1998/Math/MathML">
	<csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-Maximal"/>
	<csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-UniversalTaxon"/>
</apply>
(This evaluates to the set of all organisms with no descendants.)
Discussion

The maximal members of a set are all members which are not strictly ancestral to any other member.

The concept of "maximal" correlates to what some authors have termed "last", "latest", or "most recent", as in "most recent common ancestor". However, unlike those other terms, "maximal" is not tied to chronology; the maximal members of a set are not necessarily contemporaries.

Other potential synonyms of "maximal" are "final", "terminal", or "leafmost".

The symbol for this function is the same as that of a MathML function, max; however, the MathML function's domain is the power set of real numbers, not the power set of the universal taxon.

Minimal

Definition URL http://namesonnodes.org/ns/math/2009#def-Minimal
Symbol min
Class Function
Definition

min : 2^U → 2^U
min(A) := {x ∈ A | for all y ∈ A, x ⊁ y}

Implementation org.namesonnodes.math.operations::Minimal
Example
<apply xmlns="http://www.w3.org/1998/Math/MathML">
	<csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-Minimal"/>
	<csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-UniversalTaxon"/>
</apply>
(This evaluates to the set of all organisms with no ancestors, i.e., the original organism[s].)
Discussion

The minimal members of a set are all members which are not strictly descended from any other member.

The concept of "minimal" correlates to what some authors have termed "earliest", "first", or "least recent", as in "least recent common ancestor". However, unlike those other terms, "minimal" is not tied to chronology; the minimal members of a set are not necessarily contemporaries.

Other potential synonyms of "minimal" are "initial" or "rootmost".

The symbol for this function is the same as that of a MathML function, min; however, the MathML function's domain is the power set of real numbers, not the power set of the universal taxon.
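Both min and its counterpart, max, can be sketched in Python as follows. This is an illustration under assumptions, not the implementation named above: the phylogeny is invented, parenthood is stored as a dictionary from each organism to the set of its parents, and the function names are hypothetical.

```python
# Hypothetical phylogeny: a begat b and c; b begat d and e; c begat f.
PARENTS = {"a": set(), "b": {"a"}, "c": {"a"},
           "d": {"b"}, "e": {"b"}, "f": {"c"}}

def ancestors(parents, x):
    """All proper predecessors of x."""
    found, stack = set(), [x]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def maximal(parents, taxon):
    """Members which are not ancestral to any other member."""
    taxon = set(taxon)
    return {x for x in taxon
            if not any(x in ancestors(parents, y) for y in taxon)}

def minimal(parents, taxon):
    """Members which are not descended from any other member."""
    taxon = set(taxon)
    return {x for x in taxon if not (ancestors(parents, x) & taxon)}
```

Applied to the whole of this small phylogeny, maximal yields the organisms without descendants (d, e, and f) and minimal yields the single original organism (a).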

Predecessor Union

Definition URL http://namesonnodes.org/ns/math/2009#def-PredecessorUnion
Symbol prc_∪
Class Function
Definition

prc_∪ : 2^U → 2^U

prc_∪(A) := {x ∈ U | for some y ∈ A, x ≼ y}

or

prc_∪(A) := ⋃_{x ∈ A} ≽[x]
Implementation org.namesonnodes.math.operations::PredecessorUnion
Example
<apply xmlns="http://www.w3.org/1998/Math/MathML">
	<csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-PredecessorUnion"/>
	<apply>
		<union/>
		<csymbol definitionURL="urn:isbn:0853010064::Homo+sapiens"/>
		<csymbol definitionURL="urn:isbn:00800694::Pinus+sylvestris"/>
	</apply>
</apply>
(This evaluates to a set including all humans, all Scots pines, all ancestors of all humans, and all ancestors of all Scots pines. This includes shared ancestors as well as unshared ancestors.)
Discussion

The predecessor union of a set of organisms includes all members of that set as well as all ancestors of all members of that set.

The predecessor union of a set is always a superset of the predecessor intersection.

Successor Union

Definition URL http://namesonnodes.org/ns/math/2009#def-SuccessorUnion
Symbol suc_∪
Class Function
Definition

suc_∪ : 2^U → 2^U

suc_∪(A) := {x ∈ U | for some y ∈ A, x ≽ y}

or

suc_∪(A) := ⋃_{x ∈ A} ≼[x]
Implementation org.namesonnodes.math.operations::SuccessorUnion
Example
<apply xmlns="http://www.w3.org/1998/Math/MathML">
	<csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-SuccessorUnion"/>
	<apply>
		<union/>
		<csymbol definitionURL="urn:isbn:0853010064::Equus+ferus"/>
		<csymbol definitionURL="urn:isbn:0853010064::Equus+asinus"/>
	</apply>
</apply>
(This evaluates to a set including all horses [Equus ferus], donkeys [Equus asinus], mules [Equus asinus × ferus] and hinnies [Equus ferus × asinus].)
Discussion

The successor union of a set of organisms includes all members of that set as well as all descendants of all members of that set.

The successor union of a non-empty set is always a superset of its successor intersection.
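The horse and donkey example can be mimicked with a few invented individuals. This is only a sketch under assumed data: the `CHILDREN` table and the function names are hypothetical and stand in for whatever phylogeny is actually loaded.

```python
from typing import Dict, Set

# Hypothetical individuals; each maps to its set of children.
CHILDREN: Dict[str, Set[str]] = {
    "mare": {"mule"},       # female horse
    "stallion": {"hinny"},  # male horse
    "jack": {"mule"},       # male donkey
    "jenny": {"hinny"},     # female donkey
    "mule": set(),
    "hinny": set(),
}


def successors(x: str) -> Set[str]:
    """Return x together with all of its descendants."""
    seen = {x}
    stack = [x]
    while stack:
        for child in CHILDREN[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def suc_union(taxon: Set[str]) -> Set[str]:
    """Successor union: every organism that succeeds or equals some member."""
    result: Set[str] = set()
    for x in taxon:
        result |= successors(x)
    return result


horses = {"mare", "stallion"}
donkeys = {"jack", "jenny"}
print(sorted(suc_union(horses | donkeys)))
# ['hinny', 'jack', 'jenny', 'mare', 'mule', 'stallion']
```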

Predecessor Intersection

Definition URL http://namesonnodes.org/ns/math/2009#def-PredecessorIntersection
Symbol prc∩
Class Function
Definition

prc∩ : 2^U → 2^U

prc∩(A) := {x ∈ U | for all y ∈ A, x ≼ y}

or

prc∩(A) := ⋂_{x ∈ A} ≽[x]
Implementation org.namesonnodes.math.operations::PredecessorIntersection
Discussion

Successor Intersection

Definition URL http://namesonnodes.org/ns/math/2009#def-SuccessorIntersection
Symbol suc∩
Class Function
Definition

suc∩ : 2^U → 2^U

suc∩(A) := {x ∈ U | for all y ∈ A, x ≽ y}

or

suc∩(A) := ⋂_{x ∈ A} ≼[x]
Implementation org.namesonnodes.math.operations::SuccessorIntersection
Discussion

Synapomorphic Predecessors

Definition URL http://namesonnodes.org/ns/math/2009#def-SynapomorphicPredecessors
Symbol synprc
Class Function
Definition

synprc : 2^U × 2^U → 2^U

synprc(M, A) := {x ∈ prc∩(A) | for all y ∈ A, there exists some x–y path, p, in G where for all p_n ∈ p, p_n ∈ M}

Implementation org.namesonnodes.math.operations::SynapomorphicPredecessors
Discussion

Node-Based Cladogen

Definition URL http://namesonnodes.org/ns/math/2009#def-NodeBasedCladogen
Symbol +
Class Function
Definition

+ : 2^U → 2^U

+ := max ∘ prc∩

A + B + ... + Z := (max ∘ prc∩)(A ∪ B ∪ ... ∪ Z)

Implementation org.namesonnodes.math.operations::NodeBasedCladogen
Discussion

The node-based cladogen of a set, A, consists of its maximal common predecessors. This is a similar concept to "most recent common ancestors".

This operation has two forms of notation: 1) as a prefix; and 2) as an infix, which is shorthand for applying the function to a union of sets.

If (and only if) A has no common predecessors, then +(A) = ∅ and A has no node-based cladogen. Since all known organisms are theorized to descend from common ancestors, node-based cladogens exist for all known taxa, in theory.
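The composition of "maximal" with the common predecessors can be sketched in Python as follows. The toy `PARENTS` table (including an unrelated organism "z") and all function names are assumptions made for illustration, and `prc_intersection` is read here as "common predecessors", following the discussion above.

```python
from typing import Dict, Set

# Hypothetical toy phylogeny: each organism maps to its set of parents.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "a": {"root"},
    "b": {"a"},
    "c": {"a"},
    "d": {"b"},
    "e": {"b"},
    "z": set(),  # unrelated to everything else
}


def predecessors(x: str) -> Set[str]:
    """Return x together with all of its ancestors."""
    seen = {x}
    stack = [x]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def prc_intersection(taxon: Set[str]) -> Set[str]:
    """Common predecessors: organisms that precede or equal every member."""
    result = set(PARENTS)  # start from the universal taxon
    for x in taxon:
        result &= predecessors(x)
    return result


def maximal(taxon: Set[str]) -> Set[str]:
    """Members that do not properly precede any other member."""
    return {x for x in taxon
            if not any(x != y and x in predecessors(y) for y in taxon)}


def node_based_cladogen(taxon: Set[str]) -> Set[str]:
    return maximal(prc_intersection(taxon))


print(node_based_cladogen({"d", "e"}))  # {'b'}
print(node_based_cladogen({"c", "d"}))  # {'a'}
print(node_based_cladogen({"d", "z"}))  # set()
```

The last call shows the empty case: "d" and "z" share no predecessors in this toy graph, so no node-based cladogen exists for them.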

Branch-Based Cladogen

Definition URL http://namesonnodes.org/ns/math/2009#def-BranchBasedCladogen
Symbol ←
Class Function
Definition

← : 2^U × 2^U → 2^U

A ← Z := min(prc∩(A) − prc∪(Z))

Implementation org.namesonnodes.math.operations::BranchBasedCladogen
Discussion

Specifying a branch-based cladogen requires two sets, one internal (A) and one external (Z). The exclusive common predecessors of the internal set are all of its common predecessors minus all predecessors of the external set. The branch-based cladogen consists of the minimal exclusive common predecessors of the internal set.

If A has no common predecessors, or all of those common predecessors are also predecessors of Z, then A ← Z = ∅ and there is no branch-based cladogen for [A, Z].
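A minimal Python sketch of this operation, under the assumption that the internal set contributes its common predecessors and the external set contributes all of its predecessors, as the discussion describes. The `PARENTS` table and the function names are hypothetical.

```python
from typing import Dict, Set

# Hypothetical toy phylogeny: each organism maps to its set of parents.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "a": {"root"},
    "b": {"a"},
    "c": {"a"},
    "d": {"b"},
    "e": {"b"},
}


def predecessors(x: str) -> Set[str]:
    """Return x together with all of its ancestors."""
    seen = {x}
    stack = [x]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def prc_union(taxon: Set[str]) -> Set[str]:
    """All predecessors of any member."""
    result: Set[str] = set()
    for x in taxon:
        result |= predecessors(x)
    return result


def prc_intersection(taxon: Set[str]) -> Set[str]:
    """Predecessors shared by every member."""
    result = set(PARENTS)
    for x in taxon:
        result &= predecessors(x)
    return result


def minimal(taxon: Set[str]) -> Set[str]:
    """Members with no proper predecessor inside the same set."""
    return {x for x in taxon
            if not any(x != y and y in predecessors(x) for y in taxon)}


def branch_based_cladogen(internal: Set[str], external: Set[str]) -> Set[str]:
    return minimal(prc_intersection(internal) - prc_union(external))


print(branch_based_cladogen({"d"}, {"c"}))  # {'b'}
print(branch_based_cladogen({"d"}, {"e"}))  # {'d'}
print(branch_based_cladogen({"c"}, {"c"}))  # set()
```

In the first call the exclusive common predecessors are "b" and "d"; "b" precedes "d", so "b" alone is minimal.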

Apomorphy-Based Cladogen

Definition URL http://namesonnodes.org/ns/math/2009#def-ApomorphyBasedCladogen
Symbol in
Class Function
Definition

in : 2^U × 2^U → 2^U

in := min ∘ synprc

M in A := (min ∘ synprc)(M, A)

Implementation org.namesonnodes.math.operations::ApomorphyBasedCladogen
Discussion

Specifying an apomorphy-based cladogen requires two sets, one apomorphic (M) and the other representative (A). Together, these two sets determine the synapomorphic predecessors. The apomorphy-based cladogen consists of the minimal synapomorphic predecessors.

If A ⊈ M, then M in A = ∅ and there is no apomorphy-based cladogen for [M, A]. There is also no apomorphy-based cladogen if at least two members of A are in M due to convergence, i.e., if there are no synapomorphic predecessors.
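The path condition in the definition of synprc can be sketched by walking upward from each representative organism while staying inside the apomorphic set. This is an illustrative reading, not the project's code; `PARENTS`, `apomorphic_line`, and the other names are assumptions.

```python
from typing import Dict, Set

# Hypothetical toy phylogeny: each organism maps to its set of parents.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "a": {"root"},
    "b": {"a"},
    "c": {"a"},
    "d": {"b"},
    "e": {"b"},
}


def predecessors(x: str) -> Set[str]:
    """Return x together with all of its ancestors."""
    seen = {x}
    stack = [x]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def apomorphic_line(y: str, apomorphic: Set[str]) -> Set[str]:
    """Organisms x joined to y by a path lying entirely inside `apomorphic`."""
    if y not in apomorphic:
        return set()
    seen = {y}
    stack = [y]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent in apomorphic and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def synprc(apomorphic: Set[str], taxon: Set[str]) -> Set[str]:
    """Synapomorphic predecessors of `taxon` with respect to `apomorphic`."""
    result = set(PARENTS)
    for y in taxon:
        result &= apomorphic_line(y, apomorphic)
    return result


def minimal(taxon: Set[str]) -> Set[str]:
    """Members with no proper predecessor inside the same set."""
    return {x for x in taxon
            if not any(x != y and y in predecessors(x) for y in taxon)}


def apomorphy_based_cladogen(apomorphic: Set[str], taxon: Set[str]) -> Set[str]:
    return minimal(synprc(apomorphic, taxon))


print(apomorphy_based_cladogen({"b", "d", "e"}, {"d", "e"}))  # {'b'}
print(apomorphy_based_cladogen({"b", "d", "e"}, {"c"}))       # set()
print(apomorphy_based_cladogen({"c", "d"}, {"c", "d"}))       # set()
```

The second call is the case where a representative lacks the apomorphy; the third is convergence, where "c" and "d" both have the state but no common predecessor does.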

Clade

Definition URL http://namesonnodes.org/ns/math/2009#def-Clade
Symbol Clade
Class Function
Definition

Clade : 2^U → 2^U

Clade(A) := suc∪(A), if A = min(A) and suc∩(A) ≠ ∅;
            ∅, otherwise.
Implementation org.namesonnodes.math.operations::Clade
Discussion TBD

Node-Based Clade

Definition URL http://namesonnodes.org/ns/math/2009#def-NodeBasedClade
Symbol Clade+
Class Function
Definition

Clade+ : 2^U → 2^U

Clade+ := Clade ∘ +
        = suc∪ ∘ max ∘ prc∩

Implementation org.namesonnodes.math.operations::NodeBasedClade
Discussion TBD

Branch-Based Clade

Definition URL http://namesonnodes.org/ns/math/2009#def-BranchBasedClade
Symbol Clade←
Class Function
Definition

Clade← : 2^U × 2^U → 2^U

Clade← := Clade ∘ ←

Clade←(A, Z) = suc∪(prc∩(A) − prc∪(Z))

Implementation org.namesonnodes.math.operations::BranchBasedClade
Discussion TBD

Apomorphy-Based Clade

Definition URL http://namesonnodes.org/ns/math/2009#def-ApomorphyBasedClade
Symbol Cladein
Class Function
Definition

Cladein : 2^U × 2^U → 2^U

Cladein := Clade ∘ in
         = suc∪ ∘ synprc

Implementation org.namesonnodes.math.operations::ApomorphyBasedClade
Discussion TBD

Crown Clade

Definition URL http://namesonnodes.org/ns/math/2009#def-CrownClade
Symbol Crown
Class Function
Definition

Crown : 2^U × 2^U → 2^U

Crown(A, E) := Clade+(suc∪(A) ∩ E)

Implementation org.namesonnodes.math.operations::CrownClade
Discussion TBD

Total Clade

Definition URL http://namesonnodes.org/ns/math/2009#def-TotalClade
Symbol Total
Class Function
Definition

Total : 2^U × 2^U → 2^U

Let C = Crown(A, E)

Total(A, E) := Clade(C ← (E − C))

Implementation org.namesonnodes.math.operations::TotalClade
Discussion TBD
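Pending a fuller discussion, the crown and total clade operations can be tried out on a toy graph. The sketch below rests on several assumptions: it uses the expanded forms of the node-based and branch-based clades given above, it reads the total clade as the branch-based clade of the crown relative to the extant organisms outside the crown (which is how the Pan-Tracheophyta prose in Appendix II describes it), and the `PARENTS` table, `EXTANT` set, and function names are all invented.

```python
from typing import Dict, Set

# Hypothetical toy phylogeny: each organism maps to its set of parents.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "out": {"root"},     # extant outgroup
    "stem": {"root"},
    "fossil": {"stem"},  # extinct side branch
    "anc": {"stem"},
    "x1": {"anc"},       # extant
    "x2": {"anc"},       # extant
}
EXTANT = {"out", "x1", "x2"}


def predecessors(x: str) -> Set[str]:
    """Return x together with all of its ancestors."""
    seen = {x}
    stack = [x]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def prc_union(taxon: Set[str]) -> Set[str]:
    return {p for x in taxon for p in predecessors(x)}


def prc_intersection(taxon: Set[str]) -> Set[str]:
    result = set(PARENTS)
    for x in taxon:
        result &= predecessors(x)
    return result


def suc_union(taxon: Set[str]) -> Set[str]:
    """Organisms that succeed or equal some member of `taxon`."""
    return {y for y in PARENTS if predecessors(y) & taxon}


def maximal(taxon: Set[str]) -> Set[str]:
    return {x for x in taxon
            if not any(x != y and x in predecessors(y) for y in taxon)}


def node_based_clade(taxon: Set[str]) -> Set[str]:
    return suc_union(maximal(prc_intersection(taxon)))


def branch_based_clade(internal: Set[str], external: Set[str]) -> Set[str]:
    return suc_union(prc_intersection(internal) - prc_union(external))


def crown(cladogen: Set[str], extant: Set[str]) -> Set[str]:
    return node_based_clade(suc_union(cladogen) & extant)


def total(cladogen: Set[str], extant: Set[str]) -> Set[str]:
    c = crown(cladogen, extant)
    return branch_based_clade(c, extant - c)


print(sorted(crown({"stem"}, EXTANT)))  # ['anc', 'x1', 'x2']
print(sorted(total({"stem"}, EXTANT)))  # ['anc', 'fossil', 'stem', 'x1', 'x2']
```

The crown excludes the stem organism and the fossil side branch, while the total clade reaches back to the first organism not shared with the extant outgroup.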

Appendix I.—Implemented MathML-Content Elements

MathML-Content provides methods for modelling a wide variety of mathematical entities. Since Names on Nodes only deals with logic and set theory, only certain elements have been implemented. The following is a list of all MathML-Content elements that have been implemented in Names on Nodes, with notes as necessary. Other elements may work, but are not guaranteed.

and
apply
ci

The type attribute may be set to "boolean" or "set".

csymbol

The definitionURL attribute must be set to one of the following:

The type attribute may be set to "set".

declare

The type attribute should be set to "boolean" or "set".

emptyset

May be used wherever taxa may be used.

eq
false
implies
intersect
math

The last child element of a math element is interpreted as the intended value. Any previous elements should be declare elements. Other elements are permissible before the last element, but are superfluous.

neq
not
notprsubset
notsubset
or
otherwise
piece
piecewise

The type attribute should be set to "set".

prsubset
setdiff
subset
true
union
xor
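The rule given above for the math element (the last child is the intended value, and earlier children should be declare elements) can be sketched with Python's standard XML parser. This is only an illustration of the rule, not the Names on Nodes parser; the `split_math` and `local_name` helpers and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET

MATHML = "http://www.w3.org/1998/Math/MathML"

# Hypothetical input: one declaration followed by the intended value.
SOURCE = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <declare type="set">
    <ci>A</ci>
    <emptyset/>
  </declare>
  <apply>
    <union/>
    <ci>A</ci>
    <emptyset/>
  </apply>
</math>
"""


def local_name(element: ET.Element) -> str:
    """Strip the namespace part of an element's tag."""
    return element.tag.split("}", 1)[-1]


def split_math(source: str):
    """Return (declarations, value element) for a math element."""
    root = ET.fromstring(source.strip())
    if root.tag != f"{{{MATHML}}}math":
        raise ValueError("expected a MathML math element")
    children = list(root)
    if not children:
        raise ValueError("math element has no children")
    # Children before the last one that are not declare elements are ignored.
    declarations = [c for c in children[:-1] if local_name(c) == "declare"]
    return declarations, children[-1]


declarations, value = split_math(SOURCE)
print(len(declarations), local_name(value))  # 1 apply
```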

Appendix II.—Example Definitions

Taxon Name Dinosauria
Definition Type Node-Based
Authorities Referenced The International Code of Zoological Nomenclature, Fourth Edition (ISBN: 0-85301-006-4)
Prose

All successors of the maximal common predecessors of Megalosaurus bucklandii Mantell 1827, Iguanodon bernissartensis Boulenger in Beneden 1881, and Hylaeosaurus armatus Mantell 1833.

Mathematical Formulae

Dinosauria := Clade+(Megalosaurus bucklandii ∪ Iguanodon bernissartensis ∪ Hylaeosaurus armatus).

Or:

Dinosauria := Clade(Megalosaurus bucklandii + Iguanodon bernissartensis + Hylaeosaurus armatus).

MathML Formulae
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-NodeBasedClade"/>
    <csymbol definitionURL="urn:isbn:0853010064::Megalosaurus+bucklandii"/>
    <csymbol definitionURL="urn:isbn:0853010064::Iguanodon+bernissartensis"/>
    <csymbol definitionURL="urn:isbn:0853010064::Hylaeosaurus+armatus"/>
  </apply>
</math>
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-Clade"/>
    <apply>
      <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-NodeBasedCladogen"/>
      <csymbol definitionURL="urn:isbn:0853010064::Megalosaurus+bucklandii"/>
      <csymbol definitionURL="urn:isbn:0853010064::Iguanodon+bernissartensis"/>
      <csymbol definitionURL="urn:isbn:0853010064::Hylaeosaurus+armatus"/>
    </apply>
  </apply>
</math>
Taxon Name Saurischia
Definition Type Branch-Based
Authorities Referenced The International Code of Zoological Nomenclature, Fourth Edition (ISBN: 0-85301-006-4)
Prose

All successors of the minimal common predecessors of Megalosaurus bucklandii Mantell 1827 exclusive of all predecessors of Iguanodon bernissartensis Boulenger in Beneden 1881.

Mathematical Formulae

Saurischia := Clade←(Megalosaurus bucklandii, Iguanodon bernissartensis).

Or:

Saurischia := Clade(Megalosaurus bucklandii ← Iguanodon bernissartensis).

MathML Formulae
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-BranchBasedClade"/>
    <csymbol definitionURL="urn:isbn:0853010064::Megalosaurus+bucklandii"/>
    <csymbol definitionURL="urn:isbn:0853010064::Iguanodon+bernissartensis"/>
  </apply>
</math>
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-Clade"/>
    <apply>
      <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-BranchBasedCladogen"/>
      <csymbol definitionURL="urn:isbn:0853010064::Megalosaurus+bucklandii"/>
      <csymbol definitionURL="urn:isbn:0853010064::Iguanodon+bernissartensis"/>
    </apply>
  </apply>
</math>
Taxon Name Avialae
Definition Type Apomorphy-Based
Authorities Referenced
  • The International Code of Zoological Nomenclature, Fourth Edition (ISBN: 0-85301-006-4)
  • Gauthier, J. and K. de Queiroz (2001 Dec). Feathered dinosaurs, flying dinosaurs, crown dinosaurs, and the name "Aves". Pages 8–47 in J. Gauthier and L. F. Gall (eds.) New Perspectives on the Origin and Early Evolution of Birds: Proceedings of the International Symposium in Honor of John H. Ostrom 1999 Feb. 12–14. Peabody Mus. Nat. Hist., Yale Univ., New Haven, CT. (ISBN: 0-91253-257-2)
Prose

All successors of the minimal predecessors of Vultur gryphus Linnaeus 1758 to share the synapomorphy of wings used for powered flight (Gauthier and de Queiroz 2001).

Mathematical Formulae

Avialae := Cladein("wings used for powered flight", Vultur gryphus).

Or:

Avialae := Clade("wings used for powered flight" in Vultur gryphus).

MathML Formulae
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-ApomorphyBasedClade"/>
    <csymbol definitionURL="urn:bici:0912532572(200112)%3C7:FDFDCD%3E2.0.TX;2-H::wings+used+for+powered+flight"/>
    <csymbol definitionURL="urn:isbn:0853010064::Vultur+gryphus"/>
  </apply>
</math>
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-Clade"/>
    <apply>
      <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-ApomorphyBasedCladogen"/>
      <csymbol definitionURL="urn:bici:0912532572(200112)%3C7:FDFDCD%3E2.0.TX;2-H::wings+used+for+powered+flight"/>
      <csymbol definitionURL="urn:isbn:0853010064::Vultur+gryphus"/>
    </apply>
  </apply>
</math>
Taxon Name Tracheophyta
Definition Type Branch-Modified Node-Based
Authorities Referenced
  • The International Code of Botanical Nomenclature (Vienna Code) (ISBN: 3-906166-48-1)
  • Cantino P. D., J. A. Doyle, S. W. Graham, W. S. Judd, R. G. Olmstead, D. E. Soltis, P. S. Soltis & M. J. Donoghue (2007 Aug). Towards a phylogenetic nomenclature of Tracheophyta. Taxon 56(3):822–846. (ISSN: 0040-0262)
Prose

All successors of the maximal common predecessors of all extant (Cantino & al. 2007) successors of the minimal predecessors of Zea mays L. 1753 exclusive of all predecessors of Phaeoceros laevis (L.) Prosk. 1951, Marchantia polymorpha L. 1753, and Polytrichum commune Hedw. 1801.

Mathematical Formula

Tracheophyta := Crown(Zea mays ← (Phaeoceros laevis ∪ Marchantia polymorpha ∪ Polytrichum commune), "extant").
MathML Formula
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-CrownClade"/>
    <apply>
      <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-BranchBasedCladogen"/>
      <csymbol definitionURL="urn:isbn:3906166481::Zea+mays"/>
      <apply>
        <union/>
        <csymbol definitionURL="urn:isbn:3906166481::Phaeoceros+laevis"/>
        <csymbol definitionURL="urn:isbn:3906166481::Marchantia+polymorpha"/>
        <csymbol definitionURL="urn:isbn:3906166481::Polytrichum+commune"/>
      </apply>
    </apply>
    <csymbol definitionURL="urn:sici:0040-0262(200708)56:3%3C822:TAPNOT%3E2.0.TX;2-#::extant"/>
  </apply>
</math>
Taxon Name Pan-Tracheophyta
Definition Type Total
Authorities Referenced
  • The International Code of Botanical Nomenclature (Vienna Code) (ISBN: 3-906166-48-1)
  • Cantino P. D., J. A. Doyle, S. W. Graham, W. S. Judd, R. G. Olmstead, D. E. Soltis, P. S. Soltis & M. J. Donoghue (2007 Aug). Towards a phylogenetic nomenclature of Tracheophyta. Taxon 56(3):822–846. (ISSN: 0040-0262)
Prose

All successors of the minimal predecessors of Tracheophyta exclusive of all predecessors of extant (Cantino & al. 2007) non-tracheophytes.

The total clade of Tracheophyta.

Mathematical Formula

Let Tracheophyta := Crown(Zea mays ← (Phaeoceros laevis ∪ Marchantia polymorpha ∪ Polytrichum commune), "extant").

Pan-Tracheophyta := Total(Tracheophyta, "extant").

MathML Formula
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <declare type="set">
    <ci>Tracheophyta</ci>
    <apply>
      <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-CrownClade"/>
      <apply>
        <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-BranchBasedCladogen"/>
        <csymbol definitionURL="urn:isbn:3906166481::Zea+mays"/>
        <apply>
          <union/>
          <csymbol definitionURL="urn:isbn:3906166481::Phaeoceros+laevis"/>
          <csymbol definitionURL="urn:isbn:3906166481::Marchantia+polymorpha"/>
          <csymbol definitionURL="urn:isbn:3906166481::Polytrichum+commune"/>
        </apply>
      </apply>
      <csymbol definitionURL="urn:sici:0040-0262(200708)56:3%3C822:TAPNOT%3E2.0.TX;2-#::extant"/>
    </apply>
  </declare>
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-TotalClade"/>
    <ci>Tracheophyta</ci>
    <csymbol definitionURL="urn:sici:0040-0262(200708)56:3%3C822:TAPNOT%3E2.0.TX;2-#::extant"/>
  </apply>
</math>