Names on Nodes: MathML Definitions (Draft)
1 P.O. Box 292304 Los Angeles, CA, USA 90027; keesey@gmail.com
First Published Online 2009 July 29
Last Updated 2009 September 13
This Is A Draft
This document is not yet complete. It is being posted online to solicit comments before being finalized. Please contact Mike Keesey if you wish to make comments.
Abstract
Phylogenetic nomenclature is a type of biological nomenclature which ties taxonomic names to taxa using algorithms which rely on phylogeny (i.e., patterns of ancestry and descent). Unlike earlier forms of biological nomenclature (e.g., Linnaean, rank-based), the application of a name to a taxon is unambiguous under an appropriate phylogenetic hypothesis. Phylogeny may be modeled as directed, acyclic graphs, with organisms (or sufficiently small taxa) as vertices and parent-child relations as directed edges connecting them. Because phylogeny can be modeled mathematically, phylogenetic definitions may be expressed as mathematical formulae. Mathematical formulae, in turn, may be expressed using an extensible computer language called MathML. Expressing phylogenetic definitions in MathML requires the definition of some additional entities. Here I review the relevant mathematical and biological concepts, terms, and notations, and provide an overview of MathML. I correlate these concepts to each other and, finally, define the entities needed to express phylogenetic definitions in MathML.
Introduction
A General Review of Mathematical Concepts, Terms, and Notation
Some terminology and notation vary across different contexts. Where possible, I have followed MathML's terminology and default notation. Some exceptions have been made for certain logical symbols which are more easily read as words than as symbols, e.g., "and" instead of ∧, "for all" instead of ∀, etc.
The symbol := means "is defined as".
Collections
A collection is an entity which consists of zero or more distinct objects. Objects in a collection are members of the collection. A collection with one member is a singleton.
Sets
A set is an unordered collection. When an object, x, is a member of a set, S, this is denoted x ∈ S. Sets may themselves be members of other sets. Sets may be denoted in the following ways:
- Extensionally, as a list of objects: {x, y, z}
- Intensionally, with a rule that determines membership: {x | x > 1}, {x | x exhibits a cellular nucleus}
- Using a defined symbol, name, or phrase: ∅, U, Mammalia, YPM-VP 1450
The empty set is the set which includes no members, denoted as ∅.
The symbol ℝ indicates the set of all real numbers. The nonnegative real numbers are indicated as ℝ₀⁺.
The cardinality of a set is the number of members in that set, denoted with enclosing vertical bars: |S|.
Examples. |∅| = 0. |{1, 2, 3}| = 3.
If all members of a set, A, are members of a set, B, then A is a subset of B, denoted A ⊆ B. B is a superset of A. Note that all sets are subsets and supersets of themselves. If A ⊆ B and A ≠ B, then A is a proper subset of B, denoted A ⊂ B, and B is a proper superset of A. Note that ∅ is a subset of all sets.
The operations of union, intersection, and difference may be applied to sets:
- Union. A ∪ B := {x | x ∈ A or x ∈ B}
- Intersection. A ∩ B := {x | x ∈ A and x ∈ B}
- Difference. A − B := {x ∈ A | x ∉ B}
Examples. If A = {1, 2} and B = {2, 3}, then A ∪ B = {1, 2, 3}, A ∩ B = {2}, and A − B = {1}.
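The operations above map directly onto Python's built-in set type. The following is a minimal sketch using the sets A and B from the example; the variable names are chosen here for illustration only.

```python
# Sets A and B from the example above.
A = {1, 2}
B = {2, 3}

union = A | B          # {x | x ∈ A or x ∈ B}
intersection = A & B   # {x | x ∈ A and x ∈ B}
difference = A - B     # {x ∈ A | x ∉ B}

cardinality = len(union)           # |A ∪ B|
is_subset = intersection <= A      # A ∩ B ⊆ A
is_proper_subset = set() < A       # ∅ ⊂ A

print(sorted(union), sorted(intersection), sorted(difference))
print(cardinality, is_subset, is_proper_subset)
```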
A partition of a set, S, is a set of subsets of S, so that no sets in the partition overlap and all members of S are members of some set in the partition. A partition, P₁, is a refinement of another partition, P₂, if every member of P₁ is a subset of some member of P₂. P₁ is finer than P₂, and P₂ is coarser than P₁. This is written P₁ ≤ P₂.
Example. If S = {1, 2, 3}, then the partitions of S are {∅, S}, {∅, {1}, {2, 3}}, {∅, {1, 2}, {3}}, {∅, {1, 3}, {2}}, and {∅, {1}, {2}, {3}}. The partition {∅, {1}, {2}, {3}} is a refinement of {∅, {1}, {2, 3}}, which is a refinement of {∅, S}.
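The refinement test can be sketched in Python as follows. This is only an illustration: the function name `is_refinement` is hypothetical, and partitions are represented as Python sets of frozensets, following the convention of the example above (each partition includes ∅).

```python
def is_refinement(p1, p2):
    """Return True if every member of p1 is a subset of some member of p2."""
    return all(any(a <= b for b in p2) for a in p1)

S = frozenset({1, 2, 3})
EMPTY = frozenset()

# Three of the partitions of S listed in the example above.
coarsest = {EMPTY, S}
middle = {EMPTY, frozenset({1}), frozenset({2, 3})}
finest = {EMPTY, frozenset({1}), frozenset({2}), frozenset({3})}

print(is_refinement(finest, middle))    # finest ≤ middle
print(is_refinement(middle, coarsest))  # middle ≤ coarsest
print(is_refinement(coarsest, middle))  # the reverse does not hold
```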
The set consisting of all (relevant) objects is called the universal set. The difference of the universal set and a set, S, is the complement of S, denoted S̅.
The power set of a set, S, is the set of all subsets of S, denoted 2^S.
Example. If S = {1, 2}, then 2^S = {∅, {1}, {2}, {1, 2}}.
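A power set can be enumerated with the standard library. The sketch below is one possible way to do so; the function name `power_set` is an assumption made for this illustration, and subsets are returned as frozensets so that they can themselves be members of a set.

```python
from itertools import chain, combinations

def power_set(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return {frozenset(subset) for subset in subsets}

result = power_set({1, 2})
print(len(result))  # the cardinality of a power set is 2 to the power |S|
```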
Lists
A list is an ordered collection of elements, denoted as a series of elements within brackets, e.g., [x, y, z]. Unlike sets, lists may have the same member multiple times, e.g., [x, x, y]. A list with two members is an ordered pair. A list with three members is an ordered triple. A list with n members is an n-tuple. The nth member of a list, p, is denoted pₙ.
Example. If p = [x, y, z], then p₁ = x, p₂ = y, and p₃ = z. The list, p, is an ordered triple (3-tuple).
A Cartesian product of two sets, A and B, is the set of all ordered pairs wherein the first element is a member of A and the second element is a member of B. This product is denoted as A × B. The product A × A may also be denoted as A². Cartesian products may be generalized to cover any whole number, n, of sets, in which case the members of the product are n-tuples.
Example. {1, 2} × {3, 4} = {[1, 3], [1, 4], [2, 3], [2, 4]}.
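As a brief illustration, the Cartesian product in the example can be computed with `itertools.product`. Ordered pairs are represented here as Python tuples, which is a choice made for this sketch rather than part of the notation above.

```python
from itertools import product

A = {1, 2}
B = {3, 4}

# A × B: every ordered pair with first element from A and second from B.
pairs = set(product(A, B))
print(sorted(pairs))

# A² is the product of A with itself.
squares = set(product(A, repeat=2))
print(sorted(squares))

# The text numbers list members from 1, whereas Python indexes from 0.
p = ["x", "y", "z"]
first_member = p[0]  # the member written p₁ in the text
```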
Relations
A relation is a set of ordered pairs. If [x, y] is a pair in the relation R, then this is denoted as x R y. The first element in such a pair is the predecessor, and the second element is the successor. If x R y and x ≠ y, then x is a proper predecessor of y and y is a proper successor of x. If x R z and there is no other element, y, such that x R y and y R z, then x is an immediate predecessor of z and z is an immediate successor of x.
The expression R[x] denotes the set {y | x R y}. For example, >[0] indicates the set of all negative numbers.
A partial order is a relation with the properties of reflexivity, antisymmetry, and transitivity:
- Reflexivity. For all x, x R x.
- Antisymmetry. For all x, y, if x R y and y R x, then x = y.
- Transitivity. For all x, y, z, if x R y and y R z, then x R z.
The transitive closure of a relation, R, is the smallest (i.e., least inclusive) transitive relation that includes R.
If R is a partial order and x R y or y R x, then x and y are comparable. If all elements in a set can be compared to each other, then the set is a chain. If no two different elements in a set are comparable, then it is an antichain.
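The properties above can be checked mechanically when a relation is small enough to store as a set of ordered pairs. The following sketch uses divisibility on {1, 2, 3, 6} as a sample partial order; all function names are hypothetical, and the closure routine is a simple fixed-point iteration rather than an optimized algorithm.

```python
def is_reflexive(relation, elements):
    return all((x, x) in relation for x in elements)

def is_antisymmetric(relation):
    return all(x == y for (x, y) in relation if (y, x) in relation)

def is_transitive(relation):
    return all((x, z) in relation
               for (x, y) in relation for (w, z) in relation if y == w)

def transitive_closure(relation):
    """Add implied pairs until no new pair appears."""
    closure = set(relation)
    while True:
        implied = {(x, z) for (x, y) in closure for (w, z) in closure if y == w}
        if implied <= closure:
            return closure
        closure |= implied

def comparable(relation, x, y):
    return (x, y) in relation or (y, x) in relation

def is_chain(relation, subset):
    return all(comparable(relation, x, y) for x in subset for y in subset)

def is_antichain(relation, subset):
    return all(x == y or not comparable(relation, x, y)
               for x in subset for y in subset)

ELEMENTS = {1, 2, 3, 6}
# x R y if and only if x divides y.
DIVIDES = {(x, y) for x in ELEMENTS for y in ELEMENTS if y % x == 0}

print(is_reflexive(DIVIDES, ELEMENTS), is_antisymmetric(DIVIDES), is_transitive(DIVIDES))
print(is_chain(DIVIDES, {1, 2, 6}), is_antichain(DIVIDES, {2, 3}))
```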
Graphs
A graph is an entity containing a set of objects, called vertices, and connections of the vertices, called edges. A graph may be defined as a type of ordered pair, in which the first element is the vertex set and the second element is the edge set: [V, E]. In an undirected graph, each edge is a set of two vertices, indicating that those vertices are connected, or incident. In a directed graph, or digraph, each edge is an ordered pair of vertices, indicating that the first element, or head, connects to the second element, or tail. Edges in directed graphs may also be called arcs.
A walk in a graph is a list of vertices in which each vertex is incident to the next vertex in the list. A path is a walk in a directed graph wherein some arc in the graph points from each vertex to the next vertex in the list. A cycle is a path which begins and ends with the same vertex. A directed graph is said to be acyclic if there are no cycles in it.
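Whether a directed graph is acyclic can be tested by repeatedly removing vertices that have no incoming arcs (Kahn's algorithm). The sketch below is an illustration under the assumption that arcs are stored as pairs whose first element points to the second; the function name `is_acyclic` is hypothetical.

```python
def is_acyclic(vertices, arcs):
    """Return True if the digraph [vertices, arcs] contains no cycle."""
    incoming = {v: 0 for v in vertices}
    outgoing = {v: [] for v in vertices}
    for first, second in arcs:
        outgoing[first].append(second)
        incoming[second] += 1

    # Start with the vertices that no arc points to.
    ready = [v for v in vertices if incoming[v] == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for w in outgoing[v]:
            incoming[w] -= 1
            if incoming[w] == 0:
                ready.append(w)

    # Any vertex that could not be removed lies on, or behind, a cycle.
    return removed == len(vertices)

print(is_acyclic({"a", "b", "c"}, {("a", "b"), ("b", "c")}))
print(is_acyclic({"a", "b", "c"}, {("a", "b"), ("b", "c"), ("c", "a")}))
```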
Functions
A function maps an element, called an argument, to a value. Formally, a function may be defined as an ordered triple of sets: f := [X, Y, F]. The final set, F, is a set of ordered pairs, wherein the first element is the argument and the second element is the value. There may be only one ordered pair per argument. If an ordered pair, [x, y], is a member of F, this may be denoted as f(x) = y. If the argument is a list, instead of f([a₁, a₂, … aₙ]), it is customary to simply write f(a₁, a₂, … aₙ). Sometimes this may be written a₁ f a₂ f … aₙ (infix notation).
The set including all of a function's arguments is the domain. All values of the function are within the codomain. The set of all values is the image, which is a subset of the codomain. If a function, f, has domain X and codomain Y, this is denoted as f: X → Y.
The composite of two functions, f and g, is a function which uses the value of g as an argument for f. Composition is written f ∘ g, so that (f ∘ g)(x) = f(g(x)). Note that the codomain of g must be a subset of the domain of f. If g : X → Y and f : Y → Z, then (f ∘ g) : X → Z.
A metric on a set, X, is a function with X² as its domain and the set of nonnegative real numbers as its codomain. It defines a metric distance between any two members of X. The ε-ball of x is the set of all elements less than a certain distance, ε, from x.
A General Review of Biological Concepts and Terms
Organisms
An organism is an individual living entity. Although in many cases it is clear what constitutes an individual, there are notable difficult cases. However, although the mathematical system used by Names on Nodes assumes that there are discrete, individual entities, the implemented algorithms need never deal with them directly. Thus there is no present need to clarify the term.
Taxonomy
A nonempty set of organisms is a taxon (plural: taxa). A taxon whose members are all within another taxon is a subtaxon of that other taxon. A taxon which includes all members of another taxon is a supertaxon of that taxon. The most inclusive taxon is the universal taxon, which includes all organisms. A taxonomy is a system for dividing a taxon into subtaxa.
A taxonomic name is a word or phrase which signifies a taxon. A nomenclatural code is a code of rules which governs taxonomic names.
Specimens
In addition to taxonomic names, taxa may also be referenced using specimens. A specimen is an object which has been catalogued as part of a specimen collection. A specimen collection is often indicated by an abbreviation of its name, specified within the context, e.g., "Yale Peabody Museum: Vertebrate Paleontology Collection" may be abbreviated as "YPM-VP". A specimen within a collection may be indicated by the collection's name or abbreviation followed by an identifier that is unique within the collection, e.g., YPM-VP 1450. A specimen may have multiple identifiers if it has been transferred from one collection to another. For example, AMNH 973 and CM 9380 are the same specimen. A specimen may represent no organisms (e.g., a mineralogical specimen), one organism (e.g., a fossil skeleton), or multiple organisms (e.g., a microbe slide).
Character States
Taxa may also be defined intensionally using a description of a necessary characteristic, that is, a character state. Organisms exhibiting the state are part of the taxon. Valid states must be discrete and absolute, that is, organisms cannot partially exhibit them.
Examples. "Cellular nucleus present" is a valid state, assuming that "cellular nucleus" has been defined in such a way that it is either present or not, never partially present. "Large leaf size" is not a valid state, since it is relative, not absolute.
Taxa may be defined using a collection of character states. If all states are required for membership, the taxon is monothetic. If at least one state is required, the taxon is polythetic.
Taxonomic names are not usually defined according to character states, but the taxa that the names signify may be diagnosed by character states. For example, the taxon referred to as "Eukaryota" is diagnosed by the presence of cellular nuclei, but "Eukaryota" is not necessarily defined by that character state.
Rank-Based Nomenclature
Taxonomic names may be loosely defined using rank-based definitions. A taxon is rank-defined by specifying one or more type specimens or a type subtaxon (generally, a type), which must be included in the taxon by definition, and a rank, which indicates relative inclusivity of the taxon. Commonly used ranks are, from least to most inclusive, species, genus, family, order, class, phylum (zoology) or division (other disciplines), and kingdom. Many others exist as well.
Note that rank-based definitions do not dictate any criteria for membership, apart from the requirement that the type (specimens or subtaxon) must represent included organisms.
If a taxonomic name, "X", is defined as having a taxon with name "Y" as its type, and "Y" is defined as having specimen Z as its type, then Z may be called the finest type of "X".
A rank-based code is a nomenclatural code which governs rank-based definitions. Currently there are four in effect:
| Name of Rank-Based Code | Abbreviation | Organisms Covered |
|---|---|---|
| International Code of Botanical Nomenclature | ICBN | plants, fungi, some other eukaryotes |
| International Code of Zoological Nomenclature | ICZN | animals, some other eukaryotes |
| International Code of Nomenclature of Prokaryotes | ICNP | prokaryotes |
| International Code of Nomenclature for Cultivated Plants | ICNCP | cultivated plants |
Examples. Under ICZN rules, the name "Tyrannosaurus rex" refers to a taxon of the species rank typified by CM 9380 (formerly AMNH 973). Therefore, Tyrannosaurus rex must include the organism represented by that specimen. The name "Tyrannosaurus" refers to a taxon of genus rank typified by Tyrannosaurus rex, so it must be a supertaxon of that species (and, by extension, it must include the organism represented by CM 9380). The name "Tyrannosauridae" refers to a taxon of family rank typified by Tyrannosaurus, so it must be a supertaxon of Tyrannosaurus.
Phylogeny
Every organism has one or more ancestors and/or one or more descendants. An immediate ancestor is a parent, and an immediate descendant is a child. The pattern of ancestry and descent among organisms is phylogeny.
Although a fuller correlation will be made further on, I note here that the biological term "ancestor" correlates to the mathematical term "proper predecessor", and the biological term "descendant" correlates to the mathematical term "proper successor". Therefore, we may say that a predecessor of an organism is any ancestor of that organism, or that organism itself. Conversely, a successor of an organism is any descendant of that organism, or that organism itself. I also note that the terms "maximal" and "minimal" may be applied to organisms with regard to taxa. The minimal members of a taxon are those which are not descended from any other member. The maximal members of a taxon are those which are not ancestral to any other member.
The common predecessors of a taxon are all organisms which are predecessors of all members of that taxon. The common successors of a taxon are all organisms which are successors of all members of that taxon.
The exclusive predecessors of a taxon, A, with regard to another taxon, Z, are all common predecessors of A except any which are predecessors of any member of Z. A may be termed the internal taxon and Z may be termed the external taxon.
The apomorphic predecessors of a taxon, A, with regard to another (generally character-based) taxon, M, are all common predecessors of A which are also members of M. A may be termed the representative taxon and M may be termed the apomorphic taxon.
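The three kinds of predecessors defined above can be sketched in Python on a small, invented phylogeny. Everything in this sketch is hypothetical: the organisms "a" through "f", the `PARENTS` table, and the function names are assumptions made for illustration, not part of Names on Nodes.

```python
# Hypothetical toy phylogeny: each organism maps to the set of its parents.
PARENTS = {
    "a": set(),
    "b": {"a"},
    "c": {"a"},
    "d": {"b"},
    "e": {"b"},
    "f": {"c"},
}

def predecessors(x):
    """Return x together with all of its ancestors."""
    found = {x}
    stack = [x]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def common_predecessors(taxon):
    """Organisms that are predecessors of every member of the (nonempty) taxon."""
    return set.intersection(*(predecessors(x) for x in taxon))

def exclusive_predecessors(internal, external):
    """Common predecessors of `internal` that precede no member of `external`."""
    excluded = set().union(*(predecessors(z) for z in external))
    return common_predecessors(internal) - excluded

def apomorphic_predecessors(representative, apomorphic):
    """Common predecessors of `representative` that are members of `apomorphic`."""
    return common_predecessors(representative) & apomorphic

print(sorted(common_predecessors({"d", "e"})))
print(sorted(exclusive_predecessors({"d"}, {"f"})))
print(sorted(apomorphic_predecessors({"d", "e"}, {"b", "d", "e"})))
```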
Lineages
A lineage is a sequence of organisms wherein each organism is followed by one of its children.
A synapomorphic predecessor of a taxon, A, with regard to taxon M, is an apomorphic predecessor for which there is at least one lineage for every member of A satisfying the following conditions:
- The first member of the lineage (i.e., the ancestor of all other members) is the apomorphic predecessor.
- The last member of the lineage (i.e., the descendant of all other members) is the member of A.
- All members of the lineage are members of M.
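The three conditions above amount to a reachability test restricted to members of M. The sketch below illustrates this on an invented phylogeny in which a character state is present in "a", absent in "b" and "c", and present again in "d" and "e"; the data and function names are hypothetical.

```python
# Hypothetical toy phylogeny: each organism maps to the set of its children.
CHILDREN = {
    "a": {"b", "c"},
    "b": {"d"},
    "c": {"e"},
    "d": set(),
    "e": set(),
}

def has_lineage_within(start, end, allowed):
    """True if some lineage from start to end stays entirely inside `allowed`."""
    if start not in allowed or end not in allowed:
        return False
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        if current == end:
            return True
        for child in CHILDREN[current] & allowed:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return False

def synapomorphic_predecessors(representative, apomorphic):
    """Members of `apomorphic` linked to every representative through `apomorphic`."""
    return {
        x for x in apomorphic
        if all(has_lineage_within(x, y, apomorphic) for y in representative)
    }

A = {"d", "e"}
# The state was lost in "b" and "c", so "a" is apomorphic but not synapomorphic.
print(synapomorphic_predecessors(A, {"a", "d", "e"}))
# With the state present throughout, "a" qualifies.
print(synapomorphic_predecessors(A, {"a", "b", "c", "d", "e"}))
```

Iterating over members of M alone is sufficient here, because any organism with a lineage to every member of A is necessarily a common predecessor of A.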
Cladogens
A taxon which fulfills the following requirements is here termed a cladogen (new term; previously "cladogenetic set" in Keesey [2007]):
- No member of a cladogen can be ancestral to any other member.
- There must be at least one organism which is a common successor of all members of the cladogen.
All singleton taxa are cladogens, but cladogens may have larger numbers of members as well.
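The two requirements can be checked directly, as in the following sketch. The phylogeny is invented for illustration: "m" and "f" are the two parents of "k", and "n" is an organism with no relatives in the data set. The function names are likewise hypothetical.

```python
# Hypothetical toy phylogeny: each organism maps to the set of its parents.
PARENTS = {
    "m": set(),
    "f": set(),
    "n": set(),
    "k": {"m", "f"},
    "g": {"k"},
}

def predecessors(x):
    """Return x together with all of its ancestors."""
    found = {x}
    stack = [x]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def is_cladogen(taxon):
    """Check both cladogen requirements for a nonempty set of organisms."""
    # Requirement 1: no member is ancestral to any other member.
    for x in taxon:
        for y in taxon:
            if x != y and x in predecessors(y):
                return False
    # Requirement 2: some organism is a successor of every member.
    return any(taxon <= predecessors(z) for z in PARENTS)

print(is_cladogen({"m", "f"}))  # both are parents of "k"
print(is_cladogen({"m", "k"}))  # "m" is ancestral to "k"
print(is_cladogen({"m", "n"}))  # no common successor
```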
A node-based cladogen consists of the maximal common predecessors of a taxon.
A branch-based cladogen consists of the minimal exclusive predecessors of an internal taxon with regard to an external taxon.
An apomorphy-based cladogen consists of the minimal synapomorphic predecessors of a representative taxon with regard to an apomorphic taxon.
Clades
A clade is a taxon including all members of a cladogen and all descendants of all of those members. Clades are monophyletic; in fact, "clade" is a synonym of "monophyletic taxon".
A node-based clade consists of a node-based cladogen and all descendants of all of its members. A branch-based clade consists of a branch-based cladogen and all descendants of all of its members. An apomorphy-based clade consists of an apomorphy-based cladogen and all descendants of all of its members.
Non-Clades
If a taxon's minimal members form a cladogen, but the taxon does not include all descendants of that cladogen, then it is paraphyletic. (Note that cladogens themselves are paraphyletic, with the exception of singleton cladogens wherein the single member has no descendants. For such singleton cladogens, the clade and cladogen are identical.)
If the maximal common predecessors (i.e., the members of the node-based cladogen) of a taxon are not included within that taxon, then that taxon is polyphyletic.
Phylogeny-Based Nomenclature
A taxon may be strictly defined according to phylogeny, using a phylogeny-based definition. Most commonly, the taxa defined are clades, but other types of taxa may also be phylogenetically defined.
A phylogeny-based code is one which governs phylogeny-based definitions. Currently there are no such codes in effect, but there is a draft of one called the International Code of Phylogenetic Nomenclature (or the PhyloCode, for short). This code is intended to go into effect in the next few years, exist alongside the rank-based codes, and govern clade names for all types of organisms.
A General Review of MathML and Its Foundational Technologies
Strings
A string is a sequence of characters. Strings which are meant to be interpreted by a computer are referred to as code. Literal strings are referred to as text. A string which identifies an object is a name.
URIs
A Uniform Resource Identifier, or URI, is a string identifying a resource on the Internet.
One of the most common types of URI is the Uniform Resource Locator, or URL, which specifies an address and a mechanism for retrieval.
For example, a URL identifying this document is http://namesonnodes.org/ns/math/2009/ (http is the retrieval mechanism, i.e., Hypertext Transfer Protocol, and namesonnodes.org/ns/math/2009 is the address).
Another type of URI is the Uniform Resource Name, or URN, which functions as a location-independent name. Many types of identification can be expressed as URNs. For example:
- International Standard Book Numbers. urn:isbn:00800694
- Digital Object Identifiers. urn:doi:10.1080/10635150500431221
- Life Science Identifiers. urn:lsid:ubio.org:namebank:109086
- SHA-1 keys formed from raw data. urn:sha1:7ba04f9b4289bf102e17854388108f9f6553ce5b (Note: this usage is strictly informal, but widespread.)
A URN resolver translates URNs (i.e., names) into URLs (i.e., locations).
For more on URIs, see these official specifications:
Namespaces
Generally, a namespace is a set of names, called local names, each of which has a single meaning in the context of the namespace.
Namespaces are commonly identified using URIs, which then function as namespace identifiers.
For example, this document is associated with the Names on Nodes mathematical namespace, which may be identified using the URI http://namesonnodes.org/ns/math/2009.
In some contexts, a shorter identifier may be equated with a URI.
Note that taxonomic publications (including nomenclatural codes) and specimen collections may be considered types of namespaces, wherein taxonomic names and specimen identifiers, respectively, function as local names.
A qualified name is an expression joining a namespace identifier with a local name.
Different computer languages have different methods of joining these.
One convention is to use one or two colons (":").
For example, the qualified name urn:isbn:00800694::Pinus may be used informally to refer to the entity identified as "Pinus" by the International Code of Botanical Nomenclature (ISBN 0080-0694).
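A qualified name of this form can be separated into its two parts by splitting at the last double colon. The helper below is a hypothetical sketch that assumes the two-colon convention used in the example; it is not taken from the Names on Nodes code base.

```python
def split_qualified_name(qualified_name):
    """Split 'namespace::local name' at the last double colon."""
    namespace, separator, local_name = qualified_name.rpartition("::")
    if not separator:
        raise ValueError("no '::' separator in " + repr(qualified_name))
    return namespace, local_name

namespace, local_name = split_qualified_name("urn:isbn:00800694::Pinus")
print(namespace)
print(local_name)
```

Splitting at the last occurrence keeps the single colons inside the URN intact.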
XML
Extensible Markup Language, or XML for short, is a specification for creating markup languages.
Text in XML may be surrounded with tags: an opening tag, of the form <tagName>, and a closing tag, of the form </tagName>, where tagName is the name of the tag.
For example, in the XML expression <sentence>Hello, world!</sentence>, the text "Hello, world!" has been marked up by sentence tags.
XML tags may also include nested tags, for example: <sentence><word>Hello</word>, <word>world</word>!</sentence>.
The entire structure consisting of an opening tag, content, and a closing tag is an element.
Any elements within an element's content are child elements.
An element with no content, i.e., an empty element, may be written as a self-closing tag: <tagName/>.
Both opening and self-closing XML tags may be augmented with attributes, each of which pairs a name to a value: <tagName attrName="attrValue"/>.
An XML tag may have any number of attributes, as long as they all have different names.
Tag and attribute names may be qualified names. Consider the following XML code:
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:m="http://www.w3.org/1998/Math/MathML">
  <head>
    <title>XML Namespaces Example</title>
  </head>
  <body>
    <div>This is XHTML.</div>
    <div>The following is MathML:</div>
    <m:math>
      <m:apply>
        <m:sin/>
        <m:ci>x</m:ci>
      </m:apply>
    </m:math>
  </body>
</html>
In this example, the default namespace is identified by http://www.w3.org/1999/xhtml (which identifies the XHTML namespace), and a namespace identifier, m, is synonymized with http://www.w3.org/1998/Math/MathML (which identifies the MathML namespace).
Therefore, if a tag or attribute's name is unqualified, then it is interpreted as an XHTML name.
If a tag or attribute's name is qualified by the prefix "m:", then it is interpreted as a MathML name.
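The way a parser resolves these names can be observed with Python's standard `xml.etree.ElementTree` module, which expands every element name to the form `{namespace URI}local name`. The sketch below uses an abbreviated version of the document above; the prefix "h" in the lookup table is an arbitrary choice for this illustration.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:m="http://www.w3.org/1998/Math/MathML">
  <body>
    <div>This is XHTML.</div>
    <m:math><m:apply><m:sin/><m:ci>x</m:ci></m:apply></m:math>
  </body>
</html>"""

root = ET.fromstring(DOCUMENT)

# Unqualified names fall into the default (XHTML) namespace.
print(root.tag)

# Prefixes used for searching are local to this code, not to the document.
NAMESPACES = {
    "h": "http://www.w3.org/1999/xhtml",
    "m": "http://www.w3.org/1998/Math/MathML",
}
div = root.find("h:body/h:div", NAMESPACES)
sin = root.find(".//m:sin", NAMESPACES)
print(div.text)
print(sin.tag)
```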
For more on XML and XML namespaces, see these official specifications:
MathML
Mathematical Markup Language, or MathML, is an XML language for expressing mathematical concepts. Elements in MathML are divided into two major groups: MathML-Presentation, which contains information on how to render expressions visually, and MathML-Content, which models mathematical entities. Names on Nodes uses a relevant subset of MathML-Content.
An important element in MathML is the apply element.
This indicates that the first child element is to be interpreted as an operation (i.e., a function, relation, etc.), and the subsequent child elements are to be used as arguments.
Example.
The MathML element <apply xmlns="http://www.w3.org/1998/Math/MathML"><sin/><cn>0</cn></apply> indicates that the sine function (sin) is to be applied to the constant number, 0 (zero).
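To illustrate how an apply element might be interpreted, the following sketch evaluates a very small subset of MathML-Content (only the cn, sin, and cos elements). It is a toy evaluator written for this example, not the approach used by Names on Nodes; the names `OPERATIONS` and `evaluate` are hypothetical.

```python
import math
import xml.etree.ElementTree as ET

MATHML = "{http://www.w3.org/1998/Math/MathML}"

# Supported operations, keyed by expanded element name.
OPERATIONS = {MATHML + "sin": math.sin, MATHML + "cos": math.cos}

def evaluate(element):
    """Evaluate a cn element or an apply element whose operator is supported."""
    if element.tag == MATHML + "cn":
        return float(element.text)
    if element.tag == MATHML + "apply":
        # The first child is the operation; the rest are its arguments.
        operator, *operands = list(element)
        function = OPERATIONS[operator.tag]
        return function(*(evaluate(operand) for operand in operands))
    raise ValueError("unsupported element: " + element.tag)

expression = ET.fromstring(
    '<apply xmlns="http://www.w3.org/1998/Math/MathML"><sin/><cn>0</cn></apply>'
)
print(evaluate(expression))  # the sine of zero
```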
Another important element is the csymbol element, which allows the creation of custom-defined mathematical entities.
This is commonly achieved through use of the csymbol element's definitionURL attribute.
In Names on Nodes, the value of the definitionURL attribute may be:
- A qualified name identifying a taxonomic name, e.g., "urn:isbn:00800694::Pinus". If the taxonomic name's definition has a type, the symbol is interpreted as indicating the finest type of the name. Otherwise, it is interpreted as indicating the corresponding taxon.
- A URL identifying a definition in this document, e.g., "http://namesonnodes.org/ns/math/2009#def-UniversalTaxon".
Example. The following MathML element indicates a node-based clade consisting of all successors of the maximal common predecessors of the types of botanical species Lycopodium clavatum, Huperzia selago, Isoëtes lacustris, and Selaginella apoda. (This is Cantino et al.'s [2007] definition of the clade name "Lycopodiophyta".)
<apply xmlns="http://www.w3.org/1998/Math/MathML">
<csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-NodeBasedClade"/>
<apply>
<union/>
<csymbol definitionURL="urn:isbn:00800694::Lycopodium+clavatum"/>
<csymbol definitionURL="urn:isbn:00800694::Huperzia+selago"/>
<csymbol definitionURL="urn:isbn:00800694::Isoetes+lacustris"/>
<csymbol definitionURL="urn:isbn:00800694::Selaginella+apoda"/>
</apply>
</apply>
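As a rough illustration of how such a definition could be read by software, the sketch below parses the element above with Python's standard library and lists the operation and its specifiers. Treating "+" in a local name as a space is an assumption made for this example, as are all variable names.

```python
import xml.etree.ElementTree as ET

MATHML = "{http://www.w3.org/1998/Math/MathML}"

DEFINITION = """<apply xmlns="http://www.w3.org/1998/Math/MathML">
  <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-NodeBasedClade"/>
  <apply>
    <union/>
    <csymbol definitionURL="urn:isbn:00800694::Lycopodium+clavatum"/>
    <csymbol definitionURL="urn:isbn:00800694::Huperzia+selago"/>
    <csymbol definitionURL="urn:isbn:00800694::Isoetes+lacustris"/>
    <csymbol definitionURL="urn:isbn:00800694::Selaginella+apoda"/>
  </apply>
</apply>"""

root = ET.fromstring(DEFINITION)

# The outer apply has two children: the operation and its single argument.
operator, argument = list(root)
operation_url = operator.get("definitionURL")

# The argument is a union of taxa, each identified by a qualified name.
specifiers = [
    child.get("definitionURL").rpartition("::")[2].replace("+", " ")
    for child in argument.findall(MATHML + "csymbol")
]

print(operation_url)
print(specifiers)
```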
On the Correlation of Biological and Mathematical Terms
Taxa and Sets
As mentioned, taxa are a type of set. Thus, the operations defined for sets may be employed with taxa. Let U be the universal taxon, the set that includes all organisms. To be a taxon, a set must be a nonempty subset of U.
A union of taxa, T₁ ∪ T₂ ∪ … ∪ Tₙ, constitutes a polythetic set. An intersection of taxa, T₁ ∩ T₂ ∩ … ∩ Tₙ, constitutes a monothetic set.
A rank-based taxonomy of a taxon, T, may be considered a series of partitions on T, wherein each partition corresponds to a rank. Partitions of lower ranks are refinements of partitions of higher ranks, e.g., a species-level partition is a refinement of a genus-level partition.
Ancestry and Precedence
Parenthood may be defined as an antisymmetric, nontransitive relation. Let the relation ⊲ := {[x, y] | x is a parent of y}. The expression x ⊲ y means that x is a parent, or immediate predecessor, of y. The inverse relation, ⊳, is childhood. The symmetric relation, ⋈, may be used thusly: x ⋈ y if and only if x ⊲ y or x ⊳ y. The expression x ⊴ y means that x is a parent of or equal to y, that is, x either immediately precedes or equals y. The expression ⊴[x] represents the set of x and all of its children (immediate successors).
Ancestry may be defined as the transitive closure of parenthood. Let the relation ≺ := {[x, y] | x is an ancestor of y}. The expression x ≺ y means that x is an ancestor, or proper predecessor, of y. The inverse relation, ≻, is descent. The expression x ≼ y means that x is an ancestor of or equal to y, that is, x is a predecessor of y. The expression ≼[x] represents the set of x and all of its successors. The expression ≽[x] represents the set of x and all of its predecessors.
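These relations can be sketched as sets of ordered pairs. In the following illustration the parenthood data are invented, the function names are hypothetical, and the `reflexive` flag is a convenience introduced here to move from ≺ to ≼ (or from ⊲ to ⊴).

```python
# Hypothetical parenthood relation ⊲ as a set of (parent, child) pairs.
PARENT_OF = {("a", "b"), ("a", "c"), ("b", "d"), ("b", "e")}

def transitive_closure(relation):
    """Add implied pairs until no new pair appears."""
    closure = set(relation)
    while True:
        implied = {(x, z) for (x, y) in closure for (w, z) in closure if y == w}
        if implied <= closure:
            return closure
        closure |= implied

def inverse(relation):
    return {(y, x) for (x, y) in relation}

def image(relation, x, reflexive=False):
    """R[x] := {y | x R y}; with reflexive=True, x itself is included."""
    result = {y for (w, y) in relation if w == x}
    return (result | {x}) if reflexive else result

ANCESTOR_OF = transitive_closure(PARENT_OF)   # the relation ≺
DESCENDANT_OF = inverse(ANCESTOR_OF)          # the relation ≻

children_and_self = image(PARENT_OF, "a", reflexive=True)      # ⊴[a]
successors = image(ANCESTOR_OF, "b", reflexive=True)           # ≼[b]
predecessors = image(DESCENDANT_OF, "d", reflexive=True)       # ≽[d]

print(sorted(children_and_self), sorted(successors), sorted(predecessors))
```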
Phylogeny and Graphs
Phylogeny may be represented as a directed, acyclic graph (which correlates to a partially ordered set), G⊲ := [U, ⊲]. In this digraph, organisms are the vertices. The arcs (directed edges) point from parents to their children, so that the head of each arc is a parent and the tail of each arc is a child.
A path in G⊲ represents a lineage from ancestor to descendant. An x–y path in G⊲ is a sequence of vertices (organisms), p, of length n such that x = p₁ and y = pₙ and p₁ ⊲ p₂ ⊲ … ⊲ pₙ.
A cladogen is an antichain in G⊲ wherein all members share at least one common successor. As noted earlier, all singleton subsets of U are cladogens.
Relatedness may be represented as an undirected graph, G⋈ := [U, {{x, y} | x ⋈ y}]. If two organisms (vertices) in this graph are connected, then they are in some way related. (Note that all known organisms are considered to be related.)
Definitions of Mathematical Entities
The following information is given for each mathematical/biological entity defined in this document:
- Definition URL. The full, canonical location of the definition. This is to be used as the definitionURL attribute's value in csymbol elements.
- Symbol. The symbol used for the entity in Names on Nodes.
- Class. The general class which this entity belongs to (set, graph, function).
- Definition. The definition (generally mathematical) of the entity.
- Implementation. A qualified name identifying the ActionScript class which implements the entity in Names on Nodes.
- Discussion. Further discussion of the entity.
- Example(s). Examples of how the entity may be used in MathML code.
Universal Taxon
| Definition URL | http://namesonnodes.org/ns/math/2009#def-UniversalTaxon |
|---|---|
| Symbol | U |
| Class | Set |
| Definition | U is the set of all organisms, be they extinct, extant, or yet to be. |
| Implementation | org.namesonnodes.math.entities::Taxon.fromFinestNodes and org.namesonnodes.domain.nodes::NodeGraph.allFinestNodes |
| Discussion | (See the discussion of universal taxon, above.) Operationally, Names on Nodes treats the smallest discernible taxa in a given context as individuals. The union of these least inclusive taxa is treated as the operational equivalent of U. |
| Example | <apply xmlns="http://www.w3.org/1998/Math/MathML"> <card/> <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-UniversalTaxon"/> </apply> (This evaluates to the total number of organisms. Obviously, this number is actually unknown. Operationally, this expression evaluates to the number of least inclusive taxa being treated in the context.) |
Maximal
| Definition URL | http://namesonnodes.org/ns/math/2009#def-Maximal |
|---|---|
| Symbol | max |
| Class | Function |
| Definition | max : 2^U → 2^U |
| Implementation | org.namesonnodes.math.operations::Maximal |
| Example | <apply xmlns="http://www.w3.org/1998/Math/MathML"> <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-Maximal"/> <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-UniversalTaxon"/> </apply> (This evaluates to the set of all organisms with no descendants.) |
| Discussion | The maximal members of a set are all members which are not strictly ancestral to any other member. The concept of "maximal" correlates to what some authors have termed "last", "latest", or "most recent", as in "most recent common ancestor". However, unlike those other terms, "maximal" is not tied to chronology; the maximal members of a set are not necessarily contemporaries. Other potential synonyms of "maximal" are "final", "terminal", or "leafmost". The symbol for this function is the same as that of a MathML function, |
Minimal
| Definition URL | http://namesonnodes.org/ns/math/2009#def-Minimal |
|---|---|
| Symbol | min |
| Class | Function |
| Definition | min : 2^U → 2^U |
| Implementation | org.namesonnodes.math.operations::Minimal |
| Example | <apply xmlns="http://www.w3.org/1998/Math/MathML"> <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-Minimal"/> <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-UniversalTaxon"/> </apply> (This evaluates to the set of all organisms with no ancestors, i.e., the original organism[s].) |
| Discussion | The minimal members of a set are all members which are not strictly descended from any other member. The concept of "minimal" correlates to what some authors have termed "earliest", "first", or "least recent", as in "least recent common ancestor". However, unlike those other terms, "minimal" is not tied to chronology; the minimal members of a set are not necessarily contemporaries. Other potential synonyms of "minimal" are "initial" or "rootmost". The symbol for this function is the same as that of a MathML function, |
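The verbal definitions of the maximal and minimal functions given in the two discussions above can be sketched in Python as follows. The phylogeny and the function names are hypothetical, and this is not the ActionScript implementation named in the tables.

```python
# Hypothetical toy phylogeny: each organism maps to the set of its parents.
PARENTS = {
    "a": set(),
    "b": {"a"},
    "c": {"a"},
    "d": {"b"},
    "e": {"b"},
    "f": {"c"},
}

def ancestors(x):
    """Return all proper predecessors of x (x itself is excluded)."""
    found = set()
    stack = [x]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def maximal(taxon):
    """Members that are not ancestral to any other member."""
    return {x for x in taxon if not any(x in ancestors(y) for y in taxon)}

def minimal(taxon):
    """Members that are not descended from any other member."""
    return {x for x in taxon if not (ancestors(x) & taxon)}

everything = set(PARENTS)
print(sorted(maximal(everything)))  # organisms with no descendants
print(sorted(minimal(everything)))  # organisms with no ancestors
```

Note that neither function consults dates or ages; only ancestry within the given set matters.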
Predecessor Union
| Definition URL | http://namesonnodes.org/ns/math/2009#def-PredecessorUnion |
|---|---|
| Symbol | prc∪ |
| Class | Function |
| Definition | prc∪ : 2^U → 2^U; prc∪(A) := {x ∈ U ∣ for some y ∈ A, x ≼ y} |
| Implementation | org.namesonnodes.math.operations::PredecessorUnion |
| Example | <apply xmlns="http://www.w3.org/1998/Math/MathML"> <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-PredecessorUnion"/> <apply> <union/> <csymbol definitionURL="urn:isbn:0853010064::Homo+sapiens"/> <csymbol definitionURL="urn:isbn:00800694::Pinus+sylvestris"/> </apply> </apply> (This evaluates to a set including all humans, all Scots pines, all ancestors of all humans, and all ancestors of all Scots pines. This includes shared ancestors as well as unshared ancestors.) |
| Discussion | The predecessor union of a set of organisms includes all members of that set as well as all ancestors of all members of that set. The predecessor union of a set is always a superset of the predecessor intersection. |
Successor Union
| Definition URL | http://namesonnodes.org/ns/math/2009#def-SuccessorUnion |
|---|---|
| Symbol | suc∪ |
| Class | Function |
| Definition |
suc∪ : 2^U → 2^U
suc∪(A) := {x ∈ U | for some y ∈ A, x ≽ y} |
| Implementation | org.namesonnodes.math.operations::SuccessorUnion |
| Example |
<apply xmlns="http://www.w3.org/1998/Math/MathML">
  <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-SuccessorUnion"/>
  <apply>
    <union/>
    <csymbol definitionURL="urn:isbn:0853010064::Equus+ferus"/>
    <csymbol definitionURL="urn:isbn:0853010064::Equus+asinus"/>
  </apply>
</apply>
(This evaluates to a set including all horses [Equus ferus], donkeys [Equus asinus], mules [Equus asinus × ferus], and hinnies [Equus ferus × asinus].) |
| Discussion |
The successor union of a set of organisms includes all members of that set as well as all descendants of all members of that set. The successor union of a nonempty set is always a superset of its successor intersection. |
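A minimal Python sketch of the same idea follows. It assumes the phylogeny is stored as a child-to-parents mapping and inverts it to walk downward; the toy equid graph and the function name `successor_union` are illustrative assumptions, not the project's implementation.

```python
def successor_union(members, parents):
    """suc∪(A): every x with x ≽ y for some y in A."""
    # Invert the child -> parents mapping so the graph can be walked downward.
    children = {vertex: set() for vertex in parents}
    for child, child_parents in parents.items():
        for parent in child_parents:
            children[parent].add(child)
    result, stack = set(members), list(members)
    while stack:
        for child in children[stack.pop()]:
            if child not in result:
                result.add(child)
                stack.append(child)
    return result


# Hypothetical phylogeny; the mule has two parents, so this is not a tree.
PARENTS = {
    "ancestor": set(),
    "horse": {"ancestor"}, "donkey": {"ancestor"}, "zebra": {"ancestor"},
    "mule": {"horse", "donkey"},
}

print(sorted(successor_union({"horse", "donkey"}, PARENTS)))
# ['donkey', 'horse', 'mule']
```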
Predecessor Intersection
| Definition URL | http://namesonnodes.org/ns/math/2009#def-PredecessorIntersection |
|---|---|
| Symbol | prc∩ |
| Class | Function |
| Definition |
prc∩ : 2^U → 2^U
prc∩(A) := {x ∈ U | for all y ∈ A, x ≼ y} |
| Implementation | org.namesonnodes.math.operations::PredecessorIntersection |
| Discussion |
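The formula above can be sketched in Python as an intersection of per-organism predecessor sets. The toy graph and the names `predecessors` and `predecessor_intersection` are assumptions for illustration only.

```python
def predecessors(x, parents):
    """Return x together with all of its ancestors (every y with y ≼ x)."""
    seen, stack = {x}, [x]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def predecessor_intersection(members, parents):
    """prc∩(A): every x in U with x ≼ y for all y in A."""
    common = set(parents)  # start from U; this is also the result when A is empty
    for y in members:
        common &= predecessors(y, parents)
    return common


# Hypothetical phylogeny, stored as child -> set of parents.
PARENTS = {
    "root": set(), "stem": {"root"}, "node": {"stem"},
    "a": {"node"}, "b": {"node"}, "out": {"root"},
}

print(sorted(predecessor_intersection({"a", "b"}, PARENTS)))    # ['node', 'root', 'stem']
print(sorted(predecessor_intersection({"a", "out"}, PARENTS)))  # ['root']
```

Read literally, the "for all y ∈ A" condition holds vacuously when A is empty, so this sketch returns all of U in that case.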
Successor Intersection
| Definition URL | http://namesonnodes.org/ns/math/2009#def-SuccessorIntersection |
|---|---|
| Symbol | suc∩ |
| Class | Function |
| Definition |
suc∩ : 2^U → 2^U
suc∩(A) := {x ∈ U | for all y ∈ A, x ≽ y} |
| Implementation | org.namesonnodes.math.operations::SuccessorIntersection |
| Discussion |
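A short Python sketch of the formula above, assuming that x ≽ y holds exactly when y is x or one of its ancestors. The equid graph and the function names are hypothetical.

```python
def predecessors(x, parents):
    """Return x together with all of its ancestors (every y with y ≼ x)."""
    seen, stack = {x}, [x]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def successor_intersection(members, parents):
    """suc∩(A): every x in U with x ≽ y for all y in A."""
    return {
        x for x in parents
        if all(y in predecessors(x, parents) for y in members)
    }


# Hypothetical phylogeny; the mule descends from both horse and donkey.
PARENTS = {
    "ancestor": set(),
    "horse": {"ancestor"}, "donkey": {"ancestor"}, "zebra": {"ancestor"},
    "mule": {"horse", "donkey"},
}

print(sorted(successor_intersection({"horse", "donkey"}, PARENTS)))  # ['mule']
print(sorted(successor_intersection({"horse", "zebra"}, PARENTS)))   # []
```

In a strictly tree-shaped phylogeny, two organisms on different branches never share a successor; a nonempty result like the first one requires reticulation such as hybridization.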
Synapomorphic Predecessors
| Definition URL | http://namesonnodes.org/ns/math/2009#def-SynapomorphicPredecessors |
|---|---|
| Symbol | synprc |
| Class | Function |
| Definition |
synprc : 2^U × 2^U → 2^U
synprc(M, A) := {x ∈ prc∩(A) | for all y ∈ A, there exists some x–y path, p, in G⊲ where for all p_n ∈ p, p_n ∈ M} |
| Implementation | org.namesonnodes.math.operations::SynapomorphicPredecessors |
| Discussion |
Node-Based Cladogen
| Definition URL | http://namesonnodes.org/ns/math/2009#def-NodeBasedCladogen |
|---|---|
| Symbol | + |
| Class | Function |
| Definition |
+ : 2^U → 2^U
+ := max ∘ prc∩
A + B + ... + Z := (max ∘ prc∩)(A ∪ B ∪ ... ∪ Z) |
| Implementation | org.namesonnodes.math.operations::NodeBasedCladogen |
| Discussion |
The node-based cladogen of a set, A, consists of its maximal common predecessors. This is a similar concept to "most recent common ancestors". This operation has two forms of notation: 1) as a prefix; and 2) as an infix, which is shorthand for applying the function to a union of sets. If (and only if) A has no common predecessors, then +(A) = ∅ and A has no node-based cladogen. Since all known organisms are theorized to descend from common ancestors, node-based cladogens exist for all known taxa, in theory. |
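The composition max ∘ prc∩ can be sketched in Python as follows. The sketch assumes that the maximal members of a set are those that are not strict predecessors of any other member (the mirror image of "minimal"); the toy family graph and all function names are illustrative assumptions.

```python
def predecessors(x, parents):
    """Return x together with all of its ancestors (every y with y ≼ x)."""
    seen, stack = {x}, [x]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def maximal(members, parents):
    """Members that are not strict predecessors of any other member (assumed)."""
    return {
        x for x in members
        if not any(x in predecessors(y, parents) for y in members - {x})
    }


def node_based_cladogen(*specifier_sets, parents):
    """+(A), or A + B + ... + Z when several sets are passed (infix shorthand)."""
    union = set().union(*specifier_sets)
    common = set(parents)
    for y in union:
        common &= predecessors(y, parents)
    return maximal(common, parents)


# Hypothetical sexually reproducing family: two siblings share two parents.
PARENTS = {"mom": set(), "dad": set(),
           "c1": {"mom", "dad"}, "c2": {"mom", "dad"}}

print(sorted(node_based_cladogen({"c1"}, {"c2"}, parents=PARENTS)))  # ['dad', 'mom']
print(sorted(node_based_cladogen({"mom", "dad"}, parents=PARENTS)))  # []
```

The first call shows why a cladogen is a set rather than a single organism: both parents are maximal common predecessors. The second call shows the empty case, where the specifiers have no common predecessor in the graph.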
Branch-Based Cladogen
| Definition URL | http://namesonnodes.org/ns/math/2009#def-BranchBasedCladogen |
|---|---|
| Symbol | ← |
| Class | Function |
| Definition |
← : 2^U × 2^U → 2^U
A ← Z := min(prc∩(A) − prc∪(Z)) |
| Implementation | org.namesonnodes.math.operations::BranchBasedCladogen |
| Discussion |
Specifying a branch-based cladogen requires two sets, one internal (A) and one external (Z). The exclusive common predecessors of the internal set are all of its common predecessors minus all predecessors of the external set. The branch-based cladogen consists of the minimal exclusive common predecessors of the internal set. If A has no common predecessors, or all of those common predecessors are also predecessors of Z, then A ← Z = ∅ and there is no branch-based cladogen for [A, Z]. |
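The steps described above (common predecessors, minus predecessors of the external set, then the minimal members) can be sketched in Python. The toy graph and the function names are assumptions for illustration and do not reproduce the Names on Nodes code.

```python
def predecessors(x, parents):
    """Return x together with all of its ancestors (every y with y ≼ x)."""
    seen, stack = {x}, [x]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def minimal(members, parents):
    """Members that are not strictly descended from any other member."""
    return {
        x for x in members
        if not (predecessors(x, parents) - {x}) & members
    }


def branch_based_cladogen(internal, external, parents):
    """A ← Z := min(prc∩(A) − prc∪(Z))."""
    common = set(parents)
    for y in internal:
        common &= predecessors(y, parents)      # prc∩(A)
    excluded = set()
    for z in external:
        excluded |= predecessors(z, parents)    # prc∪(Z)
    return minimal(common - excluded, parents)


# Hypothetical phylogeny, stored as child -> set of parents.
PARENTS = {
    "root": set(), "stem": {"root"}, "node": {"stem"},
    "a": {"node"}, "b": {"node"}, "out": {"root"},
}

print(sorted(branch_based_cladogen({"a", "b"}, {"out"}, PARENTS)))  # ['stem']
print(sorted(branch_based_cladogen({"node"}, {"a"}, PARENTS)))      # []
```

The second call is the empty case: every common predecessor of the internal set is also a predecessor of the external set.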
Apomorphy-Based Cladogen
| Definition URL | http://namesonnodes.org/ns/math/2009#def-ApomorphyBasedCladogen |
|---|---|
| Symbol | in |
| Class | Function |
| Definition |
in : 2^U × 2^U → 2^U
in := min ∘ synprc
M in A := (min ∘ synprc)(M, A) |
| Implementation | org.namesonnodes.math.operations::ApomorphyBasedCladogen |
| Discussion |
Specifying an apomorphy-based cladogen requires two sets, one apomorphic (M) and the other representative (A). These two sets indicate synapomorphic predecessors. The apomorphy-based cladogen consists of the minimal synapomorphic predecessors. If A ⊈ M, then M in A = ∅ and there is no apomorphy-based cladogen for [M, A]. There is also no apomorphy-based cladogen if at least two members of A are in M due to convergence, i.e., if there are no synapomorphic predecessors. |
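The following Python sketch works through min ∘ synprc on a toy graph, including the two empty cases mentioned above. It assumes that a path consisting of a single vertex counts as a path, so an organism in M is joined to itself; the graph and all function names are hypothetical.

```python
def predecessors(x, parents):
    """Return x together with all of its ancestors (every y with y ≼ x)."""
    seen, stack = {x}, [x]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def minimal(members, parents):
    """Members that are not strictly descended from any other member."""
    return {x for x in members if not (predecessors(x, parents) - {x}) & members}


def synapomorphic_predecessors(apomorphic, representatives, parents):
    """synprc(M, A): predecessors joined to every y in A by a path inside M."""
    children = {vertex: set() for vertex in parents}
    for child, child_parents in parents.items():
        for parent in child_parents:
            children[parent].add(child)

    def reachable_within(x):
        # Vertices reachable from x along parent-to-child paths that stay in M.
        if x not in apomorphic:
            return set()
        seen, stack = {x}, [x]
        while stack:
            for child in children[stack.pop()]:
                if child in apomorphic and child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen

    result = set()
    for x in parents:
        within = reachable_within(x)
        if all(y in within for y in representatives):
            result.add(x)
    return result


def apomorphy_based_cladogen(apomorphic, representatives, parents):
    """M in A := (min ∘ synprc)(M, A)."""
    return minimal(
        synapomorphic_predecessors(apomorphic, representatives, parents), parents
    )


# Hypothetical phylogeny, stored as child -> set of parents.
PARENTS = {
    "root": set(), "stem": {"root"}, "node": {"stem"},
    "a": {"node"}, "b": {"node"}, "out": {"root"},
}
WINGED = {"stem", "node", "a", "b"}   # hypothetical apomorphic set M

print(sorted(apomorphy_based_cladogen(WINGED, {"a"}, PARENTS)))              # ['stem']
print(sorted(apomorphy_based_cladogen(WINGED, {"a", "out"}, PARENTS)))       # []
print(sorted(apomorphy_based_cladogen({"a", "out"}, {"a", "out"}, PARENTS)))  # []
```

The second call is the A ⊈ M case. The third is the convergence case: both representatives are in M, but no common predecessor is joined to both by a path inside M.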
Clade
| Definition URL | http://namesonnodes.org/ns/math/2009#def-Clade |
|---|---|
| Symbol | Clade |
| Class | Function |
| Definition |
Clade : 2^U → 2^U |
| Implementation | org.namesonnodes.math.operations::Clade |
| Discussion | TBD |
Node-Based Clade
| Definition URL | http://namesonnodes.org/ns/math/2009#def-NodeBasedClade |
|---|---|
| Symbol | Clade+ |
| Class | Function |
| Definition |
Clade+ : 2^U → 2^U
Clade+ := Clade ∘ + |
| Implementation | org.namesonnodes.math.operations::NodeBasedClade |
| Discussion | TBD |
Branch-Based Clade
| Definition URL | http://namesonnodes.org/ns/math/2009#def-BranchBasedClade |
|---|---|
| Symbol | Clade← |
| Class | Function |
| Definition |
Clade← : 2^U × 2^U → 2^U
Clade← := Clade ∘ ←
Clade←(A, Z) = suc∪(prc∩(A) − prc∪(Z)) |
| Implementation | org.namesonnodes.math.operations::BranchBasedClade |
| Discussion | TBD |
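The two forms given in the definition can be compared on a toy graph. The sketch below makes one loud assumption: since the definition of Clade is not reproduced in this section, `clade` is taken to return the union of successors of a cladogen. That choice is made only because it lets the composition Clade ∘ ← agree with the closed form suc∪(prc∩(A) − prc∪(Z)) on this example; all names and the graph are hypothetical.

```python
def predecessors(x, parents):
    """Return x together with all of its ancestors (every y with y ≼ x)."""
    seen, stack = {x}, [x]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def successor_union(members, parents):
    """suc∪(A): the members plus all of their descendants."""
    return {x for x in parents if predecessors(x, parents) & set(members)}


def minimal(members, parents):
    """Members that are not strictly descended from any other member."""
    return {x for x in members if not (predecessors(x, parents) - {x}) & members}


def exclusive_common_predecessors(internal, external, parents):
    """prc∩(A) − prc∪(Z)."""
    common = set(parents)
    for y in internal:
        common &= predecessors(y, parents)
    for z in external:
        common -= predecessors(z, parents)
    return common


def clade(cladogen, parents):
    # ASSUMPTION: Clade is modelled here as suc∪ applied to the cladogen.
    return successor_union(cladogen, parents)


def branch_based_clade(internal, external, parents):
    """Clade← := Clade ∘ ←, with A ← Z := min(prc∩(A) − prc∪(Z))."""
    exclusive = exclusive_common_predecessors(internal, external, parents)
    return clade(minimal(exclusive, parents), parents)


# Hypothetical phylogeny, stored as child -> set of parents.
PARENTS = {
    "root": set(), "stem": {"root"}, "node": {"stem"},
    "a": {"node"}, "b": {"node"}, "out": {"root"},
}

composed = branch_based_clade({"a"}, {"out"}, PARENTS)
closed_form = successor_union(
    exclusive_common_predecessors({"a"}, {"out"}, PARENTS), PARENTS
)
print(sorted(composed))          # ['a', 'b', 'node', 'stem']
print(composed == closed_form)   # True
```

Note that the clade includes `b` even though only `a` was named as an internal specifier, because `b` descends from the minimal exclusive predecessor `stem`.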
Apomorphy-Based Clade
| Definition URL | http://namesonnodes.org/ns/math/2009#def-ApomorphyBasedClade |
|---|---|
| Symbol | Cladein |
| Class | Function |
| Definition |
Cladein : 2^U × 2^U → 2^U
Cladein := Clade ∘ in |
| Implementation | org.namesonnodes.math.operations::ApomorphyBasedClade |
| Discussion | TBD |
Crown Clade
| Definition URL | http://namesonnodes.org/ns/math/2009#def-CrownClade |
|---|---|
| Symbol | Crown |
| Class | Function |
| Definition |
Crown : 2^U × 2^U → 2^U
Crown(A, E) := Clade+(suc∪(A) ∩ E) |
| Implementation | org.namesonnodes.math.operations::CrownClade |
| Discussion | TBD |
Total Clade
| Definition URL | http://namesonnodes.org/ns/math/2009#def-TotalClade |
|---|---|
| Symbol | Total |
| Class | Function |
| Definition |
Total : 2^U × 2^U → 2^U
Let C = Crown(A, E)
Total(A, E) := Clade(C ← (E − C)) |
| Implementation | org.namesonnodes.math.operations::TotalClade |
| Discussion | TBD |
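The crown and total clade formulas can be traced on a toy graph with one extinct side branch. The sketch rests on two assumptions that are not stated in this section: `clade` is modelled as the union of successors of a cladogen, and the expression C ← E − C is read as C ← (E − C). The graph, the `EXTANT` set, and all function names are hypothetical.

```python
def predecessors(x, parents):
    """Return x together with all of its ancestors (every y with y ≼ x)."""
    seen, stack = {x}, [x]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def successor_union(members, parents):
    return {x for x in parents if predecessors(x, parents) & set(members)}


def common_predecessors(members, parents):
    common = set(parents)
    for y in members:
        common &= predecessors(y, parents)
    return common


def minimal(members, parents):
    return {x for x in members if not (predecessors(x, parents) - {x}) & members}


def maximal(members, parents):
    return {x for x in members
            if not any(x in predecessors(y, parents) for y in members - {x})}


def clade(cladogen, parents):
    # ASSUMPTION: Clade is modelled here as suc∪ applied to the cladogen.
    return successor_union(cladogen, parents)


def crown(members, extant, parents):
    """Crown(A, E) := Clade+(suc∪(A) ∩ E), with Clade+ := Clade ∘ +."""
    extant_successors = successor_union(members, parents) & extant
    cladogen = maximal(common_predecessors(extant_successors, parents), parents)
    return clade(cladogen, parents)


def total(members, extant, parents):
    """Total(A, E) := Clade(C ← (E − C)) where C = Crown(A, E)."""
    c = crown(members, extant, parents)
    outside = set().union(*(predecessors(z, parents) for z in extant - c))
    exclusive = common_predecessors(c, parents) - outside
    return clade(minimal(exclusive, parents), parents)


# Hypothetical phylogeny: "fossil" is an extinct offshoot of the stem lineage.
PARENTS = {
    "root": set(), "out": {"root"}, "stem": {"root"},
    "fossil": {"stem"}, "node": {"stem"}, "x1": {"node"}, "x2": {"node"},
}
EXTANT = {"out", "x1", "x2"}

print(sorted(crown({"stem"}, EXTANT, PARENTS)))  # ['node', 'x1', 'x2']
print(sorted(total({"stem"}, EXTANT, PARENTS)))  # ['fossil', 'node', 'stem', 'x1', 'x2']
```

The crown clade starts at the last common predecessor of the extant members and so omits `stem` and `fossil`; the total clade reaches back along the stem until it would include a predecessor of an extant outsider.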
Appendix I.—Implemented MathML-Content Elements
MathML-Content provides methods for modelling a wide variety of mathematical entities. Since Names on Nodes deals only with logic and set theory, only certain elements have been implemented. The following is a list of all MathML-Content elements that have been implemented in Names on Nodes, with notes as necessary. Other elements may work, but are not guaranteed.
| Element | Notes |
|---|---|
| and | |
| apply | |
| ci | The |
| csymbol | The |
| | The |
| declare | The |
| emptyset | May be used wherever taxa may be used. |
| eq | |
| false | |
| implies | |
| intersect | |
| math | The last child element of a |
| neq | |
| not | |
| notprsubset | |
| notsubset | |
| or | |
| otherwise | |
| piece | |
| piecewise | The |
| prsubset | |
| setdiff | |
| subset | |
| true | |
| union | |
| xor | |
Appendix II.—Example Definitions
| Taxon Name | Dinosauria |
|---|---|
| Definition Type | Node-Based |
| Authorities Referenced | The International Code of Zoological Nomenclature, Fourth Edition (ISBN: 0-85301-006-4) |
| Prose | All successors of the maximal common predecessors of Megalosaurus bucklandii Mantell 1827, Iguanodon bernissartensis Boulenger in Beneden 1881, and Hylaeosaurus armatus Mantell 1833. |
| Mathematical Formulae |
Dinosauria := Clade+(Megalosaurus bucklandii ∪ Iguanodon bernissartensis ∪ Hylaeosaurus armatus).
Or: Dinosauria := Clade(Megalosaurus bucklandii + Iguanodon bernissartensis + Hylaeosaurus armatus). |
| MathML Formulae | <math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-NodeBasedClade"/>
    <csymbol definitionURL="urn:isbn:0853010064::Megalosaurus+bucklandii"/>
    <csymbol definitionURL="urn:isbn:0853010064::Iguanodon+bernissartensis"/>
    <csymbol definitionURL="urn:isbn:0853010064::Hylaeosaurus+armatus"/>
  </apply>
</math>
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-Clade"/>
    <apply>
      <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-NodeBasedCladogen"/>
      <csymbol definitionURL="urn:isbn:0853010064::Megalosaurus+bucklandii"/>
      <csymbol definitionURL="urn:isbn:0853010064::Iguanodon+bernissartensis"/>
      <csymbol definitionURL="urn:isbn:0853010064::Hylaeosaurus+armatus"/>
    </apply>
  </apply>
</math>
|
| Taxon Name | Saurischia |
|---|---|
| Definition Type | Branch-Based |
| Authorities Referenced | The International Code of Zoological Nomenclature, Fourth Edition (ISBN: 0-85301-006-4) |
| Prose | All successors of the minimal common predecessors of Megalosaurus bucklandii Mantell 1827 exclusive of all predecessors of Iguanodon bernissartensis Boulenger in Beneden 1881. |
| Mathematical Formulae |
Saurischia := Clade←(Megalosaurus bucklandii, Iguanodon bernissartensis).
Or: Saurischia := Clade(Megalosaurus bucklandii ← Iguanodon bernissartensis). |
| MathML Formulae | <math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-BranchBasedClade"/>
    <csymbol definitionURL="urn:isbn:0853010064::Megalosaurus+bucklandii"/>
    <csymbol definitionURL="urn:isbn:0853010064::Iguanodon+bernissartensis"/>
  </apply>
</math>
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-Clade"/>
    <apply>
      <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-BranchBasedCladogen"/>
      <csymbol definitionURL="urn:isbn:0853010064::Megalosaurus+bucklandii"/>
      <csymbol definitionURL="urn:isbn:0853010064::Iguanodon+bernissartensis"/>
    </apply>
  </apply>
</math> |
| Taxon Name | Avialae |
|---|---|
| Definition Type | Apomorphy-Based |
| Authorities Referenced |
|
| Prose | All successors of the minimal predecessors of Vultur gryphus Linnaeus 1758 to share the synapomorphy of wings used for powered flight (Gauthier and de Queiroz 2001). |
| Mathematical Formulae |
Avialae := Cladein("wings used for powered flight", Vultur gryphus).
Or: Avialae := Clade("wings used for powered flight" in Vultur gryphus). |
| MathML Formulae | <math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-ApomorphyBasedClade"/>
    <csymbol definitionURL="urn:bici:0912532572(200112)%3C7:FDFDCD%3E2.0.TX;2-H::wings+used+for+powered+flight"/>
    <csymbol definitionURL="urn:isbn:0853010064::Vultur+gryphus"/>
  </apply>
</math>
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-Clade"/>
    <apply>
      <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-ApomorphyBasedCladogen"/>
      <csymbol definitionURL="urn:bici:0912532572(200112)%3C7:FDFDCD%3E2.0.TX;2-H::wings+used+for+powered+flight"/>
      <csymbol definitionURL="urn:isbn:0853010064::Vultur+gryphus"/>
    </apply>
  </apply>
</math> |
| Taxon Name | Tracheophyta |
|---|---|
| Definition Type | Branch-Modified Node-Based |
| Authorities Referenced |
|
| Prose |
All successors of the maximal common predecessors of all extant (Cantino & al. 2007) successors of the minimal predecessors of Zea mays L. 1753 exclusive of all predecessors of Phaeoceros laevis (L.) Prosk. 1951, Marchantia polymorpha L. 1753, and Polytrichum commune Hedw. 1801. |
| Mathematical Formula |
Tracheophyta := Crown(Zea mays ← Phaeoceros laevis ∪ Marchantia polymorpha ∪ Polytrichum commune, "extant"). |
| MathML Formula | <math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-CrownClade"/>
    <apply>
      <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-BranchBasedCladogen"/>
      <csymbol definitionURL="urn:isbn:3906166481::Zea+mays"/>
      <apply>
        <union/>
        <csymbol definitionURL="urn:isbn:3906166481::Phaeoceros+laevis"/>
        <csymbol definitionURL="urn:isbn:3906166481::Marchantia+polymorpha"/>
        <csymbol definitionURL="urn:isbn:3906166481::Polytrichum+commune"/>
      </apply>
    </apply>
    <csymbol definitionURL="urn:sici:0040-0262(200708)56:3%3C822:TAPNOT%3E2.0.TX;2-#::extant"/>
  </apply>
</math> |
| Taxon Name | Pan-Tracheophyta |
|---|---|
| Definition Type | Total |
| Authorities Referenced |
|
| Prose |
All successors of the minimal predecessors of Tracheophyta exclusive of all predecessors of extant (Cantino & al. 2007) non-tracheophytes. The total clade of Tracheophyta. |
| Mathematical Formula |
Let Tracheophyta := Crown(Zea mays ← Phaeoceros laevis ∪ Marchantia polymorpha ∪ Polytrichum commune, "extant").
Pan-Tracheophyta := Total(Tracheophyta, "extant"). |
| MathML Formula | <math xmlns="http://www.w3.org/1998/Math/MathML">
  <declare type="set">
    <ci>Tracheophyta</ci>
    <apply>
      <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-CrownClade"/>
      <apply>
        <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-BranchBasedCladogen"/>
        <csymbol definitionURL="urn:isbn:3906166481::Zea+mays"/>
        <apply>
          <union/>
          <csymbol definitionURL="urn:isbn:3906166481::Phaeoceros+laevis"/>
          <csymbol definitionURL="urn:isbn:3906166481::Marchantia+polymorpha"/>
          <csymbol definitionURL="urn:isbn:3906166481::Polytrichum+commune"/>
        </apply>
      </apply>
      <csymbol definitionURL="urn:sici:0040-0262(200708)56:3%3C822:TAPNOT%3E2.0.TX;2-#::extant"/>
    </apply>
  </declare>
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-TotalClade"/>
    <ci>Tracheophyta</ci>
    <csymbol definitionURL="urn:sici:0040-0262(200708)56:3%3C822:TAPNOT%3E2.0.TX;2-#::extant"/>
  </apply>
</math> |
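As a rough illustration of how the MathML definitions in this appendix might be read by software, the following Python sketch parses a well-formed copy of the Dinosauria definition with the standard library and prints it in function notation. It assumes, from the examples above, that an identifier of the form `authority::name` uses `+` in place of spaces, and that the last child of `math` is the expression to report. The function name `describe` and the constants are hypothetical, and this is not how Names on Nodes itself evaluates definitions.

```python
import xml.etree.ElementTree as ET

# Prefix shared by the definition URLs of the entities defined in this document.
ENTITY_PREFIX = "http://namesonnodes.org/ns/math/2009#def-"

DINOSAURIA = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <csymbol definitionURL="http://namesonnodes.org/ns/math/2009#def-NodeBasedClade"/>
    <csymbol definitionURL="urn:isbn:0853010064::Megalosaurus+bucklandii"/>
    <csymbol definitionURL="urn:isbn:0853010064::Iguanodon+bernissartensis"/>
    <csymbol definitionURL="urn:isbn:0853010064::Hylaeosaurus+armatus"/>
  </apply>
</math>
"""


def describe(element):
    """Render a MathML-Content element as a readable one-line expression."""
    tag = element.tag.split("}")[-1]          # drop the "{namespace}" prefix
    if tag == "csymbol":
        url = element.get("definitionURL", "")
        if url.startswith(ENTITY_PREFIX):
            return url[len(ENTITY_PREFIX):]   # e.g. "NodeBasedClade"
        _authority, _, name = url.partition("::")
        return name.replace("+", " ") if name else url
    if tag == "apply":
        operator, *operands = list(element)
        return f"{describe(operator)}({', '.join(describe(o) for o in operands)})"
    if tag == "math":
        return describe(element[-1])          # assumed: last child is the result
    if tag == "ci":
        return (element.text or "").strip()
    return tag                                # e.g. "union", "emptyset"


print(describe(ET.fromstring(DINOSAURIA)))
# NodeBasedClade(Megalosaurus bucklandii, Iguanodon bernissartensis, Hylaeosaurus armatus)
```

An XML parser will reject any `csymbol` element that is neither self-closed nor given a closing tag, so definitions need to be well-formed before a reader like this can process them.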