Names on Nodes: Phylogenetic Query Script (Rough Draft)
P.O. Box 292304 Los Angeles, CA, USA 90027; keesey@gmail.com
Abstract
The MathML Definitions document shows how MathML may be used to model phylogenetic hypotheses and phylogenetic definitions. Since MathML is verbose, it may be preferable in some instances to have a more succinct scripting language with the same functionality. I have created a plain text version of the mathematical markup specified by the MathML Definitions document.
Table of Contents
Operators and Identifiers
Examples
Operators and Identifiers
Category | Description | Formula | Notes |
---|---|---|---|
General | equality | entity1 = entity2 | It may be desirable to use == instead (as in C and many other computer languages). |
General | inequality | entity1 != entity2 | Also used for exclusive disjunction ("xor"). This operator is borrowed from C. |
General | clause | (entity) | |
General | conditional statement | proposition ? entity1 : entity2 | Evaluates to entity1 if proposition is true, or entity2 if proposition is false. This operator is borrowed from C. |
Constants | constant name | "name" | Internal quotes may be "escaped", e.g., "\"Iguanodon\" hoggi". Possibly single quotes (') should be allowed as well, or no quotes for names without whitespace. |
Constants | declaration | "name" := entity. | |
Constants | integer | digits | Base 10. Non-integers and negative numbers are not required, so no method is provided for denoting them. |
Set Theory | extensional set | {entity1, entity2 …} | |
Set Theory | empty set | {} | |
Set Theory | union | set1 \| set2 … | The character ∪ would be preferable, but it is not an ASCII character. |
Set Theory | intersection | set1 & set2 … | The character ∩ would be preferable, but it is not an ASCII character. |
Set Theory | difference | set1 - set2 | Some mathematical texts use "\", so this may be preferable. |
Set Theory | set membership | entity in set | The character ∈ would be preferable, but it is not an ASCII character. |
Set Theory | subset | set1 <= set2 | The character ⊆ would be preferable, but it is not an ASCII character. |
Set Theory | proper subset | set1 < set2 | The character ⊂ (or ⊊) would be preferable, but it is not an ASCII character. |
Set Theory | superset | set1 >= set2 | The character ⊇ would be preferable, but it is not an ASCII character. |
Set Theory | proper superset | set1 > set2 | The character ⊃ (or ⊋) would be preferable, but it is not an ASCII character. |
Ordered Lists | extensional list | [entity1, entity2 …] | |
Ordered Lists | list element selector | list_index | This notation is somewhat unusual. Other languages use brackets (list[index]), but using underscores allows a clearer distinction between element selection and extensional declaration (previous item), and relates better to common mathematical notation (which uses subscripts). |
Boolean Logic | true | true | |
Boolean Logic | false | false | |
Boolean Logic | negation | !proposition | This operator is borrowed from C. The character ¬ would be preferable, but it is not an ASCII character. (Possibly the keyword not should be used or allowed?) |
Boolean Logic | conjunction | proposition1 && proposition2 | This operator is borrowed from C. The character ∧ would be preferable, but it is not an ASCII character. (Possibly the keyword and should be used or allowed?) |
Boolean Logic | disjunction (inclusive) | proposition1 \|\| proposition2 | This operator is borrowed from C. The character ∨ would be preferable, but it is not an ASCII character. (Possibly the keyword or should be used or allowed?) |
Functions | application | function(entity1, entity2 …) | |
Functions | composition | function1 * function2 | This is an unorthodox usage of this character. The character ∘ would be preferable, but it is not an ASCII character. |
Phylogeny | phylogenetic graph | P | |
Phylogeny | universal taxon | U | |
Phylogeny | maximal members | max(set) | |
Phylogeny | minimal members | min(set) | |
Phylogeny | predecessor union | prc\|(set) | |
Phylogeny | predecessor intersection | prc&(set) | |
Phylogeny | successor union | suc\|(set) | |
Phylogeny | successor intersection | suc&(set) | |
Phylogeny | exclusive predecessors | set1 <- set2 | set1 is the internal set; set2 is the external set. |
Phylogeny | synapomorphic predecessors | set1 @ set2 | set1 is the apomorphic set; set2 is the representative set. |
Phylogeny | clade | clade(set) | If the minimal members of set form a cladogen (a clade ancestor), then this is equivalent to suc\|(set). Otherwise, it is equivalent to (suc\| * max * prc&)(set). |
Phylogeny | node-based clade | clade(set1 \| set2 …) or (suc\| * max * prc&)(set1 \| set2 …) | |
Phylogeny | branch-based clade | clade(set1 <- set2) or suc\|(set1 <- set2) | set1 is the internal set; set2 is the external set. |
Phylogeny | apomorphy-based clade | clade(set1 @ set2) or suc\|(set1 @ set2) | set1 is the apomorphic set; set2 is the representative set. |
Phylogeny | crown clade | crown(set1, set2) | set1 is the bounding set; set2 is the set of extant organisms. |
Phylogeny | total clade | total(set1, set2) | set1 is the internal set; set2 is the set of extant organisms. |
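As an illustration of how the notation in the table above might be read by software, the following Python sketch splits a script into tokens. It is not part of the specification. The token categories (`name`, `int`, `func`, `word`, `op`), the `tokenize` function, and the choice to treat `prc|`, `prc&`, `suc|`, and `suc&` as single tokens are all assumptions made for this example.

```python
import re

# Hypothetical token grammar for the script; alternatives are tried in order,
# so two-character operators such as ":=" are listed before single characters.
TOKEN_RE = re.compile(r'''
    (?P<name>"(?:\\.|[^"\\])*")        # quoted constant name, backslash escapes allowed
  | (?P<int>\d+)                       # base-10 integer
  | (?P<func>(?:prc|suc)[|&])          # prc|  prc&  suc|  suc&
  | (?P<word>[A-Za-z][A-Za-z0-9]*)     # P, U, in, true, false, max, min, clade, ...
  | (?P<op>:=|<-|<=|>=|!=|&&|\|\||[=<>|&\-!*@?:_(){}\[\],.])
  | (?P<ws>\s+)                        # whitespace, discarded below
''', re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "ws":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    statement = '"Saurischia" := clade("Megalosaurus bucklandii" <- "Iguanodon bernissartensis").'
    for kind, lexeme in tokenize(statement):
        print(kind, lexeme)
```

Because the underscore is the list element selector, this sketch does not allow underscores inside words. A real implementation would have to settle that and the other open questions noted in the table (for example, whether `==` or the keywords not, and, and or are accepted).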
Examples
Formula | Prose or Diagram | Notes |
---|---|---|
P := [ { "Aves*", "Palaeognathae*", "Struthio camelus", "Tetrao major", "Vultur gryphus" }, { ["Aves*", "Vultur gryphus"], ["Aves*", "Palaeognathae*"], ["Palaeognathae*", "Struthio camelus"], ["Palaeognathae*", "Tetrao major"] } ]. | | This defines a simple phylogenetic context (a directed, acyclic graph where vertices are taxonomic units and arcs represent immediate descent). |
"Tinamus major" := "Tetrao major". | Tinamus major is Tetrao major. | These are objective synonyms under the zoological code. |
"Aves" := clade("Struthio camelus" \| "Tetrao major" \| "Vultur gryphus"). | Aves is all successors of the maximal common predecessors of Struthio camelus, Tetrao major, and Vultur gryphus. | |
"Saurischia" := clade("Megalosaurus bucklandii" <- "Iguanodon bernissartensis"). | Saurischia is all successors of the (common) predecessors of Megalosaurus bucklandii exclusive of all predecessors of Iguanodon bernissartensis. | |
"Avialae" := clade("wings used for powered flight" @ "Vultur gryphus"). | Avialae is all successors of the predecessors of Vultur gryphus to share wings used for powered flight synapomorphically with Vultur gryphus. | |
"Aves" = crown("Avialae", "extant") = crown("Saurischia", "extant") | Aves is equivalent to the avialan crown clade and the saurischian crown clade. | |
"Pan-Aves" := total("Aves", "extant"). | Pan-Aves is the avian total clade. | |
"Avemetatarsalia" := clade("Aves" <- "Crocodylus niloticus"). | Avemetatarsalia is all successors of the (common) predecessors of Aves exclusive of all predecessors of Crocodylus niloticus. | |
"Pan-Aves" = "Avemetatarsalia" | Pan-Aves is equivalent to Avemetatarsalia. | |
"Ichthyornithes" := clade("YPM-VP 1450" <- "Struthio camelus" \| "Tetrao major" \| "Vultur gryphus"). | Ichthyornithes is all successors of the (common) predecessors of the organism represented by YPM-VP 1450 exclusive of all predecessors of Struthio camelus, Tetrao major, and/or Vultur gryphus. | YPM-VP 1450 is the Ichthyornis dispar holotype specimen. |
"Ichthyornis" := clade("Ichthyornithes" & ("apomorphy 2" \| "apomorphy 5" \| "apomorphy 6" \| "apomorphy 7" \| "apomorphy 8" @ "YPM-VP 1450")). | Ichthyornis is all successors of all ichthyornithean predecessors of the organism represented by YPM-VP 1450 to share apomorphies 2, 5, 6, 7, and 8 synapomorphically with the organism represented by YPM-VP 1450. | The numbers refer to apomorphies described by Clarke (2004). |
"Pan-Biota" := (clade * prc&)("Homo sapiens"). | Pan-Biota is all successors of all (common) predecessors of Homo sapiens. | |
"Biota" := crown("Pan-Biota", "extant"). | Biota is all successors of the maximal common predecessors of all extant members of Pan-Biota. | |
"S" := "Otaria byronia" \| "Odobenus rosmarus" \| "Phoca vitulina". "Pinnipedia" := (max * prc&)("S") <= ("flippers" @ "S") ? clade("S") : {}. | If the maximal common predecessors of the specifiers (Otaria byronia, Odobenus rosmarus, and Phoca vitulina) possessed flippers synapomorphic with those of the specifiers, then Pinnipedia is all successors of the maximal common predecessors of the specifiers. Otherwise, Pinnipedia is empty. | |
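To make the node-based and branch-based examples concrete, the following Python sketch evaluates them against the graph P defined in the first example. It is an illustration under stated assumptions, not the reference semantics, which are given by the MathML Definitions document. The function names are hypothetical. The readings of `prc&`, `prc|`, `suc|`, `max`, and `<-` are inferred from the prose column above. Predecessors and successors are assumed to be reflexive, so each unit counts as its own predecessor and successor.

```python
# Phylogenetic context P from the first example: vertices and arcs.
# Each arc is (parent, child), representing immediate descent.
VERTICES = frozenset({
    "Aves*", "Palaeognathae*", "Struthio camelus", "Tetrao major", "Vultur gryphus",
})
ARCS = frozenset({
    ("Aves*", "Vultur gryphus"),
    ("Aves*", "Palaeognathae*"),
    ("Palaeognathae*", "Struthio camelus"),
    ("Palaeognathae*", "Tetrao major"),
})


def _closure(unit, step):
    """Return unit plus everything reachable by repeatedly applying step."""
    found = {unit}
    frontier = [unit]
    while frontier:
        current = frontier.pop()
        for neighbour in step(current):
            if neighbour not in found:
                found.add(neighbour)
                frontier.append(neighbour)
    return frozenset(found)


def predecessors(unit):
    # Assumption: reflexive, so the unit is included in its own predecessors.
    return _closure(unit, lambda u: [p for p, c in ARCS if c == u])


def successors(unit):
    # Assumption: reflexive, so the unit is included in its own successors.
    return _closure(unit, lambda u: [c for p, c in ARCS if p == u])


def prc_union(units):
    """prc|(set): units that precede at least one member of the set."""
    return frozenset().union(*(predecessors(u) for u in units))


def prc_intersection(units):
    """prc&(set): units that precede every member (common predecessors)."""
    sets = [predecessors(u) for u in units]
    return frozenset.intersection(*sets) if sets else frozenset()


def suc_union(units):
    """suc|(set): units that succeed at least one member of the set."""
    return frozenset().union(*(successors(u) for u in units))


def maximal(units):
    """max(set): members with no other member of the set among their successors."""
    units = frozenset(units)
    return frozenset(u for u in units if not (successors(u) - {u}) & units)


def node_based_clade(specifiers):
    """(suc| * max * prc&)(set), applied right to left."""
    return suc_union(maximal(prc_intersection(specifiers)))


def branch_based_clade(internal, external):
    """suc|(set1 <- set2), reading <- as prc&(set1) minus prc|(set2)."""
    return suc_union(prc_intersection(internal) - prc_union(external))


aves = node_based_clade({"Struthio camelus", "Tetrao major", "Vultur gryphus"})
print(sorted(aves))
```

In this small graph the node-based definition of Aves picks out every vertex, since "Aves*" is the only common predecessor of the three specifiers. Calling `branch_based_clade({"Struthio camelus"}, {"Vultur gryphus"})` returns the palaeognath branch, because "Aves*" is removed as a predecessor of the external specifier.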