Names on Nodes: Phylogenetic Query Script (Rough Draft)
P.O. Box 292304 Los Angeles, CA, USA 90027; keesey@gmail.com
Abstract
The MathML Definitions document shows how MathML may be used to model phylogenetic hypotheses and phylogenetic definitions. Since MathML is verbose, it may be preferable in some instances to have a more succinct scripting language with the same functionality. I have created a plain-text version of the mathematical markup specified by the MathML Definitions document.
Table of Contents
Operators and Identifiers 

Examples 
Operators and Identifiers
Category  Description  Formula  Notes 

General  equality  entity1 = entity2 
It may be desirable to use == instead (as in C and many other computer languages). 
inequality  entity1 != entity2 
Also used for exclusive disjunction ("xor"). This operator is borrowed from C.  
clause  (entity) 

conditional statement  proposition ? entity1 : entity2 
Evaluates to entity1 if proposition is true, or entity2 if proposition is false.
This operator is borrowed from C.


Constants  constant name  "name" 
Internal quotes may be "escaped", e.g., "\"Iguanodon\" hoggi".
Possibly single quotes (') should be allowed as well, or no quotes for names without whitespace.

declaration  "name" := entity. 

integer  digits 
Base 10. Nonintegers and negative numbers are not required, so no method is provided for denoting them.  
Set Theory  extensional set  {entity1, entity2 …} 

empty set  {} 

union  set1 | set2 … 
The character ∪ would be preferable, but it is not an ASCII character.  
intersection  set1 & set2 … 
The character ∩ would be preferable, but it is not an ASCII character.  
difference  set1 - set2 
Some mathematical texts use "\", so this may be preferable.  
set membership  entity in set 
The character ∈ would be preferable, but it is not an ASCII character.  
subset  set1 <= set2 
The character ⊆ would be preferable, but it is not an ASCII character.  
proper subset  set1 < set2 
The character ⊂ (or ⊊) would be preferable, but it is not an ASCII character.  
superset  set1 >= set2 
The character ⊇ would be preferable, but it is not an ASCII character.  
proper superset  set1 > set2 
The character ⊃ (or ⊋) would be preferable, but it is not an ASCII character.  
Ordered Lists  extensional list  [entity1, entity2 …] 

list element selector  list_index 
This notation is somewhat unusual.
Other languages use brackets (list[index]), but underscores distinguish element selection more clearly from extensional declaration (previous item) and relate better to common mathematical notation (which uses subscripts).


Boolean Logic  true  true 

false  false 

negation  !proposition 
This operator is borrowed from C.
The character ¬ would be preferable, but it is not an ASCII character.
(Possibly "not" should be used or allowed?)


conjunction  proposition1 && proposition2 
This operator is borrowed from C.
The character ∧ would be preferable, but it is not an ASCII character.
(Possibly "and" should be used or allowed?)


disjunction (inclusive)  proposition1 || proposition2 
This operator is borrowed from C.
The character ∨ would be preferable, but it is not an ASCII character.
(Possibly "or" should be used or allowed?)


Functions  application  function(entity1, entity2 …) 

composition  function1 * function2 
This is an unorthodox usage of this character.
The character ∘ would be preferable, but it is not an ASCII character.


Phylogeny  phylogenetic graph  P 

universal taxon  U 

maximal members  max(set) 

minimal members  min(set) 

predecessor union  prc(set) 

predecessor intersection  prc&(set) 

successor union  suc(set) 

successor intersection  suc&(set) 

exclusive predecessors  set1 < set2 
set1 is the internal set; set2 is the external set. 

synapomorphic predecessors  set1 @ set2 
set1 is the apomorphic set; set2 is the representative set. 

clade  clade(set) 
If the minimal members of set form a cladogen (a clade ancestor), then this is equivalent to suc(set).
Otherwise, it is equivalent to (suc * max * prc&)(set).


node-based clade  clade(set1 | set2 …) or (suc * max * prc&)(set1 | set2 …) 

branch-based clade  clade(set1 < set2) or suc(set1 < set2) 
set1 is the internal set; set2 is the external set. 

apomorphy-based clade  clade(set1 @ set2) or suc(set1 @ set2) 
set1 is the apomorphic set; set2 is the representative set. 

crown clade  crown(set1, set2) 
set1 is the bounding set; set2 is the set of extant organisms. 

total clade  total(set1, set2) 
set1 is the internal set; set2 is the set of extant organisms. 
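The phylogeny functions in the table above can be illustrated with a short Python sketch. It is not part of the script itself, only one possible reading of it. The class name Phylogeny and the method names prc_and, suc_and, and node_based_clade are hypothetical. Two assumptions are made that the table does not state: every unit counts as its own predecessor and successor, and "maximal" members are those with no other member of the set among their successors. The graph is the small one defined in the Examples section.

```python
from typing import Dict, Iterable, Set, Tuple


class Phylogeny:
    """A directed acyclic graph; an arc (a, b) means b descends immediately from a."""

    def __init__(self, vertices: Iterable[str], arcs: Iterable[Tuple[str, str]]):
        self.vertices: Set[str] = set(vertices)
        self.children: Dict[str, Set[str]] = {v: set() for v in self.vertices}
        self.parents: Dict[str, Set[str]] = {v: set() for v in self.vertices}
        for head, tail in arcs:
            self.children[head].add(tail)
            self.parents[tail].add(head)

    def _reach(self, start: Iterable[str], step: Dict[str, Set[str]]) -> Set[str]:
        # Assumption: the starting units are included in the result.
        seen = set(start)
        stack = list(seen)
        while stack:
            for nxt in step.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def prc(self, s: Iterable[str]) -> Set[str]:
        """Predecessor union: predecessors of any member of s."""
        return self._reach(s, self.parents)

    def suc(self, s: Iterable[str]) -> Set[str]:
        """Successor union: successors of any member of s."""
        return self._reach(s, self.children)

    def prc_and(self, s: Iterable[str]) -> Set[str]:
        """Predecessor intersection (prc&): predecessors shared by every member of s."""
        sets = [self.prc({x}) for x in s]
        # Simplification: an empty input yields the empty set.
        return set.intersection(*sets) if sets else set()

    def suc_and(self, s: Iterable[str]) -> Set[str]:
        """Successor intersection (suc&): successors shared by every member of s."""
        sets = [self.suc({x}) for x in s]
        return set.intersection(*sets) if sets else set()

    def max(self, s: Iterable[str]) -> Set[str]:
        """Members of s with no other member of s among their successors."""
        s = set(s)
        return {x for x in s if not ((self.suc({x}) - {x}) & s)}

    def min(self, s: Iterable[str]) -> Set[str]:
        """Members of s with no other member of s among their predecessors."""
        s = set(s)
        return {x for x in s if not ((self.prc({x}) - {x}) & s)}

    def node_based_clade(self, s: Iterable[str]) -> Set[str]:
        """(suc * max * prc&)(s): composition applied right to left."""
        return self.suc(self.max(self.prc_and(s)))


P = Phylogeny(
    ["Aves*", "Palaeognathae*", "Struthio camelus", "Tetrao major", "Vultur gryphus"],
    [
        ("Aves*", "Vultur gryphus"),
        ("Aves*", "Palaeognathae*"),
        ("Palaeognathae*", "Struthio camelus"),
        ("Palaeognathae*", "Tetrao major"),
    ],
)

aves = P.node_based_clade({"Struthio camelus", "Tetrao major", "Vultur gryphus"})
palaeognaths = P.node_based_clade({"Struthio camelus", "Tetrao major"})
```

Under these assumptions, aves contains all five units of the graph, while palaeognaths contains only "Palaeognathae*", "Struthio camelus", and "Tetrao major".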
Examples
Formula  Prose or Diagram  Notes 

P := [ { "Aves*", "Palaeognathae*", "Struthio camelus", "Tetrao major", "Vultur gryphus" }, { ["Aves*", "Vultur gryphus"], ["Aves*", "Palaeognathae*"], ["Palaeognathae*", "Struthio camelus"], ["Palaeognathae*", "Tetrao major"] } ]. 

This defines a simple phylogenetic context (a directed, acyclic graph where vertices are taxonomic units and arcs represent immediate descent). 
"Tinamus major" := "Tetrao major". 
Tinamus major is Tetrao major.  These are objective synonyms under the zoological code. 
"Aves" := clade("Struthio camelus" | "Tetrao major" | "Vultur gryphus"). 
Aves is all successors of the maximal common predecessors of Struthio camelus, Tetrao major, and Vultur gryphus.  
"Saurischia" := clade("Megalosaurus bucklandii" < "Iguanodon bernissartensis"). 
Saurischia is all successors of the (common) predecessors of Megalosaurus bucklandii exclusive of all predecessors of Iguanodon bernissartensis.  
"Avialae" := clade("wings used for powered flight" @ "Vultur gryphus"). 
Avialae is all successors of the predecessors of Vultur gryphus to share wings used for powered flight synapomorphically with Vultur gryphus.  
"Aves" = crown("Avialae", "extant") = crown("Saurischia", "extant") 
Aves is equivalent to the avialan crown clade and the saurischian crown clade.  
"PanAves" := total("Aves", "extant"). 
PanAves is the avian total clade.  
"Avemetatarsalia" := clade("Aves" < "Crocodylus niloticus"). 
Avemetatarsalia is all successors of the (common) predecessors of Aves exclusive of all predecessors of Crocodylus niloticus.  
"PanAves" = "Avemetatarsalia" 
PanAves is equivalent to Avemetatarsalia.  
"Ichthyornithes" := clade("YPMVP 1450" < "Struthio camelus" | "Tetrao major" | "Vultur gryphus"). 
Ichthyornithes is all successors of the (common) predecessors of the organism represented by YPMVP 1450 exclusive of all predecessors of Struthio camelus, Tetrao major, and/or Vultur gryphus.  YPMVP 1450 is the Ichthyornis dispar holotype specimen. 
"Ichthyornis" := clade("Ichthyornithes" & ("apomorphy 2" | "apomorphy 5" | "apomorphy 6" | "apomorphy 7" | "apomorphy 8" @ "YPMVP 1450")). 
Ichthyornis is all successors of all ichthyornithean predecessors of the organism represented by YPMVP 1450 to share apomorphies 2, 5, 6, 7, and 8 synapomorphically with the organism represented by YPMVP 1450.  The numbers refer to apomorphies described by Clarke (2004). 
"PanBiota" := (clade * prc&)("Homo sapiens"). 
PanBiota is all successors of all (common) predecessors of Homo sapiens.  
"Biota" := crown("PanBiota", "extant"). 
Biota is all successors of the maximal common predecessors of all extant members of PanBiota.  
"S" := "Otaria byronia" | "Odobenus rosmarus" | "Phoca vitulina". "Pinnipedia" := (max * prc&)("S") <= ("flippers" @ "S") ? clade("S") : {}. 
If the maximal common predecessors of the specifiers (Otaria byronia, Odobenus rosmarus, and Phoca vitulina) possessed flippers synapomorphic with those of the specifiers, then Pinnipedia is all successors of the maximal common predecessors of the specifiers. Otherwise, Pinnipedia is empty. 
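The prose readings of the branch-based and crown-clade examples above can be checked against the small graph P from the first example. The following Python sketch is an illustration only, not a reference implementation. All function names are hypothetical. It assumes that every unit counts as its own predecessor and successor, that exclusive predecessors are the common predecessors of the internal set minus all predecessors of the external set (following the Saurischia and Ichthyornithes prose), and that a crown clade is the node-based clade of the extant members of the bounding set (following the Biota prose). The internal, external, bounding, and extant sets used at the end are invented for the demonstration.

```python
from typing import Dict, Iterable, Set

# The graph literal from the first example, as [vertices, arcs].
VERTICES = ["Aves*", "Palaeognathae*", "Struthio camelus", "Tetrao major", "Vultur gryphus"]
ARCS = [
    ["Aves*", "Vultur gryphus"],
    ["Aves*", "Palaeognathae*"],
    ["Palaeognathae*", "Struthio camelus"],
    ["Palaeognathae*", "Tetrao major"],
]

PARENTS: Dict[str, Set[str]] = {v: set() for v in VERTICES}
CHILDREN: Dict[str, Set[str]] = {v: set() for v in VERTICES}
for head, tail in ARCS:
    CHILDREN[head].add(tail)
    PARENTS[tail].add(head)


def reach(start: Iterable[str], step: Dict[str, Set[str]]) -> Set[str]:
    # Assumption: the starting units are included in the result.
    seen = set(start)
    stack = list(seen)
    while stack:
        for nxt in step.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def prc(s: Iterable[str]) -> Set[str]:
    return reach(s, PARENTS)


def suc(s: Iterable[str]) -> Set[str]:
    return reach(s, CHILDREN)


def prc_and(s: Iterable[str]) -> Set[str]:
    sets = [prc({x}) for x in s]
    return set.intersection(*sets) if sets else set()


def maximal(s: Iterable[str]) -> Set[str]:
    s = set(s)
    return {x for x in s if not ((suc({x}) - {x}) & s)}


def exclusive_predecessors(internal: Set[str], external: Set[str]) -> Set[str]:
    # Assumed reading: common predecessors of the internal set,
    # excluding every predecessor of any external member.
    return prc_and(internal) - prc(external)


def branch_based_clade(internal: Set[str], external: Set[str]) -> Set[str]:
    return suc(exclusive_predecessors(internal, external))


def crown(bounding: Set[str], extant: Set[str]) -> Set[str]:
    # Assumed reading: node-based clade of the extant members of the bounding set.
    return suc(maximal(prc_and(bounding & extant)))


# Hypothetical inputs, chosen only to exercise the functions.
stem = branch_based_clade({"Struthio camelus"}, {"Vultur gryphus"})
crown_group = crown(
    {"Palaeognathae*", "Struthio camelus", "Tetrao major"},
    {"Struthio camelus", "Vultur gryphus"},
)
```

With these inputs, stem contains "Palaeognathae*", "Struthio camelus", and "Tetrao major", and crown_group contains only "Struthio camelus", because that is the sole extant member of the bounding set.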