diving bird phylogeny

How to create an ultrametric dichotomous phylogeny in R

If you’re doing comparisons between species, you might be interested in plotting those species on a phylogenetic tree. Once you’ve created the tree, you might want to do some statistics. Generally, you’ll need a phylogeny that is both ultrametric and dichtomous to do these stats. Here’s one way to create this kind of tree in R.

Start by downloading a phylogenetic subset of your species at www.vertlife.org or www.birdtree.org. (Sorry invertebrate and plant folks–you’ll have to look elsewhere.) Make sure you have all of your species names correct–if they’re mismatched, you won’t get what you’re looking for and it doesn’t tell you. I usually go with a 1000-tree subset. You should save it as a .nex (nexus) file.

In R, you’ll need the following packages:

install.packages("ape")
install.packages("phytools")
install.packages("picante")

Once you’ve got them installed, you can call them at the beginning of each work session:

library(ape)
library(phytools)
library(picante)

Now read in your tree (here we’re using the read.nexus function from the APE package):

treefile <-read.nexus('output.nex')

Next, you need to produce a single tree from the 1000 trees you downloaded from vertlife.org. To do this, we’ll use a package from phytools to build a majority-rules consensus. Learn more here: http://blog.phytools.org/2016/03/method-to-compute-consensus-edge.html

#Create a consensus tree. Note: p=0.5 specifies that the tree must be "majority rules consensus (MRC)".
      consensus<-consensus.edges(treefile, consensus.tree=consensus(treefile,p=0.5))  
      plotTree(consensus, fsize=0.6)

Congrats! You’ve made a phylogeny. There were some bugs for me, so I had to make a few modifications…. Your mileage may vary. This next part is only useful if your edge lengths are off.

print(consensus$edge.length) 
#how does it look?  too many zeros at the end?  There appears to be a bug.  If so, you will need to run the code below....

#This makes a single-number vector that represents the length of the consensus$edge.length
n <- length(consensus$edge.length)  
  
#need to reduce the number of elements by 1... We also add a very small amount to make sure no branch length is zero.    
consensus$edge.length <- consensus$edge.length[1:n-1] + .0000001  
print(consensus$edge.length) 

#how does it look now?  It should not have a zero for the last row.  Plot to confirm it works.
plotTree(consensus, fsize=0.6)

Great, now you need to convert your consensus tree into an ultrametric, dichotomous tree. PhylANOVA and PIC assume your tree will be both of these. What is “ultrametric?” Good question. It’s a tree where all of the path lengths from the root to the tips are equal. Dichotomy is another assumption of the statistical tests you may be interested in performing later.

First, we’ll make the tree ultrametric using the chronos function in APE.

 consensus_ultra=chronos(consensus, lambda=0)  
# Lambda is the rate-smoothing parameter.  Generally, we want lambda to be small, so there is very little smoothing  See here (citation at bottom): http://www.justinbagley.org/1226/update-new-functions-for-generating-starting-trees-for-beast-or-starbeast-in-r

#Check again; now it's ultrametric.
is.ultrametric(consensus_ultra) 

Now let’s make it dichotomous using multi2di from APE.

 #Make the tree dichotomous (so each node splits into only two branches)   
tree.unmatched <- multi2di(consensus_ultra, random=TRUE) 

#Let's plot our new tree.  Huzzah!

plotTree(tree.unmatched,fsize=0.6) 

Here’s an example of what it may look like:

Here’s the actual code with a few more comments as well as the sample nexus file for the tree above:

8 comments

  1. Nick

    It’s too bad that chronos() doesn’t work for some trees (even in ape 5.4-1):

    “`
    Setting initial dates…
    Fitting in progress… get a first set of estimates
    Error in nlminb(start.para, f, g, control = opt.ctrl, lower = LOW, upper = UP): NA/NaN gradient evaluation
    Traceback:

    1. to_ultrametric(trees_f[[2]])
    2. tree %>% multi2di %>% compute.brlen(1) %>% chronos(lambda = 0) # at line 3-6 of file
    3. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
    4. eval(quote(`_fseq`(`_lhs`)), env, env)
    5. eval(quote(`_fseq`(`_lhs`)), env, env)
    6. `_fseq`(`_lhs`)
    7. freduce(value, `_function_list`)
    8. withVisible(function_list[[k]](value))
    9. function_list[[k]](value)
    10. chronos(., lambda = 0)
    11. nlminb(start.para, f, g, control = opt.ctrl, lower = LOW, upper = UP)
    “`

    Note: my tree is not missing any branch lengths. Length distribution:

    “`
    Min. 1st Qu. Median Mean 3rd Qu. Max.
    0.00000 0.01630 0.04543 0.06978 0.10417 0.50000
    “`

      1. Daniel Padfield

        Hi everyone

        So I have been having this error message on and off when trying to use chronos() or chronopl(). It has been driving me insane and I could not solve it.

        I was rooting the tree using FigTree and then exporting the tree to make ultrametric in R. Turns out what solved it for me (fingers crossed for the other instances) is saving in NEXUS format and with “Save as currently displayed” and “Include Annotations” ticked.

        Hope this helps as it has been driving me up the wall on and off for ages.

        1. bkvo

          Great idea. If you experience more issues like this, I recommend looking at tree data that works with chronos and then changing the bad tree to match. It’s often just an issue with formatting or extra zeros at the end of a column, etc.

  2. Gem

    Thank you so much, this was super clear and easy to follow. You saved me from sending a panicked email to my supervisor!

Leave a Reply

Your email address will not be published. Required fields are marked *