If you’re doing comparisons between species, you might be interested in plotting those species on a phylogenetic tree. Once you’ve created the tree, you might want to do some statistics. Generally, you’ll need a phylogeny that is both ultrametric and dichtomous to do these stats. Here’s one way to create this kind of tree in R.
Start by downloading a phylogenetic subset of your species at www.vertlife.org or www.birdtree.org. (Sorry invertebrate and plant folks–you’ll have to look elsewhere.) Make sure you have all of your species names correct–if they’re mismatched, you won’t get what you’re looking for and it doesn’t tell you. I usually go with a 1000-tree subset. You should save it as a .nex (nexus) file.
In R, you’ll need the following packages:
install.packages("ape") install.packages("phytools") install.packages("picante")
Once you’ve got them installed, you can call them at the beginning of each work session:
library(ape) library(phytools) library(picante)
Now read in your tree (here we’re using the read.nexus function from the APE package):
Next, you need to produce a single tree from the 1000 trees you downloaded from vertlife.org. To do this, we’ll use a package from phytools to build a majority-rules consensus. Learn more here: http://blog.phytools.org/2016/03/method-to-compute-consensus-edge.html
#Create a consensus tree. Note: p=0.5 specifies that the tree must be "majority rules consensus (MRC)". consensus<-consensus.edges(treefile, consensus.tree=consensus(treefile,p=0.5)) plotTree(consensus, fsize=0.6)
Congrats! You’ve made a phylogeny. There were some bugs for me, so I had to make a few modifications…. Your mileage may vary. This next part is only useful if your edge lengths are off.
print(consensus$edge.length) #how does it look? too many zeros at the end? There appears to be a bug. If so, you will need to run the code below.... #This makes a single-number vector that represents the length of the consensus$edge.length n <- length(consensus$edge.length) #need to reduce the number of elements by 1... We also add a very small amount to make sure no branch length is zero. consensus$edge.length <- consensus$edge.length[1:n-1] + .0000001 print(consensus$edge.length) #how does it look now? It should not have a zero for the last row. Plot to confirm it works. plotTree(consensus, fsize=0.6)
Great, now you need to convert your consensus tree into an ultrametric, dichotomous tree. PhylANOVA and PIC assume your tree will be both of these. What is “ultrametric?” Good question. It’s a tree where all of the path lengths from the root to the tips are equal. Dichotomy is another assumption of the statistical tests you may be interested in performing later.
First, we’ll make the tree ultrametric using the chronos function in APE.
consensus_ultra=chronos(consensus, lambda=0) # Lambda is the rate-smoothing parameter. Generally, we want lambda to be small, so there is very little smoothing See here (citation at bottom): http://www.justinbagley.org/1226/update-new-functions-for-generating-starting-trees-for-beast-or-starbeast-in-r #Check again; now it's ultrametric. is.ultrametric(consensus_ultra)
Now let’s make it dichotomous using multi2di from APE.
#Make the tree dichotomous (so each node splits into only two branches) tree.unmatched <- multi2di(consensus_ultra, random=TRUE) #Let's plot our new tree. Huzzah! plotTree(tree.unmatched,fsize=0.6)
Here’s an example of what it may look like:
Here’s the actual code with a few more comments as well as the sample nexus file for the tree above: