Wednesday, January 7, 2015

How to collapse unsupported branches of trees into multichotomies using Ape in R

To collapse the unsupported branches of a phylogenetic tree into polytomies sounds actually pretty basic, but I couldn't find a satisfactory way (or code) around, except for this. There the user brice.server makes a suggestion, but it doesn't show any code for it.

So in here I put my own attempt, which I think is rather clumsy, but oh well...

First, I got a bit familiar with a phylo object, by reading for example this. Pay special attention to their explanation of tree$edge. There goes the code:

library(ape) # Load the Ape library
setwd("Path/to/my/trees") # Add here your path to the tree file


I was working with the output of RAxML, which is a Newick tree. It can be read like this:
tree <- read.tree("My_RAxML_tree.tre")
Or if you're working with a nexus file, like for example from MrBayes, you can call it:
tree <- read.nexus("My_MrBayes_tree.con")[[1]] # Extract only the first tree

Remember that in MrBayes, to get your consensus tree, you have to type conformat=simple when using sumt. Otherwise, ape won't see the branch support values (although FigTree will). So do (for example):
MrBayes > sumt relburnin = yes burninfrac = 0.25 conformat=simple

Anyway, back to getting rid of the unsupported branches:
# ==== Prune branches that have low support ====
# Which nodes have low support?
Badnodes <- which(as.numeric(tree$node.label) < 70) + length(tree$tip.label)
# The number of those nodes are equal to the position in node.label plus the number of species (the first n nodes)
# The branch lengths INDEXES of each node are tree$edge[,1], and tree$edge[,2] are the nodes names
Notice that in here I considered unsupported a branch with bootstrap values that were less than 70. You could instead use 0.95 for posterior probabilities, for instance.
# The actual indexes of the nodes with low support in tree$edge
Badnodes_indexes <- c()
for(node in Badnodes){
  Badnodes_indexes <- c(Badnodes_indexes, which(tree$edge[,2] == node))
}
I'm sure there is a much smarter way of doing that last part than a for loop, but I was lazy to check (suggestions?)
tree$edge.length[Badnodes_indexes] <- 0 # Make the branch length of the bad nodes = 0
tree_multi <- di2multi(tree) #Use di2multi function to convert the branches with lenght = 0, to multichotomies
Finally, save your tree in a Newick file:
write.tree(tree_multi, file = "My_edited_tree.tre")

No comments:

Post a Comment